Ansible for Network Automation
Ansible earned its place in network automation for one reason: it's agentless. Your switches and routers can't run an agent, but they can all accept an SSH (or NETCONF/API) connection — which is exactly how Ansible talks to them. If you can type ssh admin@switch, you can automate it.
This article covers the network-specific side of Ansible: how connections differ from server automation, inventory design, the modules that matter for Junos and IOS, and the patterns that keep a config push from becoming an outage.
How Network Ansible Differs
On servers, Ansible copies a Python module to the host and runs it there. Network devices can't do that, so network modules run on the control node and talk to the device over:
- ansible.netcommon.network_cli: SSH to the CLI (IOS, EOS, NX-OS…)
- ansible.netcommon.netconf: NETCONF over SSH (the right choice for Junos)
- ansible.netcommon.httpapi: REST APIs (NX-OS, some platforms)
Vendor content ships as collections: junipernetworks.junos, cisco.ios, cisco.meraki, arista.eos, plus ansible.netcommon and ansible.utils for the glue (and cisco.meraki is API-based — no SSH at all).
ansible-galaxy collection install junipernetworks.junos cisco.ios ansible.netcommon
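For repeatable installs on laptops and CI runners alike, the same collections can be listed in a requirements file. A minimal sketch, assuming the file lives at collections/requirements.yml (the path is illustrative, and the entries are left unpinned here; in practice you would pin versions you have tested):

```yaml
# collections/requirements.yml (path is an assumption, not from the article)
# Install with: ansible-galaxy collection install -r collections/requirements.yml
collections:
  - name: junipernetworks.junos
  - name: cisco.ios
  - name: ansible.netcommon
  - name: ansible.utils
```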
Inventory Is the Foundation
Group by platform and role; put connection details at the group level and secrets in Vault:
# inventory/network.yml
all:
  children:
    junos_switches:
      hosts:
        access-sw-01: { ansible_host: 10.10.1.11 }
        access-sw-02: { ansible_host: 10.10.1.12 }
      vars:
        ansible_connection: ansible.netcommon.netconf
        ansible_network_os: junipernetworks.junos.junos
        ansible_user: automation
    ios_routers:
      hosts:
        wan-rtr-01: { ansible_host: 10.10.0.1 }
      vars:
        ansible_connection: ansible.netcommon.network_cli
        ansible_network_os: cisco.ios.ios
Better yet, don't hand-maintain inventory at all — dynamic inventory plugins can pull it from NetBox (or the Meraki/Mist APIs), so your source of truth stays your source of truth.
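As a rough sketch of the NetBox route, a dynamic inventory file for the netbox.netbox.nb_inventory plugin might look like the following. The file name, the URL, and the grouping choices are assumptions for illustration, and option names can differ between collection versions, so check the plugin documentation for the release you install.

```yaml
# inventory/netbox_inventory.yml (hypothetical file name and URL)
# Requires the netbox.netbox collection.
plugin: netbox.netbox.nb_inventory
api_endpoint: https://netbox.example.com
validate_certs: true
# The API token is deliberately not stored here; the plugin can read it
# from the NETBOX_TOKEN environment variable instead.
group_by:
  - platforms      # groups devices by their NetBox platform
  - device_roles   # groups devices by their NetBox device role
```

Running ansible-inventory -i inventory/netbox_inventory.yml --graph shows the generated groups. The connection variables from the static example still need to be attached to those groups, for instance through group_vars files named after them.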
Facts First: The Read-Only Win
The fastest way to demonstrate value is a playbook that changes nothing:
- name: Audit Junos fleet
  hosts: junos_switches
  gather_facts: no
  tasks:
    - name: Collect facts
      junipernetworks.junos.junos_facts:
        gather_subset: hardware
    - name: Report version drift
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} runs {{ ansible_net_version }}"
Ten minutes of work and you have a fleet-wide software/serial/uptime report. Compliance checks (is NTP right? is the SNMPv3 user present? does every uplink have storm control?) are the same pattern: gather, assert, report. No change window required.
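The gather, assert, report pattern can be sketched as follows. The target release and the NTP server address are hypothetical values, and the NTP check is a simple substring match on the text of the configuration stanza, so treat it as a starting point rather than a rigorous compliance test.

```yaml
- name: Compliance check (read-only)
  hosts: junos_switches
  gather_facts: no
  vars:
    expected_version: "21.4R3-S5.4"   # hypothetical target release
    ntp_server: "10.10.0.123"         # hypothetical NTP server
  tasks:
    - name: Gather device facts
      junipernetworks.junos.junos_facts:
        gather_subset: hardware

    - name: Gather the NTP stanza of the configuration
      junipernetworks.junos.junos_command:
        commands: show configuration system ntp
      register: ntp_cfg

    - name: Assert the device matches the standard
      ansible.builtin.assert:
        that:
          - ansible_net_version == expected_version
          - ntp_server in ntp_cfg.stdout[0]
        fail_msg: "{{ inventory_hostname }} is out of compliance"
        success_msg: "{{ inventory_hostname }} is compliant"
```

Hosts that fail the assertion show up as failed in the play recap, which doubles as the report.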
Config Changes: Templates + Idempotent Modules
The core pattern: variables + Jinja2 template + config module.
{# templates/junos-vlans.j2 #}
vlans {
{% for vlan in vlans %}
    {{ vlan.name }} {
        vlan-id {{ vlan.id }};
    }
{% endfor %}
}
- name: Deploy VLANs to access switches
  hosts: junos_switches
  gather_facts: no
  vars_files: [vars/vlans.yml]
  tasks:
    - name: Push VLAN config
      junipernetworks.junos.junos_config:
        src: templates/junos-vlans.j2
        update: merge
        comment: "ansible: vlan sync {{ lookup('pipe', 'git rev-parse --short HEAD') }}"
      register: result
    - name: Show what changed
      ansible.builtin.debug:
        var: result.diff
      when: result.changed
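The template loops over a list called vlans and reads a name and an id from each entry, so vars/vlans.yml needs to supply data in that shape. A minimal sketch, with made-up VLAN names and IDs:

```yaml
# vars/vlans.yml (the VLAN names and IDs below are hypothetical examples)
vlans:
  - name: USERS
    id: 10
  - name: VOICE
    id: 20
  - name: PRINTERS
    id: 30
```

Adding a VLAN then becomes a one-entry change to this file, reviewed in a PR like any other change.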
Why Junos + NETCONF is the happy path: candidate configuration, diff, and confirmed commit are native. junos_config supports confirm: 5; if the commit isn't confirmed within five minutes (say, because you just cut off your own management access), the switch rolls back by itself. The cisco.ios config modules have no equivalent commit-confirm option, which is why IOS playbooks should lean harder on --check --diff before the real run:
ansible-playbook vlans.yml --check --diff --limit access-sw-01
Check mode, diff mode, and --limit to one canary device — that trio is the network engineer's staging environment.
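On the Junos side, the confirmed-commit safety net mentioned above could be wired in roughly like this. It is a sketch that reuses the template and vars file from the VLAN example; the reachability test (a single show command) is an assumption, and a real playbook would probably check more than that before confirming.

```yaml
- name: Push VLANs with an automatic rollback window
  hosts: junos_switches
  gather_facts: no
  vars_files: [vars/vlans.yml]
  tasks:
    - name: Commit with a five-minute rollback timer
      junipernetworks.junos.junos_config:
        src: templates/junos-vlans.j2
        update: merge
        confirm: 5                    # minutes before the device rolls back
        comment: "ansible: vlan sync (commit confirmed)"
      register: push

    - name: Prove the device still answers over NETCONF
      junipernetworks.junos.junos_command:
        commands: show system uptime
      when: push.changed

    - name: Confirm the commit so the rollback timer is cancelled
      junipernetworks.junos.junos_config:
        confirm_commit: true
      when: push.changed
```

If the middle task fails because the change broke management access, the play stops for that host, the confirmation is never sent, and the device reverts once the timer expires.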
Resource Modules
Modern collections include resource modules (junos_vlans, ios_interfaces, junos_bgp_global, …) that model config sections as structured data with a state: argument (merged, replaced, overridden, deleted). The gem is state: gathered — point it at a brownfield device and it returns the existing config as structured data you can save as your vars. That's your migration path from "the config is whatever's on the box" to "the config is in git."
- name: Read existing VLANs into structured data
  junipernetworks.junos.junos_vlans:
    state: gathered
  register: current
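To turn that gathered data into something you can commit, one option is to write it to host_vars and feed it back through the same module later. In the sketch below, the variable name junos_vlans_config and the host_vars location are assumptions, and the host_vars directory is expected to exist already next to the playbook.

```yaml
- name: Capture VLANs from a brownfield switch
  hosts: access-sw-01
  gather_facts: no
  tasks:
    - name: Read existing VLANs into structured data
      junipernetworks.junos.junos_vlans:
        state: gathered
      register: current

    - name: Save them as host_vars on the control node
      ansible.builtin.copy:
        # junos_vlans_config is a hypothetical variable name
        content: "{{ {'junos_vlans_config': current.gathered} | to_nice_yaml }}"
        dest: "{{ playbook_dir }}/host_vars/{{ inventory_hostname }}.yml"
      delegate_to: localhost

- name: Enforce the saved VLANs (run later, once the vars are in git)
  hosts: access-sw-01
  gather_facts: no
  tasks:
    - name: Make the listed VLANs match the data
      junipernetworks.junos.junos_vlans:
        config: "{{ junos_vlans_config }}"
        state: replaced
```

Run the first play once, review and commit the generated file, and only then start using the second play.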
Orchestration: The Thing Terraform Can't Do
Declarative tools describe end state; Ansible shines at sequences. A software upgrade is inherently procedural:
- name: Upgrade access switches, one at a time
  hosts: junos_switches
  serial: 1   # <- one device at a time
  tasks:
    - name: Pre-check (uplinks and neighbors healthy)
      ...
    - name: Install package
      junipernetworks.junos.junos_package:
        src: "{{ image }}"
        reboot: true
    - name: Wait for device to return
      ansible.builtin.wait_for_connection:
        timeout: 900
    - name: Post-check (compare neighbor table to pre-check)
      ...
    - name: Abort the whole run if post-checks fail
      ansible.builtin.meta: end_play
      when: postcheck.failed
serial: 1, pre/post validation, and stop-on-failure turn a weekend maintenance marathon into a supervised background job. This is also where Ansible complements Terraform rather than competing with it: Terraform owns what exists (VPCs, Meraki networks, Mist sites — see my Terraform article), Ansible owns device state and change choreography.
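The pre- and post-checks are left open above because they depend on your network. As one illustration (an assumption, not the checks from the playbook above), you could snapshot the LLDP neighbor table before the change and require the same table afterwards. Comparing raw command output is crude, and the retry counts here are arbitrary, but it shows the shape:

```yaml
- name: Validate neighbors around a change, one device at a time
  hosts: junos_switches
  serial: 1
  gather_facts: no
  any_errors_fatal: true   # a failed check stops the run for every host
  tasks:
    - name: Pre-check, record the LLDP neighbor table
      junipernetworks.junos.junos_command:
        commands: show lldp neighbors
      register: lldp_before

    # ... the change itself (install, reboot, wait) would go here ...

    - name: Post-check, wait for the same neighbor table to come back
      junipernetworks.junos.junos_command:
        commands: show lldp neighbors
      register: lldp_after
      until: lldp_after is succeeded and lldp_after.stdout[0] == lldp_before.stdout[0]
      retries: 10   # arbitrary: up to about five minutes of re-learning
      delay: 30
```

If the table never matches, the task fails after the last retry and any_errors_fatal keeps the play from moving on to the next switch.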
Production Hygiene
- Vault your secrets. ansible-vault for device credentials; better, pull from a secrets manager at runtime. No plaintext passwords in git, ever.
- Run from CI, not laptops. Playbook runs triggered by PR merge, with logs archived. AWX/AAP (or plain GitLab CI) gives you RBAC and an audit trail.
- Idempotence is a test. Run every playbook twice in dev; the second run must report zero changes. If it doesn't, the playbook is lying about something.
- Config backups as a scheduled playbook. junos_config with backup: yes, run nightly into a git repo, gives you diffable device history for free, and is often the first thing that pays off.
- Start read-only, earn trust, then write. Audits → backups → canaried standard changes → orchestrated upgrades. Each stage builds the credibility (and the safety tooling) for the next.
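The backup item is the easiest of these to put into practice. A minimal sketch, assuming a backups directory next to the playbook that is already an initialized git repository (the directory name and commit message are illustrative); the scheduling itself is left to cron or your CI system:

```yaml
- name: Nightly Junos config backup
  hosts: junos_switches
  gather_facts: no
  tasks:
    - name: Save the running config to one stable file per device
      junipernetworks.junos.junos_config:
        backup: yes
        backup_options:
          filename: "{{ inventory_hostname }}.conf"
          dir_path: "{{ playbook_dir }}/backups"

    - name: Commit the backups only if something changed
      ansible.builtin.shell: |
        git add -A
        git diff --cached --quiet || git commit -m "nightly backup {{ lookup('pipe', 'date +%F') }}"
      args:
        chdir: "{{ playbook_dir }}/backups"
      delegate_to: localhost
      run_once: true
      register: git_result
      # git prints a summary only when a commit was actually created
      changed_when: git_result.stdout | length > 0
```

Using a fixed filename per device, rather than a timestamped one, is what makes git log -p on a single file read as that device's change history.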
The endgame isn't "Ansible manages everything." It's that the CLI session becomes what the console cable already is: a break-glass tool you're mildly embarrassed to need.