Building Convergence – A Journey from Network Observability to AI-Driven Automation
Part 11: Teaching an AI Agent to Set Up Golden Config: Design References, Jinja Templates, and the Conversation That Builds Your Network Standard
Nautobot is a source of truth. It knows what devices you have, what interfaces they have, what VLANs are assigned, what IP addresses are allocated. But it doesn’t know what your configs should look like. It knows the facts — it doesn’t know the standard.
That’s what the Golden Config plugin solves. It adds three capabilities to Nautobot: backup (store the actual running config from each device in git), intended (render what the config should look like from Jinja templates fed by Nautobot’s SoT data), and compliance (compare actual vs intended per config section and report what’s different). It turns Nautobot from a system that records what your network is into a system that continuously verifies what your network should be.
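To make the compliance step concrete, here is a minimal sketch of a per-section comparison. It is not the plugin's implementation: the function name `compliance_diff`, the simple prefix match, and the sample configs (using documentation addresses) are all assumptions for illustration.

```python
def compliance_diff(intended, actual, match_prefix):
    """Compare one config section, selected by a line prefix (hypothetical helper)."""

    def section(cfg):
        # Keep only the top-level lines that belong to this feature.
        return [ln.rstrip() for ln in cfg.splitlines() if ln.startswith(match_prefix)]

    want, have = section(intended), section(actual)
    missing = [ln for ln in want if ln not in have]  # in intended, absent on device
    extra = [ln for ln in have if ln not in want]    # on device, not in intended
    return {"compliant": not missing and not extra, "missing": missing, "extra": extra}


intended = "hostname HomeSwitch01\nntp server 192.0.2.1 prefer\nntp server 192.0.2.2\n"
actual = "hostname HomeSwitch01\nntp server 192.0.2.1 prefer\nntp server 198.51.100.9\n"
result = compliance_diff(intended, actual, "ntp ")
print(result)
```

The useful property is that the result is scoped to one feature: a drifted NTP server shows up as one missing line and one extra line, and the hostname line never enters the comparison.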
The plugin has been installed on my instance since February. It’s been sitting there for two months with a default settings page, no templates, no compliance rules, and no git repository. Every time I looked at it I thought “I should set that up” and then didn’t, because setting it up is a full day of work that requires making a dozen design decisions I didn’t feel like making on a Tuesday night.
Here’s what golden config needs before it does anything useful:
- A git repository with Jinja templates that define what your config should look like
- A SoT aggregation GraphQL query that feeds device data into those templates
- Compliance features and rules that tell it which sections of the config to check
- Config contexts in Nautobot that store the operational parameters your templates reference (NTP servers, SNMP communities, logging hosts — things that aren’t modeled as first-class Nautobot objects)
- Path templates that tell it where to find templates and where to store backups
- All of that wired together in the Golden Config Setting
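As a rough illustration of the second item, this is what a SoT aggregation query and its request payload could look like. The field names are my assumptions, chosen to match what the templates later in this post consume (hostname, role name, config_context, and each interface's name, mode, and lag). Check them in your own instance's GraphQL explorer before relying on them. `build_graphql_payload` is a hypothetical helper.

```python
import json

# Assumed query shape; verify every field against your Nautobot GraphQL schema.
SOT_AGG_QUERY = """
query ($device_id: ID!) {
  device(id: $device_id) {
    hostname: name
    config_context
    role { name }
    interfaces {
      name
      mode
      lag { name }
    }
  }
}
"""


def build_graphql_payload(device_id):
    """Return the JSON body for a GraphQL POST (hypothetical helper)."""
    return json.dumps({"query": SOT_AGG_QUERY, "variables": {"device_id": device_id}})


payload = json.loads(build_graphql_payload("00000000-0000-0000-0000-000000000001"))
print(sorted(payload))
```

Whatever the query returns becomes the variable namespace for the Jinja templates, so the alias `hostname: name` is what would let a template write `{{ hostname }}`.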
Each of those steps involves design decisions. Should I use TACACS+ or local auth? Which NTP servers? SNMPv2c or v3? Should unused ports be shut down? What STP priority should the root bridge have? Do I want BPDU guard on access ports?
These aren’t hard questions individually. But there are dozens of them, they interact with each other, and the answers should be informed by Cisco design guides and relevant RFCs — not just “whatever’s in the running config right now.”
So I’m building a system where the AI agent makes the recommendations, I make the decisions, and the agent does the work.
The Problem: Golden Config Is an Architecture Project
Most people think of golden config as “templatize your running config.” It’s not. It’s “define what your network standard should be, express it as code, and continuously verify that reality matches the standard.”
The running config is the starting point, not the answer. My HomeSwitch01 has no aaa new-model — that’s not a standard, that’s a gap. It has snmp-server community public RO — that’s not a standard, that’s a security finding. It has exec-timeout 0 0 on the VTY lines — convenient for a lab, terrible for anything else.
The golden config template needs to represent what the config should be, not what it is. And “should” requires referencing actual design guidance:
- NTP: RFC 5905 defines NTPv4, including its authentication mechanism. Cisco’s campus design guide says at least two servers, sourced from the management interface.
- SNMP: RFC 3414 defines SNMPv3 USM. Cisco says never use ‘public’ or ‘private’ as community strings, restrict with ACLs.
- SSH: RFC 4253 defines the transport layer. Cisco says version 2 only, 2048-bit RSA minimum.
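- Spanning Tree: IEEE 802.1w defines RSTP. Cisco says BPDU guard on all access ports, and root guard on ports facing switches that should never become root (not on the uplinks toward the root bridge, which root guard would block).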
- VTY Lines: Cisco hardening guide says exec-timeout 15 minutes, SSH only, ACL restricting source addresses.
An AI agent can look up all of this, compare it against the running config, and present recommendations. But it needs the knowledge base to reference.
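The three gaps called out above (no `aaa new-model`, a `public` community, and `exec-timeout 0 0`) can be expressed as simple checks. This is a minimal sketch under my own assumptions: the `CHECKS` table and the `audit` function are hypothetical, and a real audit would draw its patterns from the knowledge base rather than a hard-coded list.

```python
import re

# (feature, pattern, must_exist, message) - illustrative checks only.
CHECKS = [
    ("aaa", r"^aaa new-model$", True, "AAA is not enabled"),
    ("snmp", r"^snmp-server community (public|private)\b", False, "default SNMP community in use"),
    ("line_vty", r"^ *exec-timeout 0 0$", False, "idle sessions never time out"),
]


def audit(running_config):
    """Return (feature, message) for every check the config fails."""
    findings = []
    for feature, pattern, must_exist, message in CHECKS:
        found = re.search(pattern, running_config, flags=re.MULTILINE) is not None
        if found != must_exist:
            findings.append((feature, message))
    return findings


sample = """\
hostname HomeSwitch01
snmp-server community public RO
line vty 0 4
 exec-timeout 0 0
 transport input ssh
"""
findings = audit(sample)
for feature, message in findings:
    print(feature, "->", message)
```

Each check records whether the pattern must be present or must be absent, which is the distinction between a gap (something missing) and a finding (something that should not be there).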
The Cisco Design Reference
There’s no Cisco design guide MCP server. Cisco’s Design Zone, SAFE architecture docs, and Validated Designs are PDFs and web pages — not APIs. The DevNet Content Search MCP that’s in NetClaw covers API documentation for Meraki and Catalyst Center, not switch hardening guides.
So I built a lightweight knowledge base as a Python module inside the Nautobot MCP server. It covers 13 configuration features, each with:
- Best practices — what Cisco recommends, distilled from their campus design guides and hardening documentation
- Config example — an IOS-XE config snippet showing the recommendation with template variables
- Rationale — why this matters, in plain language
- RFC reference — the relevant standard, if one exists
- Match config — the regex pattern for golden config compliance rules
- Severity — how critical this is (critical, high, medium)
```python
DESIGN_REFERENCE = {
    "ntp": {
        "title": "Network Time Protocol",
        "best_practices": [
            "Configure at least 2 NTP servers for redundancy",
            "Use NTP authentication (MD5 or SHA) to prevent time poisoning",
            "Restrict NTP access with access-group",
            "Source NTP from management interface/VRF",
        ],
        "config_example": """\
ntp authentication-key 1 md5 {ntp_auth_key}
ntp trusted-key 1
ntp authenticate
ntp source {ntp_source_interface}
ntp server vrf {mgmt_vrf} {ntp_server_1} key 1 prefer
ntp server vrf {mgmt_vrf} {ntp_server_2} key 1""",
        "rationale": "Accurate time is critical for log correlation, certificate validation, "
        "and troubleshooting. NTP authentication prevents man-in-the-middle attacks.",
        "rfc": "RFC 5905 (NTPv4)",
        "match_config": "^ntp ",
        "severity": "high",
    },
    # ... 12 more features
}
```
The agent queries this with cisco_design_reference(feature="ntp") and gets back everything it needs to make a recommendation. Combined with the RFC MCP for deep-diving into the actual standards, it has enough context to say: “Your running config has no NTP configured. RFC 5905 recommends authenticated time synchronization. Cisco’s campus design guide recommends at least two servers sourced from the management interface. Here’s what I’d add — do you want NTP authentication enabled, or is unauthenticated acceptable for your lab?”
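The body of `cisco_design_reference` isn't shown here, so the following is only a sketch of how such a lookup could behave. The trimmed-down `DESIGN_REFERENCE`, the case-insensitive key handling, and the shape of the error response are all my assumptions.

```python
# A cut-down copy of the knowledge base, just enough to exercise the lookup.
DESIGN_REFERENCE = {
    "ntp": {
        "title": "Network Time Protocol",
        "best_practices": ["Configure at least 2 NTP servers for redundancy"],
        "rfc": "RFC 5905 (NTPv4)",
        "match_config": "^ntp ",
        "severity": "high",
    },
}


def cisco_design_reference(feature):
    """Return the reference entry for a feature, or a hint listing valid names."""
    key = feature.strip().lower()
    entry = DESIGN_REFERENCE.get(key)
    if entry is None:
        # Assumed error shape: tell the agent what it can ask for instead.
        return {"error": "unknown feature %r" % feature, "available": sorted(DESIGN_REFERENCE)}
    return {"feature": key, **entry}


print(cisco_design_reference("NTP")["rfc"])
print(cisco_design_reference("bgp"))
```

Returning the list of available features on a miss gives the agent something to recover with instead of a dead end.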
That’s the conversation. The agent recommends, the user decides, the agent builds.
The Template Architecture
Golden config templates aren’t a single flat file. They’re a hierarchy — a main template that includes section templates, and section templates that include sub-templates for specific interface types. This is the same pattern used in Cisco’s own Nautobot golden config workshop examples.
```text
cisco_xe/
├── cisco_xe.j2                  # Entry point: dispatches by device role
├── platform_templates/
│   └── home_switch.j2           # Full config skeleton with {% include %} calls
├── services/
│   └── vrf.j2                   # Management VRF definition
├── management/
│   ├── aaa.j2                   # AAA (local or TACACS+)
│   ├── ntp.j2                   # NTP servers and authentication
│   ├── logging.j2               # Syslog host and trap level
│   ├── snmp.j2                  # Communities, traps, host
│   ├── ssh.j2                   # SSH version, source interface
│   ├── http.j2                  # HTTP/HTTPS server
│   ├── users.j2                 # Local accounts
│   ├── routing.j2               # Static routes
│   └── line_vty.j2              # VTY and console config
├── switching/
│   ├── vlans.j2                 # VLAN database from Nautobot SoT
│   ├── spanning_tree.j2         # STP mode, priority
│   └── vtp.j2                   # VTP mode
└── interfaces/
    ├── interfaces.j2            # Dispatcher: routes to sub-templates by type
    ├── _l2_access.j2            # Access ports (with BPDU guard)
    ├── _l2_trunk.j2             # Trunk ports (pruned VLANs)
    ├── _lag_member.j2           # Port-channel members
    ├── _port_channel.j2         # Port-channel logical interfaces
    ├── _svi.j2                  # VLAN interfaces
    └── _unused.j2               # Shutdown ports
```
The entry point dispatches by device role:
```jinja
{% if role["name"] == "home_switch" %}
{% include '/cisco_xe/platform_templates/home_switch.j2' %}
{% endif %}
```
The platform template is the full config skeleton. It {% include %}s each section:
```jinja
hostname {{ hostname }}
!
{% include '/cisco_xe/services/vrf.j2' %}
!
{% include '/cisco_xe/management/logging.j2' %}
!
{% include '/cisco_xe/management/aaa.j2' %}
!
{% include '/cisco_xe/management/ntp.j2' %}
```
Each section template reads from config_context — the operational parameters stored in Nautobot:
```jinja
{% if config_context.ntp is defined %}
{% set ntp = config_context.ntp %}
clock timezone {{ ntp.timezone | default("UTC") }} {{ ntp.utc_offset | default("0 0") }}
!
{% if ntp.authenticate | default(false) %}
ntp authentication-key 1 md5 {{ ntp.auth_key }}
ntp trusted-key 1
ntp authenticate
{% endif %}
{% for server in ntp.servers | default([]) %}
ntp server{% if ntp.vrf is defined %} vrf {{ ntp.vrf }}{% endif %} {{ server.address }}{% if server.prefer | default(false) %} prefer{% endif %}
{% endfor %}
{% endif %}
```
The interface dispatcher routes each interface to the right sub-template based on its mode and type:
```jinja
{% for interface in interfaces | sort(attribute='name') %}
{% if 'Vlan' in interface.name %}
{% include '/cisco_xe/interfaces/_svi.j2' %}
{% elif interface.lag is defined and interface.lag %}
{% include '/cisco_xe/interfaces/_lag_member.j2' %}
{% elif interface.mode == 'TAGGED' %}
{% include '/cisco_xe/interfaces/_l2_trunk.j2' %}
{% elif interface.mode == 'ACCESS' %}
{% include '/cisco_xe/interfaces/_l2_access.j2' %}
{% endif %}
{% endfor %}
```
This structure means you can modify the NTP standard without touching interfaces. You can add a new interface type without touching logging. Each section is independently testable and independently reviewable during the bootstrap conversation.
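One way to test the dispatcher's branch order without rendering anything is to mirror it in plain Python. This is a sketch, not part of the template set: `pick_interface_template` is a hypothetical helper that follows the same if/elif order as the dispatcher excerpt, and the sample interfaces are made up.

```python
def pick_interface_template(interface):
    """Mirror the dispatcher's branch order; the first match wins."""
    if "Vlan" in interface.get("name", ""):
        return "_svi.j2"
    if interface.get("lag"):
        return "_lag_member.j2"
    if interface.get("mode") == "TAGGED":
        return "_l2_trunk.j2"
    if interface.get("mode") == "ACCESS":
        return "_l2_access.j2"
    return None  # the excerpt renders nothing for any other interface


interfaces = [
    {"name": "Vlan3"},
    {"name": "GigabitEthernet1/0/1", "mode": "ACCESS"},
    {"name": "GigabitEthernet1/0/2", "mode": "TAGGED"},
    {"name": "GigabitEthernet1/0/3", "mode": "TAGGED", "lag": {"name": "Port-channel1"}},
]
chosen = {i["name"]: pick_interface_template(i) for i in interfaces}
print(chosen)
```

The order matters: a LAG member that is also in tagged mode lands in `_lag_member.j2`, because the `lag` check comes before the mode checks.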
Config Context: Where the Parameters Live
Nautobot models devices, interfaces, VLANs, and IP addresses as first-class objects. But it doesn’t model NTP servers, SNMP communities, or logging hosts. That’s what config context is for — a JSON blob attached to a device (or role, or location) that stores operational parameters.
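Because a context can attach at several scopes, a device's effective context is a merge of everything that applies to it. As I understand it, broader contexts are overridden by more specific ones. The sketch below is a generic recursive merge that illustrates the idea, not Nautobot's own code; `deep_merge` and both sample contexts are hypothetical.

```python
import copy


def deep_merge(base, override):
    """Return a new dict where override wins, merging nested dicts key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


role_context = {"ntp": {"vrf": "MGMT", "timezone": "UTC"}, "vtp": {"mode": "transparent"}}
device_context = {"ntp": {"timezone": "MST"}}
effective = deep_merge(role_context, device_context)
print(effective)
```

The device-level value replaces only the `timezone` key. The role-level `vrf` and `vtp` settings survive, which is what makes it practical to define the standard once and override sparingly.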
The config context for our home switches:
```json
{
  "ntp": {
    "timezone": "MST",
    "utc_offset": "-7 0",
    "source_interface": "Vlan3",
    "vrf": "MGMT",
    "servers": [
      {"address": "time.google.com", "prefer": true},
      {"address": "time.cloudflare.com", "prefer": false}
    ]
  },
  "logging": {
    "trap_level": "debugging",
    "source_interface": "Vlan3",
    "source_vrf": "MGMT",
    "host": "192.168.3.254"
  },
  "snmp": {
    "communities": [{"name": "public", "permission": "RO"}],
    "location": "Byrns Lab",
    "host": "192.168.3.254",
    "traps": ["snmp authentication linkdown linkup coldstart warmstart", "config", "syslog"]
  },
  "spanning_tree": {"mode": "rapid-pvst", "logging": true},
  "vtp": {"mode": "transparent"},
  "ssh": {"version": 2, "source_interface": "Vlan3"}
}
```
This is scoped to the home_switch role in Nautobot, so both HomeSwitch01 and HomeSwitch02 inherit it automatically. When the golden config plugin renders the NTP template, it reads config_context.ntp.servers and generates the NTP server lines. Change the NTP server in the config context, re-render, and the intended config updates for every switch in that role.
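To sanity-check what the NTP template should emit for this context, it helps to mirror the template's branches in plain Python. This is a sketch for reasoning about expected output, not how the plugin renders. `render_ntp` is a hypothetical function that follows the same conditions as the ntp.j2 excerpt.

```python
def render_ntp(config_context):
    """Produce the lines the ntp.j2 excerpt would emit for a given context."""
    ntp = config_context.get("ntp")
    if ntp is None:
        return []
    lines = [
        "clock timezone %s %s" % (ntp.get("timezone", "UTC"), ntp.get("utc_offset", "0 0")),
        "!",
    ]
    if ntp.get("authenticate", False):
        lines += [
            "ntp authentication-key 1 md5 %s" % ntp["auth_key"],
            "ntp trusted-key 1",
            "ntp authenticate",
        ]
    vrf = " vrf %s" % ntp["vrf"] if "vrf" in ntp else ""
    for server in ntp.get("servers", []):
        prefer = " prefer" if server.get("prefer", False) else ""
        lines.append("ntp server%s %s%s" % (vrf, server["address"], prefer))
    return lines


context = {
    "ntp": {
        "timezone": "MST",
        "utc_offset": "-7 0",
        "vrf": "MGMT",
        "servers": [
            {"address": "time.google.com", "prefer": True},
            {"address": "time.cloudflare.com", "prefer": False},
        ],
    }
}
ntp_lines = render_ntp(context)
print("\n".join(ntp_lines))
```

With the context above, the server lines come out as `ntp server vrf MGMT time.google.com prefer` and `ntp server vrf MGMT time.cloudflare.com`. No authentication lines appear because the context has no `authenticate` key.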
That’s the power of golden config — the standard is defined once, in Nautobot, and applied everywhere through templates.
The Bootstrap Conversation
Here’s what the workflow actually looks like. The agent doesn’t just dump templates and walk away. It’s a conversation:
Agent: “I’ve collected the running config from HomeSwitch01. I see 12 logical config sections. Let me check each one against Cisco best practices.”
Agent: “Your NTP section is empty — no NTP servers configured. RFC 5905 recommends authenticated time synchronization. Cisco’s campus design guide recommends at least two servers sourced from the management interface. I’d suggest adding time.google.com and time.cloudflare.com via the MGMT VRF. Do you want NTP authentication enabled?”
User: “No auth for now, it’s a home lab. But yes, add those two servers.”
Agent: “Got it. I’ll add NTP without authentication. Here’s the template I’ll use:” (shows ntp.j2) “And here’s the config context data:” (shows the NTP section of the JSON)
Agent: “Your SNMP config uses public as the read-only community. RFC 3414 recommends SNMPv3 with authentication and encryption. At minimum, Cisco recommends non-default community strings with ACL restrictions. Want me to keep public for lab simplicity, or change it?”
User: “Keep it for now but add an ACL.”
Agent: “Done. I’ll add access-list 99 permit 192.168.3.0 0.0.0.255 and apply it to the community. Here’s the updated SNMP template…”
This continues through all 12 sections. The user makes the design decisions. The agent provides the context, builds the templates, and handles all the Nautobot API work.
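The SNMP decision in that exchange ("keep public, add an ACL") translates into two kinds of lines. Here is a minimal sketch of that translation, assuming the community data shape from the config context above. `render_snmp_with_acl` and its parameters are hypothetical, not the actual snmp.j2 logic.

```python
def render_snmp_with_acl(snmp, acl_number, permit_network, wildcard):
    """Emit a standard ACL plus community lines that reference it."""
    lines = ["access-list %d permit %s %s" % (acl_number, permit_network, wildcard)]
    for community in snmp.get("communities", []):
        # Appending the ACL number restricts which sources may use the community.
        lines.append(
            "snmp-server community %s %s %d"
            % (community["name"], community["permission"], acl_number)
        )
    return lines


snmp = {"communities": [{"name": "public", "permission": "RO"}]}
snmp_lines = render_snmp_with_acl(snmp, 99, "192.168.3.0", "0.0.0.255")
print("\n".join(snmp_lines))
```

The decision itself stays with the user. The code only records its consequence, so changing the answer later means changing one parameter rather than editing every switch.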
At the end, the agent:
- Creates a private GitHub repo with the template directory structure
- Commits all the templates
- Registers the repo in Nautobot
- Creates the compliance features and rules
- Wires the Golden Config Setting
- Runs the first compliance check
The user goes from “golden config is installed but unconfigured” to “golden config is running compliance checks against a defined standard” in one conversation.
Why This Matters
Golden config is the most powerful feature in Nautobot and the least used. Not because it’s bad — because the setup cost is too high. You need to understand Jinja templating, Nautobot’s GraphQL schema, git repository integration, config context schemas, compliance rule regex patterns, and the relationship between all of them. That’s a lot of domain knowledge for a network engineer who just wants to know if their switches match the standard.
The bootstrap workflow reduces that to a conversation. The agent has the domain knowledge — the design reference, the RFC lookups, the template patterns, the Nautobot API. The user has the context — their network, their requirements, their risk tolerance. The conversation produces a working golden config setup that reflects both.
The templates are starting points, not final products. The design reference is guidance, not gospel. The compliance rules are initial patterns that get refined as you run compliance checks and discover what matches and what doesn’t. The whole system is designed to iterate.
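Refining a compliance pattern is easier when you can see exactly which lines it captures. The helper below is a stand-in I'd use for that kind of experiment, not the plugin's matching code. `capture_section` is hypothetical, and it assumes that child lines are indented with a leading space, as in IOS-style configs.

```python
import re


def capture_section(config, match_config):
    """Return top-level lines matching the pattern, plus their indented children."""
    pattern = re.compile(match_config)
    captured, in_section = [], False
    for line in config.splitlines():
        if not line.startswith(" "):
            # A new top-level line decides whether we are inside a matching section.
            in_section = pattern.search(line) is not None
        if in_section:
            captured.append(line)
    return captured


sample = """\
ntp server 192.0.2.1
line vty 0 4
 exec-timeout 15 0
 transport input ssh
snmp-server location lab
"""
print(capture_section(sample, "^ntp "))
print(capture_section(sample, "^line vty"))
```

Running a candidate pattern against a real backup this way shows quickly whether it is too narrow (missing child lines) or too broad (pulling in a neighboring feature).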
And because everything is in git — the templates, the config context, the compliance rules — you have a full audit trail of how your network standard evolved. That’s not just good engineering. That’s how you answer “why is the config this way?” six months from now.
What’s Built
| Component | What | Where |
|---|---|---|
| Nautobot MCP v2 | 32 tools (reads, writes, plugins, reconciliation, design reference, template scaffolding) | mcp-servers/nautobot-mcp-v2/ |
| Cisco Design Reference | 13 features with best practices, config examples, rationale, RFCs | cisco_design_reference.py |
| Template Scaffolding | 22 Jinja templates in hierarchical structure | references/templates/cisco_xe/ |
| Bootstrap Skill | 8-phase conversational workflow | workspace/skills/golden-config-bootstrap/SKILL.md |
| Spec 027 | Full spec for nautobot-mcp-v2 (8 files, 1,249 lines) | specs/027-nautobot-mcp-v2/ |
| Spec 028 | Golden config bootstrap workflow spec | specs/028-golden-config-bootstrap/ |
What’s In Nautobot (from testing)
| Object | Count | Status |
|---|---|---|
| Config context schema | 1 | “Home Switch Config Context” |
| Config context | 1 | “Home Switch Standards” — 12 keys, scoped to home_switch role |
| Compliance features | 12 | services, aaa, ntp, logging, snmp, ssh, vtp, spanning_tree, vlans, interfaces, line_vty, management_plane |
| Compliance rules | 12 | One per feature, linked to cisco_ios platform with match_config patterns |
| SoT aggregation query | 1 | “GC SoT Aggregation - Home Switches” |
| Golden Config Setting | 1 | Path templates set, SoT query linked, repos pending |
What’s Next
- Run the bootstrap conversation through NetClaw with the skill loaded
- Create the private GitHub repo and commit templates via GitHub MCP
- Register the repo in Nautobot and link to Golden Config Setting
- Run the first backup → intended → compliance cycle
- Iterate on templates based on compliance results
- Write the pfSense MCP server (spec 028-pfsense-mcp)
The golden config bootstrap skill is at workspace/skills/golden-config-bootstrap/SKILL.md. The design reference is at mcp-servers/nautobot-mcp-v2/cisco_design_reference.py. Template scaffolding is at mcp-servers/nautobot-mcp-v2/references/templates/cisco_xe/. All 32 tools are tested against a live Nautobot 3.1.0 instance with Golden Config 3.0.5.
The Convergence platform lives at byrn-baker/netclaw-convergence on the main branch.
