There’s an open secret in the world of DevOps: no one trusts the CMDB. The Configuration Management Database (CMDB) is meant to be the “source of truth,” the central map of every server, service, and application in your enterprise. In theory, it’s the foundation for security audits, cost analysis, and incident response. In practice, it’s a work of fiction. The moment you populate a CMDB, it begins to rot. Engineers deploy a new microservice but forget to register it. An autoscaling group spins up 20 new nodes, but the database only records the original three.
We call this configuration drift, and for decades, our industry’s solution has been to throw more scripts at the problem. We write massive, brittle ETL (Extract-Transform-Load) pipelines that try to scrape the world and shove it into a relational database. It never works. The “world,” especially the modern cloud native world, moves too fast.
We learned we couldn’t solve this problem by writing better scripts. We had to change the fundamental architecture of how we sync data. We stopped trying to boil the ocean and fix the entire enterprise at once. Instead, we focused on one notoriously difficult environment: Kubernetes. If we could build an autonomous agent capable of reasoning about the complex, ephemeral state of a Kubernetes cluster, we could prove a pattern that works everywhere else. This article explores how we used the newly open-sourced Codex CLI and the Model Context Protocol (MCP) to build that agent. In the process, we moved from passive code generation to active infrastructure operation, transforming the “stale CMDB” problem from a data entry job into a logic puzzle.
The Shift: From Code Generation to Infrastructure Operation with Codex CLI and MCP
The reason most CMDB initiatives fail is ambition. They try to track every switch port, virtual machine, and SaaS license simultaneously. The result is a data swamp: too much noise, not enough signal. We took a different approach. We drew a small circle around a specific domain: Kubernetes workloads. Kubernetes is the perfect testing ground for AI agents because it’s high-velocity and declarative. Things change constantly. Pods die; deployments roll over; services change selectors. A static script struggles to distinguish between a CrashLoopBackOff (a temporary error state) and an intentional scale-down. We hypothesized that a large language model (LLM), acting as an operator, could understand this nuance. It wouldn’t just copy data; it would interpret it.
The Codex CLI turned this hypothesis into a tangible architecture by enabling a shift from “code generation” to “infrastructure operation.” Instead of treating the LLM as a junior programmer that writes scripts for humans to review and run, Codex empowers the model to execute code itself. We provide it with tools, executable functions that act as its hands and eyes, via the Model Context Protocol. MCP defines a clear interface between the AI model and the outside world, allowing us to expose high-level capabilities like cmdb_stage_transaction without teaching the model the complex internal API of our CMDB. The model learns to use the tool, not the underlying API.
The architecture of agency
Our system, which we call k8s-agent, consists of three distinct layers. This isn’t a single script running top to bottom; it’s a cognitive architecture.
The cognitive layer (Codex + contextual instructions): This is the Codex CLI running a specific system prompt. We don’t fine-tune the model weights. Infrastructure moves too fast for fine-tuning: a model trained on Kubernetes v1.25 would be hallucinating by v1.30. Instead, we use context engineering, the art of designing the environment in which the AI operates. This involves tool design (creating atomic, deterministic functions), prompt architecture (structuring the system prompt), and knowledge architecture (deciding what information to hide or expose). We feed the model a persistent context file (AGENTS.md) that defines its persona: “You are a meticulous infrastructure auditor. Your goal is to ensure the CMDB accurately reflects the state of the Kubernetes cluster. You must prioritize safety: do not delete records unless you have positive confirmation that they are orphans.”
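To make the idea concrete, a context file of this kind might look like the following. This is an illustrative excerpt, not the actual file we ship; the section headings and rule wording are hypothetical.

```markdown
# AGENTS.md — k8s-agent persona (illustrative excerpt)

You are a meticulous infrastructure auditor. Your goal is to ensure the
CMDB accurately reflects the state of the Kubernetes cluster.

## Safety rules
- Never write directly to the production CMDB; propose changes only
  through the cmdb_stage_* tools.
- Do not delete records unless you have positive confirmation that they
  are orphans (for example, a recorded deletion event).
- Track workloads (Deployments, StatefulSets), not Pods.
```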
The tool layer: Using MCP, we expose deterministic Python functions to the agent.
- Sensors: k8s_list_workloads, cmdb_query_service, k8s_get_deployment_spec
- Actuators: cmdb_stage_create, cmdb_stage_update, cmdb_stage_delete
Note that we track workloads (Deployments, StatefulSets), not Pods. Pods are ephemeral; tracking them in a CMDB is an antipattern that creates noise. The agent understands this distinction, a semantic rule that’s hard to enforce in a rigid script.
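A minimal sketch of one sensor in the tool layer, under assumptions: the `Workload` data model is invented for illustration, and the function takes the cluster state as an argument so the sketch is self-contained (in practice it would query the Kubernetes API and be registered with an MCP server).

```python
from dataclasses import dataclass

# Deliberately track durable workloads only; Pods are excluded per the
# "workloads, not Pods" rule described in the article.
TRACKED_KINDS = {"Deployment", "StatefulSet"}

@dataclass(frozen=True)
class Workload:
    kind: str
    name: str
    namespace: str

def k8s_list_workloads(cluster_objects: list[Workload]) -> list[Workload]:
    """Sensor: return only the workloads worth recording in a CMDB."""
    return [o for o in cluster_objects if o.kind in TRACKED_KINDS]
```

Keeping each tool atomic and deterministic like this is what lets the probabilistic model above it be audited: a given cluster state always produces the same tool output.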
The state layer (the safety net): LLMs are probabilistic; infrastructure must be deterministic. We bridge this gap with a staging pattern. The agent never writes directly to the production database. It writes to a staged diff. This allows a human (or a policy engine) to review the proposed changes before they’re committed.
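The staging pattern can be sketched as follows. This is a toy model under stated assumptions: the “database” is a plain dict, and the class and method names mirror the article’s actuator tools but are otherwise invented.

```python
class StagedTransaction:
    """Actuators append proposed operations; nothing touches the database
    until a reviewer (human or policy engine) approves the diff."""

    def __init__(self, db: dict):
        self.db = db
        self.ops: list[tuple] = []  # (action, key, value, comment)

    def cmdb_stage_create(self, key, value, comment=""):
        self.ops.append(("create", key, value, comment))

    def cmdb_stage_delete(self, key, comment=""):
        self.ops.append(("delete", key, None, comment))

    def commit(self, approved: bool) -> bool:
        if not approved:
            return False  # rejected: production state is untouched
        for action, key, value, _comment in self.ops:
            if action == "create":
                self.db[key] = value
            elif action == "delete":
                self.db.pop(key, None)
        return True
```

The design choice worth noting is that rejection is the default path: an unreviewed diff has no effect at all, which is what makes a probabilistic agent safe to run against production records.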
The OODA Loop in Action
How does this differ from a standard sync script? A script follows a linear path: Connect → Fetch → Write. If any step fails or returns unexpected data, the script crashes or corrupts data. Our agent follows the Observe-Orient-Decide-Act (OODA) loop, popularized by military strategists. Unlike a linear script that executes blindly, the OODA loop forces the agent to pause and synthesize information before taking action. This cycle allows it to handle incomplete data, verify assumptions, and adapt to changing conditions, traits essential for operating in a distributed system.
Let’s walk through a real scenario we encountered during our pilot, the Ghost Deployment, to explore the benefits of using an OODA loop. A developer had deleted a deployment named payment-processor-v1 from the cluster but forgot to remove the record from the CMDB. A standard script might pull the list of deployments, see that payment-processor-v1 is missing, and immediately issue a DELETE to the database. The risk is obvious: what if the API server was just timing out? What if the script had a bug in its pagination logic? The script blindly destroys data based on the absence of evidence.
The agent approach is fundamentally different. First, it observes: calling k8s_list_workloads and cmdb_query_service and noticing the discrepancy. Second, it orients: checking its context instructions to “verify orphans before deletion” and deciding to call k8s_get_event_history. Third, it decides: seeing a “delete” event in the logs, it reasons that the resource is missing because it was deliberately deleted. Finally, it acts: calling cmdb_stage_delete with a comment confirming the deletion. The agent didn’t just sync data; it investigated. It handled the ambiguity that usually breaks automation.
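Spelled out as code, the loop for this one scenario might look like the sketch below. In the real system the LLM chooses these steps at runtime; here they are hardcoded so the control flow is visible. The tool names come from the article, but their signatures, the event schema, and the return values are assumptions made for illustration.

```python
def reconcile_missing(name, k8s_list_workloads, cmdb_query_service,
                      k8s_get_event_history, cmdb_stage_delete):
    # Observe: compare cluster state with the CMDB record.
    in_cluster = name in k8s_list_workloads()
    in_cmdb = cmdb_query_service(name) is not None
    if in_cluster or not in_cmdb:
        return "no-op"  # no discrepancy to reconcile

    # Orient: policy says "verify orphans before deletion", so gather
    # evidence instead of trusting the absence of the resource.
    events = k8s_get_event_history(name)

    # Decide: absence plus a recorded deletion event = confirmed orphan.
    if any(e.get("reason") == "Deleted" for e in events):
        # Act: stage (never execute) the deletion for review.
        cmdb_stage_delete(name, comment="confirmed via deletion event")
        return "staged-delete"

    # Ambiguous evidence: refuse to destroy data, escalate instead.
    return "escalate"
```

The key difference from a linear script is the third branch: when the evidence is ambiguous, the loop exits without acting rather than issuing a DELETE on the absence of proof.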
Solving the “Semantic Gap”
This specific Kubernetes use case highlights a broader problem in IT operations: the “semantic gap.” The data in our infrastructure (JSON, YAML, logs) is full of implicit meaning. A label “env: production” changes the criticality of a resource. A status of CrashLoopBackOff means “broken,” but Completed means “finished successfully.” Traditional scripts require us to hardcode every permutation of this logic, resulting in thousands of lines of unmaintainable if/else statements. With the Codex CLI, we replace those thousands of lines of code with a few sentences of English in the system prompt: “Ignore Jobs that have completed successfully. Sync failing Jobs so we can track instability.” The LLM bridges the semantic gap. It understands what “instability” implies in the context of a job status. We’re describing our intent, and the agent is handling the implementation.
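For contrast, here is the kind of hardcoded status logic that the prompt sentence replaces. The phase strings follow common Kubernetes conventions, but this rule set is an illustrative fragment, not a real script; the point is that every branch must be maintained by hand.

```python
def should_sync(kind: str, phase: str) -> bool:
    """Hand-maintained semantic rules: the brittle alternative to
    expressing intent in the system prompt."""
    if kind == "Job":
        if phase == "Complete":
            return False  # finished successfully: ignore per policy
        if phase == "Failed":
            return True   # failing Jobs are synced to track instability
    if phase == "CrashLoopBackOff":
        return True       # "broken" always matters
    # ...real scripts accumulate dozens more branches like these
    return False
```

Every new resource type, phase, or policy nuance grows this function; the agentic approach moves that nuance into one English sentence instead.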
Scaling Beyond Kubernetes
We started with Kubernetes because it’s the “hard mode” of configuration management. In a production environment with thousands of workloads, things change constantly. A standard script sees a snapshot and often gets it wrong. An agent, however, can work through the complexity. It might run its OODA loop multiple times to resolve a single issue, checking logs, verifying dependencies, and confirming rules before it ever makes a change. This ability to chain reasoning steps allows it to handle the scale and uncertainty that break traditional automation.
But the pattern we established, agentic OODA loops via MCP, is universal. Once we proved the model worked for Pods and Services, we realized we could extend it. For legacy infrastructure, we can give the agent tools to SSH into Linux VMs. For SaaS management, we can give it access to Salesforce or GitHub APIs. For cloud governance, we can ask it to audit AWS Security Groups. The beauty of this architecture is that the “brain” (the Codex CLI) stays the same. To support a new environment, we don’t need to rewrite the engine; we just hand it a new set of tools. However, moving to an agentic model forces us to confront new trade-offs. The most immediate is cost versus context. We learned the hard way that you shouldn’t give the AI the raw YAML of a Kubernetes deployment: it consumes too many tokens and distracts the model with irrelevant details. Instead, you create a tool that returns a digest, a simplified JSON object with only the fields that matter. This is context optimization, and it’s the secret to running agents cost-effectively.
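A digest tool of this kind can be sketched in a few lines. The field selection below is an assumption for illustration; in practice you keep whichever fields your agent actually reasons about. The input shape follows the standard Kubernetes Deployment manifest layout.

```python
def deployment_digest(manifest: dict) -> dict:
    """Collapse a raw Deployment manifest into a compact, token-cheap
    digest before it ever reaches the model's context window."""
    meta = manifest.get("metadata", {})
    spec = manifest.get("spec", {})
    status = manifest.get("status", {})
    return {
        "name": meta.get("name"),
        "namespace": meta.get("namespace", "default"),
        "env": meta.get("labels", {}).get("env"),  # criticality signal
        "replicas": spec.get("replicas"),
        "ready": status.get("readyReplicas", 0),
        "images": [
            c["image"]
            for c in spec.get("template", {})
                         .get("spec", {})
                         .get("containers", [])
        ],
    }
```

A few hundred lines of managed-fields, annotations, and status conditions reduce to a handful of keys, which is the difference between an agent that reasons cheaply and one that drowns in YAML.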
Conclusion: The Human in the Cockpit
There’s a fear that AI will replace the DevOps engineer. Our experience with the Codex CLI suggests the opposite. This technology doesn’t remove the human; it elevates them. It promotes the engineer from a “script writer” to a “mission commander.” The stale CMDB was never really a data problem; it was a labor problem. It was simply too much work for humans to track manually and too complex for simple scripts to automate. By introducing an agent that can reason, we finally have a mechanism capable of keeping up with the cloud.
We started with a small Kubernetes cluster. But the destination is an infrastructure that is self-documenting, self-healing, and fundamentally intelligible. The era of the brittle sync script is over. The era of infrastructure as intent has begun.
