Config management in an agentic world

At the end of last year, Progress announced that the open-source Chef Infra Server is being retired: deprecated now, with end-of-life in November 2026 and final updates ending in October 2026. Chef Infra Client, InSpec, Chef Workstation and the wider open-source ecosystem continue to be maintained, so this is not the disappearance of Chef as a whole. Still, for me, the retirement of the open-source server marks more than the end of a piece of infrastructure software. It feels like the end of a particular way of thinking about systems.

I came to Chef very early. I wrote one of the first books about it, spent years teaching it, and built a career around the idea that infrastructure should be treated as software. Like many engineers of my generation, I genuinely believe the configuration management movement changed our industry for the better.

It taught us that systems should be described rather than hand-built, that automation should be tested, that infrastructure belonged in version control, and that reproducibility mattered. Those ideas won, and they won so thoroughly that the tools which first embodied them now feel like products of another era.

Kubernetes didn't solve configuration management

It's tempting to think that Kubernetes made configuration management obsolete. After all, if every application is a container, every deployment is declarative, and every cluster continuously reconciles itself, what role is left for Chef or Puppet?

Yes, in a cloud-native world most people are running containers. CNCF's annual survey has container adoption at over 90% of organisations, which is hardly a niche. But even so, a significant amount of computing still runs on bare metal or virtual machines. The reasons are messy and real: sometimes genuine performance needs, sometimes plain organisational intransigence, sometimes licensing or vendor constraints, sometimes FUD and sometimes because the thing is just a bad fit for a container.

So there is still a real, live need to configure things that look like operating systems. And if you are not reaching for Chef to do it, then what? Ansible? Nix? Or maybe there is a new way to think about the problem.

Where Kubernetes does apply, though, it did not abolish convergence, the loop that continuously drives the observed state of a system toward the state you declared. It absorbed it into the controller.

Linux has no equivalent. A package manager isn't continuously reconciling installed software. A configuration file isn't watching itself for drift. Systemd isn't recreating deleted users or repairing broken configuration. The operating system is fundamentally mutable.

Historically, configuration management systems solved this by bringing their own convergence loop. Chef, Puppet and CFEngine ran that loop as an agent on the machine; Ansible did the same work on demand, pushed from a control node rather than run by a daemon. For twenty years, that was the right abstraction: bring a convergence loop to a mutable operating system.
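To make "bring a convergence loop to a mutable operating system" concrete, here is a minimal sketch in Python. It is not how Chef, Puppet or CFEngine are implemented; `converge_file` is a hypothetical name, and a single file stands in for every resource an agent would manage. The point is only the shape: inspect first, act only on a difference, and be safe to run again.

```python
import tempfile
from pathlib import Path


def converge_file(path: Path, desired: str) -> bool:
    """Make the file at `path` contain `desired`.

    Returns True if a change was made, False if the file was already correct.
    Hypothetical helper for illustration, not part of any real tool.
    """
    # Observe: what is actually on disk right now?
    current = path.read_text() if path.exists() else None
    # Compare: if reality already matches the declaration, do nothing.
    if current == desired:
        return False
    # Act: perform the imperative work needed to close the gap.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(desired)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        motd = Path(tmp) / "etc" / "motd"
        print(converge_file(motd, "managed\n"))  # first run creates the file
        print(converge_file(motd, "managed\n"))  # second run finds nothing to do
        motd.write_text("edited by hand\n")      # someone introduces drift
        print(converge_file(motd, "managed\n"))  # next run repairs it
```

An agent-style tool amounts to running a function like this on a timer for every declared resource; a push-style tool runs the same check when someone asks it to.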

When you write

replicas: 3

in Kubernetes, you are not describing how to create three running containers. You are describing the outcome you want, and delegating the ongoing work of making reality match that declaration.

The controller is doing almost all of the interesting work. It watches the world, compares it to the goal, and takes action when they diverge. That is convergence. It is also exactly what the Chef client did: wake up, inspect the machine, perform whatever imperative operations are necessary, and keep working until the observed state matches the desired one.
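The watch, compare, act cycle can be sketched in a few lines of Python. This is a toy, not the Kubernetes controller code: `Cluster`, `start_pod`, `stop_pod` and `reconcile` are all names made up for illustration, and a list of strings stands in for the real observed state.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for the observed world: the names of running pods."""
    pods: list[str] = field(default_factory=list)
    _counter: int = 0

    def start_pod(self) -> None:
        self._counter += 1
        self.pods.append(f"pod-{self._counter}")

    def stop_pod(self) -> None:
        self.pods.pop()


def reconcile(cluster: Cluster, desired_replicas: int) -> int:
    """One pass of the loop. Returns how many actions were needed."""
    actions = 0
    # Too few running: start pods until the count matches the declaration.
    while len(cluster.pods) < desired_replicas:
        cluster.start_pod()
        actions += 1
    # Too many running: stop pods until the count matches the declaration.
    while len(cluster.pods) > desired_replicas:
        cluster.stop_pod()
        actions += 1
    return actions


if __name__ == "__main__":
    cluster = Cluster()
    print(reconcile(cluster, 3), cluster.pods)  # converges from nothing
    cluster.pods.pop()                          # a pod dies
    print(reconcile(cluster, 3), cluster.pods)  # the loop replaces it
```

Nothing in `reconcile` says how the cluster got into its current state. It only looks at the gap between observed and desired, which is the property the Chef client and a Kubernetes controller share.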

The difference is not the mechanism. The difference is where the mechanism lives. With Chef, the convergence loop was a thing you installed on the machine and ran. With Kubernetes, it is part of the platform, always on, and mostly out of sight. Kubernetes did not remove convergence. It hid it inside the controller. In that sense, k8s is the new chef-client.

Chef coupled three different ideas

Looking back, I think Chef's biggest strength was also one of its architectural limitations.

A Chef run fused three things: desired state, the outcome we wanted; convergence, the imperative work required to get there; and observed state, what actually happened when it ran. They lived together inside a cookbook and a converge run. Recipes expressed intent while simultaneously implementing it, and the client executed the reconciliation loop. Afterwards you had logs, node attributes and perhaps some reports, but the observed state of the system was never really a first-class object. That limitation matters more when the thing building on that state is an agent, not a human reading a log. Once the run finished, what remained was largely the confidence that it had succeeded.

At the time, that seemed entirely natural. I am no longer convinced it is.

A different decomposition

Chef fused those three ideas because a human had to hold the whole run in their head. A recipe reads top to bottom like a story: this package, then this service, then this file, converging toward a machine you could picture. Convergence was the centre of gravity because convergence was the part a person needed to follow. The tool was optimised for human comprehension of how the change would happen.

Over the last few weeks I've been experimenting with Swamp, which starts by pulling those three ideas back apart. Desired state becomes typed model definitions. Convergence becomes methods and workflows that do the unavoidable imperative work. Observed state becomes immutable, versioned, queryable evidence — not a by-product of the run, but a first-class thing you can ask questions of afterwards.
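To show the shape of that separation, here is a sketch in Python. It is emphatically not Swamp's API or data model, which this article does not describe in detail; `PackageModel`, `Evidence`, `EvidenceStore` and `install` are hypothetical names, and a dictionary stands in for the machine. The three pieces are a typed declaration, an imperative method, and an append-only record of what was observed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class PackageModel:
    """Desired state: a typed declaration with no behaviour attached."""
    name: str
    version: str


@dataclass(frozen=True)
class Evidence:
    """Observed state: an immutable record of what was actually seen."""
    subject: str
    observed_version: Optional[str]
    revision: int
    observed_at: datetime
    produced_by: str


class EvidenceStore:
    """Append-only history: records are superseded, never edited."""

    def __init__(self) -> None:
        self._records: list[Evidence] = []

    def history(self, subject: str) -> list[Evidence]:
        return [e for e in self._records if e.subject == subject]

    def latest(self, subject: str) -> Optional[Evidence]:
        past = self.history(subject)
        return past[-1] if past else None

    def record(self, subject: str, observed_version: Optional[str],
               produced_by: str) -> Evidence:
        evidence = Evidence(subject, observed_version,
                            len(self.history(subject)) + 1,
                            datetime.now(timezone.utc), produced_by)
        self._records.append(evidence)
        return evidence


def install(model: PackageModel, machine: dict[str, str],
            store: EvidenceStore) -> Evidence:
    """Convergence: do the imperative work, then record what is now true."""
    if machine.get(model.name) != model.version:
        machine[model.name] = model.version  # stand-in for a package manager call
    # Record what was observed on the machine, not what the model asked for.
    return store.record(model.name, machine.get(model.name), produced_by="install")


if __name__ == "__main__":
    machine: dict[str, str] = {}
    store = EvidenceStore()
    install(PackageModel("nginx", "1.24"), machine, store)
    install(PackageModel("nginx", "1.26"), machine, store)
    print(store.latest("nginx"))
    print(len(store.history("nginx")))
```

The design choice worth noticing is that `install` returns evidence rather than a success flag, and that earlier revisions stay queryable after later ones are written.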

When I talked this through with Adam Jacob — who co-founded Chef, and co-founded Swamp — he put the shift in a way that stuck with me. The previous generation of tools, he said, was about making declarations easier for humans to write. The next is about making the process an agent follows secure, deterministic and efficient.

It reframes the question you ask after a run. For twenty years that question was "did it succeed?", meaning: did the human's intended change land? Swamp encourages a different one: "what do we now know?"

Every operation leaves durable evidence with provenance: a record of the system as it was actually observed, not as it was assumed to be. A human reading a converge log wanted reassurance. An agent building on an earlier step needs fact.

Why an agent needs them separated

A human running Chef could live with those concerns fused together. You held the run in your head, watched the story unfold, and read the log for reassurance that the machine ended up where you expected. That was not a flaw so much as the operating assumption of the tool.

An agent cannot work that way, especially not an agent picking up where another one left off. It cannot sensibly re-derive the world from a converge log. It needs to know, as fact, what is actually true now: what exists, what changed, and what current state it is allowed to build on. When observed state is only a by-product of a run, the next actor has nothing solid underneath it. It has to inspect again, guess, or trust that "it succeeded."

So the decomposition is not tidier architecture for its own sake. Typed models, schemas, immutable evidence and reviewable workflows matter because they turn observed state into something durable enough to compose with. That is the primitive the agentic world needs: not just automation that can make a change, but evidence an agent can safely inherit.
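The handoff between agents can be sketched the same way. What follows is an illustration under assumptions of my own, not a description of how Swamp or any other tool decides this: the `Evidence` record, the fifteen-minute freshness window and the set of trusted producers are all hypothetical. It shows only that a record with provenance gives the next actor something to reason about, where a bare "it succeeded" does not.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    """An immutable observation left behind by an earlier step."""
    subject: str
    value: str
    observed_at: datetime
    produced_by: str


def usable(evidence: Optional[Evidence], now: datetime,
           max_age: timedelta, trusted: frozenset[str]) -> bool:
    """True if the record is recent enough and came from a known producer."""
    if evidence is None:
        return False
    fresh = now - evidence.observed_at <= max_age
    return fresh and evidence.produced_by in trusted


def next_step(evidence: Optional[Evidence], now: datetime,
              max_age: timedelta = timedelta(minutes=15),
              trusted: frozenset[str] = frozenset({"install"})) -> str:
    """Decide whether to build on inherited evidence or look again."""
    if usable(evidence, now, max_age, trusted):
        return "build-on-evidence"
    return "re-inspect"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    recent = Evidence("nginx", "1.24", now - timedelta(minutes=5), "install")
    print(next_step(recent, now))
    print(next_step(None, now))
```

The policy itself is arbitrary here. What matters is that the decision is made from recorded fact, with a timestamp and a producer attached, rather than from a guess.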

Configuration management isn't over

If autonomous agents become routine participants in software delivery, the central question is no longer simply "how do we declare desired state?" It becomes "what primitives do we give an agent so that it can safely automate a complex system?"

None of this makes convergence disappear. Machines still need packages installed, configuration files rendered, and services restarted. Those operations remain stubbornly imperative, just as they were in 1996.

What may be changing is where we choose to place the centre of gravity. For most of the last twenty years, configuration management revolved around convergence. In an era where automation is increasingly composed and executed by autonomous agents, convergence may become just another method on a model.

Perhaps the more important primitive is evidence.

I do not know yet whether the industry is really heading there. What I can say is that, from where I am standing, the shift feels real. For the first time since Chef convinced me that infrastructure should be treated as software, I have the feeling that the underlying model has moved again.
