AI Agents are automation, grown up

Stop asking if agents work. Start asking what they should do.

For years, people have known which bits of their job are repetitive and could be automated. They just couldn’t quite get a computer to do it. The inputs were too messy, the logic was too contextual, and describing every edge case to a machine took more effort than just doing the thing.

LLMs have changed that. Automation could always send the email. What it couldn’t do was read the messy input that should have triggered the email, or write something contextually appropriate once triggered. Those two capabilities, reading unstructured text and producing a useful first draft, are what have shifted. A human still reviews and edits, but the starting point is no longer a blank page, and the input doesn’t have to be cleaned up by hand before a machine can act on it.

There’s a third capability worth mentioning: deciding what to do next. A traditional workflow follows a fixed sequence. An AI agent can choose its own next step. It can look at what it just found, decide whether the goal is met, and if not, choose a different action and try again. A coding agent that reads an error and works out how to fix it is the canonical example.

Four layers, not one platform

There isn’t a single agent platform you buy and plug in. There are four distinct layers, and seeing them separately makes it much easier to work out what you actually need.

The LLM API

At the bottom is the LLM API, which gives you access to a model like Claude or GPT-5. This is the bit doing the actual reading, writing and reasoning.

Agent frameworks

Above that are the agent frameworks, like LangChain, LangGraph and CrewAI. These handle the agent loop: goal, action, result, next step. You could write that loop yourself in fifty lines of Python, and many people do. Frameworks become useful when things get more complicated, like multiple agents handing off to each other, human approval steps mid-task, or complex retry logic. Whether they add more than they cost in complexity depends entirely on what you’re building.
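To make the "write it yourself" point concrete, here is a minimal sketch of that loop. Everything named in it is an assumption for illustration: `fake_llm` stands in for a real model call, `search_docs` is a made-up tool, and the step format (`action` and `input` keys) is a convention chosen for this example, not any framework’s API.

```python
from typing import Callable

# Hypothetical tool: in practice this would call a search index or an API.
def search_docs(query: str) -> str:
    return f"3 documents mention '{query}'"

# Registry of tools the agent is allowed to use (illustrative only).
TOOLS: dict[str, Callable[[str], str]] = {"search_docs": search_docs}


def fake_llm(goal: str, history: list[str]) -> dict:
    """Stand-in for a real LLM call: picks the next step from what has happened so far."""
    if not history:
        return {"action": "search_docs", "input": goal}
    return {"action": "finish", "input": f"Done: {history[-1]}"}


def run_agent(goal: str, decide: Callable[[str, list[str]], dict], max_steps: int = 5) -> str:
    """Goal, action, result, next step, with a hard cap on the number of steps."""
    history: list[str] = []
    for _ in range(max_steps):
        step = decide(goal, history)           # ask the model for the next step
        if step["action"] == "finish":         # the model judges the goal met
            return step["input"]
        tool = TOOLS.get(step["action"])
        if tool is None:                       # feed the mistake back so it can try again
            history.append(f"unknown tool {step['action']}")
            continue
        history.append(tool(step["input"]))    # run the action, record the result
    return "Stopped: step limit reached"


print(run_agent("refund policy", fake_llm))
```

Swapping `fake_llm` for a function that calls a real model is the only change needed to make this a working agent. The step cap is what stops a confused model from looping forever.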

Automation platforms

Above that are the automation tools like n8n, Make, GitHub Actions and Zapier. These have existed for years, and extend neatly to AI agents. They handle triggers, scheduling, and connecting services. An agent can be kicked off by something happening, like an email arriving, a form being submitted, or someone pressing a button inside a tool they already use. It can also run on a timer, like every morning at 8am. Or it can run on demand when someone types a request into a chat interface. What’s changed is that the workflows now have an LLM node in the middle.

The interface

At the top is the interface, or how people interact with the agent, which is almost never a purpose-built agent UI. More on that below.

How the layers combine

How these layers combine depends entirely on the task. The simplest case is something like a daily news digest. A GitHub Actions job runs every morning, searches for relevant news, passes the results to Claude, and sends you a summary by email. All four layers are nominally there, but the framework layer has almost nothing to do. The sequence is fixed and the LLM only needs to make one judgement call about what to flag.
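As a rough sketch of how little logic the digest needs, the whole thing can be one function with the model’s judgement passed in as a callable. The names here are hypothetical: `keyword_flag` is a crude stand-in for the LLM’s "is this worth flagging?" call, and fetching the news and sending the email are left out.

```python
from typing import Callable


def build_digest(headlines: list[str], flag: Callable[[str], bool]) -> str:
    """Fixed sequence: filter the headlines with one judgement call, then format an email body."""
    flagged = [h for h in headlines if flag(h)]
    if not flagged:
        return "Nothing worth flagging today."
    return "Today's digest:\n" + "\n".join(f"- {h}" for h in flagged)


def keyword_flag(headline: str) -> bool:
    # Stand-in for the LLM's judgement; a real version would ask the model.
    return "agent" in headline.lower()


print(build_digest(["New agent framework released", "Local weather update"], keyword_flag))
```

There is no loop and no branching beyond the empty case, which is why a scheduler plus one model call is enough here.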

A sales follow-up agent is a step up in complexity but the shape is similar. n8n watches your CRM for prospects who have gone quiet, and passes the contact history to Claude, which drafts a contextually appropriate email. That draft goes into Slack or appears as a Gmail draft for a human to approve before it goes anywhere. Still no framework needed, because the logic is straightforward and the value sits in the reading and writing, not in autonomous decision-making.
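The shape of that workflow can be sketched in a few lines. This is an illustration under stated assumptions, not how n8n or any CRM works: the `Prospect` fields, the 14-day definition of "gone quiet", and the `pending_approval` status are all invented for the example, and `draft` is a placeholder for the model call.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable


@dataclass
class Prospect:
    name: str
    last_contact: date
    history: str  # free-text contact history that would be passed to the model


def quiet_prospects(prospects: list[Prospect], today: date, quiet_days: int = 14) -> list[Prospect]:
    """Return prospects whose last contact is more than quiet_days ago (threshold is an assumption)."""
    cutoff = today - timedelta(days=quiet_days)
    return [p for p in prospects if p.last_contact < cutoff]


def drafts_for_approval(prospects: list[Prospect], today: date,
                        draft: Callable[[Prospect], str]) -> list[dict]:
    """Draft one email per quiet prospect and queue it for a human; nothing is sent here."""
    return [
        {"to": p.name, "body": draft(p), "status": "pending_approval"}
        for p in quiet_prospects(prospects, today)
    ]


def template_draft(p: Prospect) -> str:
    # Stand-in for the LLM, which would write from p.history instead of a template.
    return f"Hi {p.name}, just checking in after our last conversation."


crm = [
    Prospect("Ana", date(2025, 1, 1), "Asked about pricing, then went quiet."),
    Prospect("Ben", date(2025, 1, 28), "Demo booked for next week."),
]
queue = drafts_for_approval(crm, date(2025, 1, 31), template_draft)
print(queue)
```

The important design choice is that the function returns drafts with a pending status rather than sending anything, which keeps the human approval step in the path.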

A document assistant is where the framework layer starts to do real work. You describe what you need in a chat window, the agent retrieves the relevant sections of your guidance documents, reasons over them, sometimes loops back for clarification, and then drafts the appropriate clauses. That is retrieval, reasoning and conditional looping in one task. A retrieval-augmented generation (RAG) pipeline is relevant here in a way it wouldn’t be for the news digest. As more workflows need the agent to genuinely decide what to do next, rather than read and write inside a fixed sequence, the framework layer will go from optional to essential.
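Here is a toy version of the retrieval step, to show where the conditional loop comes from. It scores sections by plain word overlap, which is a deliberate simplification: real RAG pipelines usually use embeddings and a vector index. The section titles, the text and the "return `None` to ask for clarification" convention are all assumptions made for this sketch.

```python
import re
from typing import Optional


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, sections: dict[str, str], top_k: int = 2) -> list[str]:
    """Return titles of the sections sharing the most words with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(body)), title) for title, body in sections.items()]
    scored = [(score, title) for score, title in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by title
    return [title for _, title in scored[:top_k]]


def build_prompt(question: str, sections: dict[str, str], top_k: int = 2) -> Optional[str]:
    """Assemble the model prompt, or return None to signal 'ask the user to clarify'."""
    titles = retrieve(question, sections, top_k)
    if not titles:
        return None
    context = "\n\n".join(f"[{t}]\n{sections[t]}" for t in titles)
    return f"Context:\n{context}\n\nRequest: {question}"


guidance = {
    "Termination": "Either party may terminate with thirty days notice.",
    "Payment": "Invoices are due within thirty days.",
    "Liability": "Liability is capped at fees paid.",
}
print(retrieve("draft a termination notice clause", guidance))
```

The `None` branch is the part a fixed workflow can’t express well: when nothing relevant comes back, the agent has to go and ask a question instead of drafting.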

There’s no agent UI, and that’s a feature

The output of an agent typically appears somewhere that already exists, rather than through a whole new purpose-built UI. A draft in your inbox. A message in Slack. A document in Google Drive. A notification on your phone. The news digest arrives as an email. The sales follow-up appears as a draft in your email client. The document assistant produces a file in the place your other files live.

This matters. We could assume that a new interaction pattern is coming, with some purpose-built workspace where you go to talk to your fleet of agents. The evidence so far points the other way. Agents aren’t a new place to work. They’re a new capability inside the places you already work. The best deployments tend to feel almost invisible, because the output just appears where you already expected it.

The hard part is knowing what to build

The four layers are mostly off-the-shelf now. The frameworks are improving every month. The models are good enough. The hard part isn’t any of that. The hard part is working out what the agent should actually do.

That sounds obvious, and it gets skipped a lot. People reach for the tools before they’ve understood the work. The result is an agent that automates the wrong step, automates something nobody really wanted automated, or produces output in a format that doesn’t fit how people actually do their jobs.

Working this out properly means talking to the people who do the work today. Not a quick chat, but properly understanding what they do, why they do it, what they find tedious, what they’re worried about getting wrong, and where the judgement calls really sit. Often the task someone describes in the abstract is different from the task they actually do. The interesting bits, the bits that would benefit most from an agent, only come out when you watch or ask in detail.

This is the part of the work that doesn’t get talked about much, because it isn’t about the technology. But it’s where most of the value sits, and it’s why so many agent projects miss.

Where to start

The clearest places to look are tasks where a human currently reads something, makes a judgement, and then acts on it. Three questions help narrow it down.

Is the input messy? If it arrives as free text, emails, notes or transcripts, the LLM is doing helpful work. If it arrives as clean rows in a spreadsheet, you probably don’t need an agent. You need a script.

Is the output a first draft? Agents are at their best when a human stays in the loop and the agent shortens the path to something usable. They’re weaker when the output is the final word and nobody is going to check it.

Is the task currently getting skipped, or done badly, because nobody has time? Those are the highest-value targets. Automating something that’s already being done well saves time. Automating something that isn’t getting done at all unlocks capacity that wasn’t there before.
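If you have a list of candidate tasks, the three questions can be turned into a rough first-pass ranking. The sketch below simply counts the yes answers. Treating the questions as equally weighted is an assumption made here for simplicity, and the example tasks are invented.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    messy_input: bool          # free text, emails, notes, transcripts
    output_is_draft: bool      # a human reviews the result before it is used
    currently_neglected: bool  # skipped or done badly for lack of time


def yes_count(task: Task) -> int:
    """Number of the three questions answered 'yes' (equal weighting is an assumption)."""
    return sum([task.messy_input, task.output_is_draft, task.currently_neglected])


def rank(tasks: list[Task]) -> list[str]:
    """Task names ordered from most to fewest 'yes' answers; ties keep their input order."""
    return [t.name for t in sorted(tasks, key=yes_count, reverse=True)]


candidates = [
    Task("Reply to supplier emails", True, True, True),
    Task("Monthly spreadsheet rollup", False, False, False),
    Task("Meeting notes to actions", True, True, False),
]
print(rank(candidates))
```

A ranking like this is only a way to decide who to talk to first. It doesn’t replace watching how the work is actually done.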

The interesting question isn’t whether the tools are ready. It’s which of your existing tasks is the first one worth pointing them at. That question starts with the people doing the work, not the tools.