For a while I used coding agents for small, self-contained things. Refactor this method. Rename this variable. Extract this into a function. Safe, short instructions where the agent couldn't go too far off track.
But I kept seeing people push further — giving agents a longer leash and coaxing them into handling real features end to end. And it was working. The gap between what I was doing and what seemed possible was growing.
So I started trying. And the results were mixed. Sometimes the agent would nail it. Other times it would go in circles, making reasonable-looking choices that were completely wrong for the project. Wrong abstraction. Wrong assumptions about the data model.
The difference, I've come to believe, is context. Not prompt engineering tricks, not model selection. Context — how much of it you provide, and how you provide it.
The colleague problem
When I sit down with a colleague to discuss a feature, we share a tremendous amount of context without even thinking about it. We both know the codebase. We both know the users. We both know the last three decisions that shaped the architecture. We both know which parts of the system are fragile and which are rock solid.
An agent knows none of this. It starts every conversation from zero. And yet I routinely catch myself typing something like "this thing is happening and it shouldn't" and expecting the agent to make the same choices I would.
It won't. It can't. Not because it's incapable, but because it doesn't have the information it needs.
What actually helps
I've started paying attention to what makes a prompt produce good results versus mediocre ones. It's not about writing more. It's about including the right things:
- What I'm building and why. The why matters more than I expected. "Add a login page" and "add a login page because we need to gate access to paid features and our users are non-technical" lead to very different implementations.
- Constraints. What libraries am I already using? What patterns does the codebase follow? What can't change? I usually provide these in the form of project documentation.
- Decisions already made. If I've already decided on JWT over sessions, I say so. Otherwise the agent might pick the other one and I'll waste twenty minutes unwinding it.
- What I'm unsure about. This is the one I used to skip. But telling the agent where my thinking is still fuzzy turned out to be incredibly valuable. It fills gaps I didn't know I had.
- How to know when it's done. This one changed everything for me. If you can tell the agent "run this test suite" or "check that the API returns 200 on this endpoint," you've closed the loop. The agent can try something, verify it, and try again if it didn't work. Without a definition of done, it generates code and hopes. With one, it iterates.
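
To make that concrete: the check can be as simple as a script that exits zero when the thing works. Here's a minimal sketch in Python, assuming a local dev server and a hypothetical /api/health endpoint (both placeholders, not anything from a real project). The agent runs it, reads the exit code, and knows whether to keep going.

```python
# Minimal "definition of done" check. The URL and endpoint are hypothetical;
# swap in whatever your project actually exposes.
import sys
import urllib.request

URL = "http://localhost:8000/api/health"  # placeholder endpoint

def check(url: str) -> bool:
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except Exception as err:  # connection errors, non-2xx responses, timeouts
        print(f"check failed: {err}")
        return False

if __name__ == "__main__":
    ok = check(URL)
    print("done" if ok else "not done")
    sys.exit(0 if ok else 1)
```

The specific check matters far less than the fact that it's binary: pass or fail, with no human in the loop.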

And that last point is where things start to feel different. When the agent can check its own work, it stops feeling like a text generator and starts feeling like something you can harness — a wild creature that can produce thousands of lines of code per second, where your job is to constrain its execution, point it in the right direction, and let it run.
The agent as a thinking partner
The shift that changed my workflow the most was to stop treating the agent as a code generator and start treating it as a thinking partner.
I used to try to write a complete spec before handing anything to the agent. That felt responsible. But it was often slower and worse than writing a rough one and refining it together. The agent spots gaps in my thinking. It asks about edge cases I haven't considered. It surfaces assumptions I didn't know I was making.
My workflow used to be: think hard, write complete spec, hand it to the agent, get code back. Now it's more like: write down what I know, let the agent challenge and expand it, arrive at a better spec together, then build from that shared understanding.
The second approach produces better specs and better code, because the agent understands the reasoning behind the decisions, not just the decisions themselves.
Building this into the process
I liked this idea enough that I built it into a reusable skill I call `interview-spec`. You point it at a rough specification file, and instead of immediately generating code, it interviews you.

First, it reads the spec and identifies what's there and, crucially, what's missing. It looks for gaps in error handling, edge cases, security considerations, performance implications, and operational concerns.
Then it starts asking questions. Not obvious ones. It asks about race conditions in the data model. About what happens when the network fails halfway through a transaction. About how to roll this back if something goes wrong in production. About which parts are truly MVP and which are nice-to-haves that got unconsciously promoted to requirements.
It asks 2-4 questions at a time, drills deeper based on answers, and periodically summarizes what it has learned to make sure we're on the same page.
By the end, it updates the original spec with everything that was uncovered and produces a prioritized task breakdown with complexity estimates and dependencies.
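
To be clear, the skill itself is a set of instructions the agent follows, not a program. But if it helps to picture the shape of the loop, here's a rough Python sketch of it; the gap areas, question wording, and file handling are mine, purely for illustration, and the real thing also produces the task breakdown, which I've left out.

```python
# Illustrative sketch of the interview loop, not the actual interview-spec skill.
from pathlib import Path

GAP_AREAS = [  # the kinds of gaps the skill hunts for in a spec
    "error handling", "edge cases", "security",
    "performance", "operations and rollback",
]

def interview(spec_path: str, batch_size: int = 3) -> None:
    spec = Path(spec_path).read_text()
    notes: list[str] = []

    for area in GAP_AREAS:
        # The real skill derives questions from the spec itself; these are placeholders.
        questions = [
            f"What should happen when {area} goes wrong here?",
            f"Is {area} truly part of the MVP, or a nice-to-have?",
        ][:batch_size]
        for q in questions:
            answer = input(f"{q}\n> ").strip()
            if answer:
                notes.append(f"- {area}: {answer}")
        # Periodically reflect back what has been learned so far.
        print("\nWhat I have so far:\n" + "\n".join(notes) + "\n")

    # Fold the uncovered decisions back into the original spec file.
    updated = spec + "\n\n## Decisions from the interview\n" + "\n".join(notes) + "\n"
    Path(spec_path).write_text(updated)
    print(f"Updated {spec_path} with {len(notes)} new notes.")

if __name__ == "__main__":
    interview("spec.md")  # hypothetical spec file
```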
The key insight for me was that this back-and-forth is not overhead. It's the actual work. The conversation is the specification process. Every question the agent asks is a question that would have surfaced eventually — probably at a much more expensive time, like during implementation or, worse, in production.
Making context persistent
This goes beyond individual conversations. I've started making context available in places the agent can find on its own:
- An `AGENTS.md` file (which I symlink to `CLAUDE.md`) in each repository root with project conventions, architecture decisions, and patterns to follow.
- Clear commit messages and PR descriptions. The agent reads those too.
- Documentation that captures why things are the way they are. Not exhaustive docs — just the decisions that matter.
Every piece of context I make available is a decision the agent doesn't have to guess about.
Letting the agent find its own context
There's another dimension to this that took me a while to appreciate: you don't have to hand the agent all the context yourself. You can let it go get it.
Tools are how an agent builds its own context. Every tool call is the agent reaching out into the world to learn something it didn't know. Reading a file, running a command, hitting an API — these aren't just actions, they're the agent filling in its own gaps.
And the interesting part is that you don't always need formal tool integrations for this. Sometimes just reminding the agent what's available on the system is enough. If `gh` is installed and configured, telling the agent it can use it means it can traverse a PR, read its comments, check CI status, and piece together the full story behind a bug — all without me having to copy-paste any of that into the prompt.
I've started thinking about it this way: I provide the why and the constraints, and I make sure the agent has the tools to discover the what on its own. That split turns out to be a much better use of everyone's time than trying to front-load every detail myself.
The real cost
Providing good context takes effort. It forces me to articulate things that feel obvious and to confront the parts of my plan that are still vague. I can't hide behind ambiguity anymore.
But that effort isn't wasted. It's the same effort I'd otherwise spend debugging a misguided implementation, rewriting code that went in the wrong direction, or wondering why the agent "did it wrong" when the real problem was that I never said what "right" looked like.
Context is not a tax on using an agent. Context is the work. The agent just makes that truth impossible to ignore.