I've stumbled on the same workflow. Except for one thing: If I just do as OP does, Claude Code will tend to overengineer. For example it'll build complex solutions to super rare race conditions that have trivial fallout. But I've found that all it takes is a "skeptical pass". Here's how it goes: After having a bunch of specialist subagents review the (plan/implementation), after doing the deduplication/synthesis of their findings, the main agent will bucket them into A) Trivial/obvious fix B) there's multiple possible resolutions, but the LLM had a strong lean, so it went with it on its own C) Genuine ambiguity, where it asks me what to do (and presents its lean) and D) Wontfix. Crucially, after doing this, I have it run a "skeptical pass" where it takes a hard look at these findings and see if maybe some of them deserve to be downgraded. Generally, a lot of things make their way into wontfix this way. I find, I don't need to push back against overengineering, I can have the LLM do so itself, and it'll actually do a decent job of it.
Most writings about the spec-driven development I see start with a product requirements document that is assumed to be valid. But I doubt that's the case. If so, you would've written about it, and probably would've involved agents in the research that goes into it. My gut feeling tells me there's much more emphasis on implementing the feature than on questioning if it's relevant, feasible, and based on valid assumptions.
Personally it's well-defined and agentic - just not circulated.
/understand - agents interrogate the problem
/huddle - Thinking panel turns it into a PRD - attacks the premise, PRDs regularly die here
/tm - claude-task-master breaks the survivor into a dependency graph
Nobody writes this half up because "agent talked me out of building it" demos worse than "agent built it".
Sorry I have to ask. How senior are you? The notion that I‘d allow an agent to talk me out of something seems weird. 99% of cases, it’s the other way around. Architecture is just not where they shine.
What’s your process? My experience matches yours, but then again I usually just give a few lines to codex. I imagine if I tried harder to give detailed specs as input, the agent would have a lot more room to spot flaws and kill the plan.
This has been my attempt at wrangling the new A.I. assisted development that seems to be overtaking the software engineering profession. I jumped head first into LLM development after observing the trends from the last year and it appears this process might be a viable path forward.
I’ve had similar feelings how can I trust this if I no longer write the code directly.
I wrote an /assess tool. I designed it to be token light but assesses on everything I could do to regain trust and help AI to improve my code base not by add features but by adding discipline.
https://www.nrc.gov/docs/ML1433/ML14338A739.pdf
/understand - agents interrogate the problem /huddle - Thinking panel turns it into a PRD - attacks the premise, PRDs regularly die here /tm - claude-task-master breaks the survivor into a dependency graph
Nobody writes this half up because "agent talked me out of building it" demos worse than "agent built it".
I wrote an /assess tool. I designed it to be token light but assesses on everything I could do to regain trust and help AI to improve my code base not by add features but by adding discipline.