Prompt Engineering
5 prompt engineering lessons from a year of production agents
What actually moves the needle on agent reliability, distilled from a year of shipping agentic workflows in production.
Most prompt engineering advice online is either too abstract to apply or too specific to transfer. Here’s what has actually held up across a year of production multi-agent systems.
1. Constrain the output format explicitly
Don’t ask for “a summary.” Ask for a summary in a specific structure, with explicit field names, and validate the response against that structure. Ambiguous asks produce ambiguous — and inconsistently parseable — outputs.
2. Give agents a narrow job, not a broad one
A single agent trying to research, write, and review will do all three worse than three agents each doing one. Narrow responsibility makes failures easy to localize.
3. Put the constraints before the task
Models weight instructions that appear close to the actual task more heavily. State hard constraints immediately before you ask for the output, not buried in a system prompt three paragraphs earlier.
4. Reviewer agents need concrete criteria
“Improve this” is not a rubric. Give reviewer agents specific, checkable criteria — length, tone, citation requirements — or they’ll produce vague, unhelpful feedback loops.
5. Version your prompts like code
Prompts are production logic. Track changes, write down why a prompt changed, and be able to roll back a regression the same way you would a bad deploy.
Takeaway
None of this is exotic. It’s closer to normal software engineering discipline — clear contracts, narrow scope, versioning — applied to a probabilistic component instead of a deterministic one.
Related posts
What is MCP, and why every AI engineer should care
The Model Context Protocol standardizes how LLMs connect to tools and data. Here's why that matters for anyone building agentic systems.
Graph-RAG vs. vector RAG: when hybrid retrieval actually wins
Vector search alone misses multi-hop relationships. Here's when Graph-RAG earns its added complexity — and when it doesn't.