Q: Does including AGENTS.md increase token usage and cost, and how do you prevent context overflow?
Yes, including AGENTS.md (or similar rule files like CLAUDE.md) does consume tokens, but the practical impact is typically manageable with modern LLMs. The key concerns are the initial token cost and eventual context limit exhaustion during long conversations.
Token Consumption
The AGENTS.md file (along with other initial context such as the system prompt) appears once in the conversation context rather than being duplicated with every message, though it counts toward the total token budget for the conversation's entire lifetime. Modern models have massive context windows (often 200,000 tokens, and sometimes 1 million), making it difficult to fill one with a single requirements document.
The Real Risk: Context Limit and Information Loss
The primary concern isn’t the initial cost, but what happens when a long conversation eventually hits the context limit. At that point, the agent must discard information, either by:
Dropping old messages: A very lossy approach that removes historical context entirely.
Summarizing the conversation: Less lossy but still imperfect — no matter how good the summarization, some information is inevitably lost.
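The two strategies above can be sketched in a few lines. This is a minimal illustration, assuming a conversation modeled as a list of strings and the same ~4 chars/token heuristic; in a real agent, the summary would be produced by an LLM call rather than the string stub used here.

```python
# Two ways to recover when a conversation exceeds the context budget.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude ~4 chars/token heuristic

def trim_by_dropping(messages: list[str], budget: int) -> list[str]:
    """Very lossy: discard the oldest messages until the rest fit."""
    kept = list(messages)
    while kept and sum(estimate_tokens(m) for m in kept) > budget:
        kept.pop(0)  # oldest context is removed entirely
    return kept

def trim_by_summarizing(messages: list[str], budget: int) -> list[str]:
    """Less lossy, still imperfect: collapse the older half into a summary.
    The summary here is a placeholder; real agents generate it with an LLM."""
    if sum(estimate_tokens(m) for m in messages) <= budget:
        return messages
    half = len(messages) // 2
    summary = "Summary of earlier turns: " + " / ".join(m[:30] for m in messages[:half])
    return [summary] + messages[half:]
```

Dropping loses the old turns outright; summarizing keeps a compressed trace of them, but any detail omitted from the summary is gone just the same.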
Strategy: Be Mindful, Not Stressed
Adopt a balanced approach:
Curate carefully when combining many large sources (multiple MCPs, entire libraries, extensive documentation).
Don’t over-optimize for a single requirements document like AGENTS.md. The clarity and guidance it provides almost always justify the token cost.
Practical Takeaway
For most use cases, the benefits of a well-written AGENTS.md (clear instructions, reduced ambiguity, better agent behavior) far outweigh the token cost. Focus on clarity and completeness in your requirements document, and only worry about aggressive curation when you’re combining multiple large context sources or anticipating very long conversations.