Ask HN: How do you give AI agents real codebase context without burning tokens?

Working on a large Rust codebase. The token problem is real: Claude Code will happily spend $5 worth of tokens just trying to understand how two modules relate before writing a single line. Once context compaction kicks in, it gets worse: the agent loses the thread completely and starts grepping the same files again from scratch.

Approaches I've tried:

- Feeding CLAUDE.md / architecture docs manually: helps, but gets stale fast.

- Cursor's built-in indexing: breaks on monorepos, and I don't love proprietary code going to their servers.

- Basic MCP server with grep: works for exact matches, useless for semantic queries.

Eventually built something more serious: a local Tree-sitter indexer that builds a knowledge graph of file relationships and exposes it via MCP so agents query semantically instead of grepping blind. One tool call instead of 15 grep iterations. Published it here: https://github.com/Muvon/octocode
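To make the "graph instead of grep" idea concrete, here is a rough sketch of the general technique. It is not how octocode is actually implemented. It is written in Python for brevity, a simple regex over `use crate::...` lines stands in for a real Tree-sitter parse, and the module names and the `build_graph` / `dependents` helpers are all invented for illustration.

```python
import re
from collections import defaultdict, deque

# Hypothetical stand-in for a Tree-sitter pass: a regex that pulls the first
# path segment out of `use crate::<module>` statements in Rust source.
USE_RE = re.compile(r"^\s*(?:pub\s+)?use\s+crate::(\w+)", re.MULTILINE)


def build_graph(sources):
    """Map each module name to the set of crate modules it imports."""
    graph = {}
    for module, text in sources.items():
        graph[module] = set(USE_RE.findall(text)) - {module}
    return graph


def dependents(graph, target):
    """Return every module that directly or transitively imports `target`."""
    # Invert the import edges so we can walk from a module to its importers.
    reverse = defaultdict(set)
    for module, imports in graph.items():
        for imported in imports:
            reverse[imported].add(module)

    # Breadth-first walk over the reversed edges.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in reverse[current]:
            if parent not in seen and parent != target:
                seen.add(parent)
                queue.append(parent)
    return seen


# Toy crate layout (invented for illustration).
SOURCES = {
    "config": "pub struct Config;",
    "storage": "use crate::config::Config;\npub fn open(_: &Config) {}",
    "indexer": "use crate::storage;\nuse crate::config::Config;",
    "server": "use crate::indexer;",
    "cli": "use crate::config::Config;",
}

GRAPH = build_graph(SOURCES)
print(sorted(dependents(GRAPH, "storage")))  # ['indexer', 'server']
```

The point of the sketch is that a single `dependents` lookup answers "what could break if I touch this module?" in one call, which is the kind of query an MCP tool could expose in place of repeated grep passes. A real indexer would need proper parsing (nested paths, `mod` declarations, re-exports) and some semantic layer on top of the import edges.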

But genuinely curious what others are doing before I go deeper on it.

Three specific questions:

1. How do you handle the "ripple effect" problem — knowing that changing one file semantically affects others that aren't obviously linked?

2. Do you trust closed-source indexing with proprietary code, or have you gone local-first?

3. Has anyone gotten GraphRAG-style relationship mapping to work in practice at scale, or is it still mostly hype?

4 points | by donhardman 10 hours ago

2 comments

  • journal 16 minutes ago
    I feel like people will never understand because it's too easy to think that it should just work.

    Answer: the more you know yourself, the less you rely on the model.

    Do your own reasoning or you won't be able to afford to use these things soon.

  • brightside667 10 hours ago
    [dead]