Show HN: CodeRLM – Tree-sitter-backed code indexing for LLM agents

(github.com)

77 points | by jared_stewart 1 day ago

9 comments

mkw5053 19 hours ago
Aider [0] wrote a piece about this [1] way back in Oct 2023!
I stumbled upon it in late 2023 when investigating ways to give OpenHands [2] better context dynamically.
[0] https://aider.chat/
[1] https://aider.chat/2023/10/22/repomap.html
[2] https://openhands.dev/
- emporas 9 hours ago
  Aider's repomap is a great idea. I remember participating in the discussion back then.
  The unfortunate thing for Python and other untyped/duck-typed languages, as the repomap post mentions, is that function signatures do not mean a lot.
  When it comes to Rust, it's a totally different story: function and method signatures convey a lot of important information. As a general rule, in every LLM query I include at most one function/method implementation, and everything else is function/method signatures.
  By not mindlessly giving LLMs whole files and implementations, I have never used more than 200,000 tokens/day, counting input and output. This amounts to 30 queries for a whole day of programming, and costs less than a dollar per day no matter which model I use.
  Anyway, having the agent build the repomap doesn't sound like such a great idea. Agents are horribly inefficient. It is better to build the repomap deterministically using something like ast-grep, and then let the agent read the resulting repomap.
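A rough sketch of the "signatures only" idea from the comment above, built deterministically with no agent involved. It assumes Python's standard `ast` module as a stand-in for ast-grep or tree-sitter; the function name `signature_map` and the sample source are made up for illustration.

```python
import ast

def signature_map(source: str) -> list[str]:
    """Return one line per function/method: line number plus signature, no body."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            args = ast.unparse(node.args)  # argument list, including annotations
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            entries.append((node.lineno, f"{keyword} {node.name}({args}){returns}"))
    # ast.walk is breadth-first, so sort to restore source order.
    return [f"{lineno}: {sig}" for lineno, sig in sorted(entries)]

# Hypothetical input file, used only to demonstrate the output shape.
SAMPLE = '''def load(path: str) -> bytes:
    return b""

class Repo:
    def files(self, root):
        return []
'''

if __name__ == "__main__":
    for line in signature_map(SAMPLE):
        print(line)
```

The resulting lines can be pasted into a prompt next to the single implementation under discussion. As the comment notes, the untyped `files(self, root)` entry carries far less information than the annotated `load`.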
  - jared_stewart 5 hours ago
    Typed languages definitely provide richer signal in their signatures - and my experience has been that I get more reliable generations from those languages.
    On the efficiency point, the agent isn't doing any expensive exploration here. There is a standalone server which builds and maintains the index; the agent is only querying it. So it's closer to the deterministic approach implemented in aider (at least in a conceptual sense) with the added benefit that the LLM can execute targeted queries in a recursive manner.
- jared_stewart 18 hours ago
  Aider's repo-map concept is great! Thanks for sharing, I'd not been aware of it. Using tree-sitter to give the LLM structural awareness is the right foundation IMO. The key difference is how that information gets to the model.
  Aider builds a static map, with some importance ranking, and then stuffs the most relevant part into the context window upfront. That's smart - but it is still the model receiving a fixed snapshot before it starts working.
  What the RLM paper crystallized for me is that the agent could query the structure interactively as it works. A live index exposed through an API lets the agent decide what to look at, how deep to go, and when it has enough. When I watch it work, it's not one or two lookups but many, each informed by what the previous one revealed. The recursive exploration pattern is the core difference.
  - anotherpaulg 17 hours ago
    Aider actually prompts the model to say if it needs to see additional files. Whenever the model mentions file names, aider asks the user if they should be added to context.
    As well, any files or symbols mentioned by the model are noted. They influence the repomap ranking algorithm, so subsequent requests have even more relevant repository context.
    This is designed as a sort of implicit search and ranking flow. The blog article doesn’t get into any of this detail, but much of this has been around and working well since 2023.
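A toy illustration of the implicit flow described above, where identifiers or files the model has mentioned get extra weight in later rankings. This is not aider's actual ranking algorithm; `rank_files`, `MENTION_BOOST`, and the 10x weight are assumptions chosen only to show the shape of the idea.

```python
from collections import Counter

MENTION_BOOST = 10.0  # hypothetical weight for anything the model has mentioned

def rank_files(
    references: dict[str, list[str]],  # file -> identifiers it uses
    definitions: dict[str, str],       # identifier -> file that defines it
    mentioned: set[str],               # identifiers or files the model mentioned
) -> list[str]:
    """Rank defining files by how often other files reference them."""
    scores: Counter = Counter()
    for src_file, idents in references.items():
        for ident in idents:
            target = definitions.get(ident)
            if target is None or target == src_file:
                continue  # unknown symbol, or a file referencing itself
            boosted = ident in mentioned or target in mentioned
            scores[target] += MENTION_BOOST if boosted else 1.0
    # Highest score first; ties broken alphabetically for stable output.
    return [f for f, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

# Hypothetical repository data for demonstration.
REFERENCES = {"app.py": ["parse", "render", "render"], "cli.py": ["render"]}
DEFINITIONS = {"parse": "parser.py", "render": "view.py"}

if __name__ == "__main__":
    print(rank_files(REFERENCES, DEFINITIONS, mentioned=set()))
    print(rank_files(REFERENCES, DEFINITIONS, mentioned={"parse"}))
```

Once `parse` has been mentioned, `parser.py` moves ahead of `view.py` even though it is referenced less often, which is the feedback effect the comment describes.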
    - jared_stewart 5 hours ago
      I see, so the context adapts as the LLM interacts with the codebase across requests?
      That's a clever implicit flow for ranking.
      The difference in my approach is that exploration is happening within a single task, autonomously. The agent traces through structure, symbols, implementations, callers in many sequential lookups without human interaction. New files are automatically picked up with filesystem watching, but the core value is that the LLM can navigate the code base the same way that I might.
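The "many sequential lookups" pattern can be sketched as a loop over an index, where each query decides what to ask next. This is a minimal stand-in, not CodeRLM's API: it assumes an in-memory index built with Python's `ast` module, and `build_call_index`, `trace_callers`, and `max_lookups` are hypothetical names.

```python
import ast
from collections import deque

def build_call_index(source: str) -> dict[str, set[str]]:
    """Map each called name to the functions that call it.

    Simplification: only plain-name calls like `foo()` are recorded;
    method calls such as `obj.foo()` are ignored.
    """
    callers: dict[str, set[str]] = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callers.setdefault(node.func.id, set()).add(func.name)
    return callers

def trace_callers(index: dict[str, set[str]], symbol: str, max_lookups: int = 10) -> list[str]:
    """Follow callers of `symbol` one lookup at a time, stopping at a budget."""
    seen, found, queue = {symbol}, [], deque([symbol])
    lookups = 0
    while queue and lookups < max_lookups:
        current = queue.popleft()
        lookups += 1  # one query against the index
        for caller in sorted(index.get(current, ())):
            if caller not in seen:  # each result informs the next query
                seen.add(caller)
                found.append(caller)
                queue.append(caller)
    return found

# Hypothetical source used only for demonstration.
SAMPLE = '''
def read(path): return open(path).read()
def parse(path): return read(path).split()
def main(): return parse("x")
def cli(): return main()
'''

if __name__ == "__main__":
    print(trace_callers(build_call_index(SAMPLE), "read"))
```

In a real agent loop the model, not a fixed breadth-first rule, would choose which result to follow and when to stop; the budget parameter stands in for that decision.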
- mkw5053 19 hours ago
  I just looked and it was posted a number of times with 0 discussion
  https://news.ycombinator.com/item?id=38062493
  https://news.ycombinator.com/item?id=41411187
  https://news.ycombinator.com/item?id=40231527
  https://news.ycombinator.com/item?id=39993459
  https://news.ycombinator.com/item?id=41393767
  https://news.ycombinator.com/item?id=39391946
- mohsen1 10 hours ago
  I am planning to add similar concepts to Yek. Either tree-sitter or ast-grep. Your work here and Aider's work would be my guiding prior art. Thank you for sharing!
  https://github.com/mohsen1/yek
whytai 11 hours ago
i thought the reason claude code defaults to terminal-ish workflows (glob/grep) is bc they trained with bash-y sandboxes, and the creator argued for this approach vs indexing + bespoke tools. would be interesting to see how often the model defaults to using grep for everything (in my experience almost always..)
ozozozd 19 hours ago
Great idea! I’ve been thinking about something along these lines as well.
I recommend configuring it as a tool for Opencode.
Going from Claude Code to Opencode was like going from Windows to Mac.
- MrGreenTea 9 hours ago
  And going to pi agent is like ascending to Linux ;)
- kiyoakii 13 hours ago
  yeah I would definitely recommend the same. I'm a daily user of opencode and I really want to try this.
- jared_stewart 18 hours ago
  will take a look at opencode, thanks for sharing!
d5ve 17 hours ago
I wonder how this sort of thing compares with asking claude to read a ctags file. I have git hooks set up to keep my tags up to date automatically, so that data is already lying around.
- HarHarVeryFunny 8 hours ago
  ctags just gives you locations of symbol definitions.
  Tree-sitter will also give you locations of symbol usages, which is obviously very useful to an AI agent. You can basically think of Tree-sitter as having full syntactic knowledge of the code it is looking at - like a compiler's AST.
  There are also more powerful cousins of ctags: cscope (C/C++) and Pycscope (Python), which additionally give usage locations and more, as well as gtags, which does something similar but supports more languages.
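To make the "definitions only" point concrete, here is a small sketch that reads a tags file into a lookup table. It assumes the common tab-separated ctags layout (name, file, ex address terminated by `;"`, then extension fields); the sample entries and the name `load_tags` are made up.

```python
def load_tags(text: str) -> dict[str, list[tuple[str, str]]]:
    """Parse ctags output into {symbol: [(file, address), ...]}."""
    table: dict[str, list[tuple[str, str]]] = {}
    for line in text.splitlines():
        if not line or line.startswith("!_TAG_"):
            continue  # skip blank lines and metadata pseudo-tags
        parts = line.split("\t", 2)
        if len(parts) < 3:
            continue  # not a well-formed tag line
        name, path, rest = parts
        # The address ends at ;" and is followed by optional extension fields.
        address = rest.partition(';"')[0]
        table.setdefault(name, []).append((path, address))
    return table

# Hypothetical tags file content for demonstration.
SAMPLE_TAGS = (
    "!_TAG_FILE_FORMAT\t2\t/extended format/\n"
    "main\tsrc/app.py\t/^def main():$/;\"\tf\n"
    "parse\tsrc/parser.py\t/^def parse(text):$/;\"\tf\n"
)

if __name__ == "__main__":
    tags = load_tags(SAMPLE_TAGS)
    print(tags["parse"])  # where `parse` is defined
```

Each entry says where a symbol is defined and nothing about who calls it, so an agent working from this table alone would still need grep or another index to find usages.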
handfuloflight 14 hours ago
Excellent share, thank you. My question is, with your setup, how strictly does Claude Code adhere to using this mode to traverse the codebase over grep? I have found this to be a huge issue when implementing similar solutions... it loves to just grep.
esafak 20 hours ago
Can you make the plugin start automatically, on some suitable trigger? Any plans to support JVM languages?
edit: Does Claude not invoke it automatically, then, so you have to call the skill?
- jared_stewart 19 hours ago
  I've been tinkering with it substantially and the most I can say is that it generally doesn't trigger automatically :( Claude has a really, really strong affinity for its existing tools for exploring a code base.
  I'd be happy to add support for Scala and Java - the current binary size is 11MB on my machine, so I think there's an opportunity to expand what this offers. At this time I don't know where I would draw the line on what I'm not planning to support. I think to some degree it would depend on usage / availability on my part.
skybrian 20 hours ago
Would this be useful to people who aren't using Claude? Maybe it should be installable in a more normal way, instead of as a Claude plugin.
- jared_stewart 20 hours ago
  I don't see why it wouldn't - but I'm not familiar with setup / integration on other platforms. Would love to hear more about your stack and see if we can't find a way for you to try it out
  - esafak 18 hours ago
    A CLI or slim MCP would do it. If you want a formal plugin, here's another popular ecosystem: https://opencode.ai/docs/plugins/
    - jared_stewart 5 hours ago
      The server exposes a simple API, so wrapping it in MCP should be straightforward. The agent / skill interacts with the server using the CLI implementation (part of the skill definition) at https://github.com/JaredStewart/coderlm/blob/main/plugin/ski...
      - esafak 3 hours ago
        Since it appears to be a REST API you could just share the OpenAPI spec and let the agent cURL it.
    - MrGreenTea 8 hours ago
      In my experience, a CLI plus a concise skill.md works best.
esafak 21 hours ago
I see a lot of overlap with LSPs, which better agents already use, so I would appreciate a comparison. What does this add?
- HarHarVeryFunny 9 hours ago
  LSP is designed to help editors, not AI agents, and provides query by cursor position. Tree-sitter supports query by symbol (find definition, usages, etc.), and so is much better suited to what an AI agent may want to do.
  - esafak 8 hours ago
    My agent already uses LSP plugins, and the protocol supports querying by symbol:
    https://microsoft.github.io/language-server-protocol/specifi...
    https://microsoft.github.io/language-server-protocol/specifi...
    - HarHarVeryFunny 8 hours ago
      Thanks - I wasn't aware of that, although it still doesn't seem to be what would be most useful to an AI agent.
      For example, if the agent wants to modify a function, it may want to know all the places the function is called, which AFAIK Tree-sitter can provide directly, but it seems with LSP you'd have to use that DocumentSymbol API to process every source file to find the usages, since you're really searching by source file, not by symbol.
      - esafak 7 hours ago
        For that I believe you could use textDocument/references:
        https://microsoft.github.io/language-server-protocol/specifi...
        - HarHarVeryFunny 7 hours ago
          Yes, but that one is position-based rather than name-based - I believe it's basically for an editor to ask about whatever is under the cursor.
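The name-based versus position-based distinction in this sub-thread can be seen in the request payloads themselves. The sketch below frames two JSON-RPC requests using the method names and parameter shapes from the LSP specification; the symbol name `parse_config`, the file URI, and the position are invented for illustration, and no server is contacted.

```python
import json

def frame(method: str, params: dict, request_id: int) -> bytes:
    """Encode a JSON-RPC 2.0 request with the LSP Content-Length header."""
    body = json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    ).encode("utf-8")
    return f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body

# Name-based: search the whole workspace for symbols matching a query string.
by_name = frame("workspace/symbol", {"query": "parse_config"}, 1)

# Position-based: references to whatever sits at this line/character.
by_position = frame(
    "textDocument/references",
    {
        "textDocument": {"uri": "file:///repo/src/config.py"},  # hypothetical file
        "position": {"line": 41, "character": 4},               # zero-based
        "context": {"includeDeclaration": False},
    },
    2,
)

if __name__ == "__main__":
    print(by_name.decode("utf-8"))
    print(by_position.decode("utf-8"))
```

One plausible way to bridge the two, under these assumptions, is for a client to take a location returned by `workspace/symbol` and feed it into `textDocument/references`, so a lookup that starts from a name ends with usage sites.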
- jared_stewart 20 hours ago
  Tree-sitter and LSP solve different problems.
  LSP is a full-fledged semantic solution providing go-to-definition, reference tracing, type info, etc., but it requires a full language server, project configuration, and often a working build. That's great in an IDE, but the burden could be a bit much when working through an agent.
  Tree-sitter handles structural queries, giving the LLM the ability to evaluate function signatures, hierarchies, and such. Packing this into the recursive language model lets the LLM decide when it has enough information; it can continue to crawl the code base in bite-sized increments to find what it needs. It's a far more minimal solution, which lets it respond quickly with minimal overhead.
aghilmort 21 hours ago
been wondering about treesitter grepping for agents
how do plans compare with and without it? even just anecdotally, what have you seen so far?
- jared_stewart 20 hours ago
  anecdotally, it seems like this helps find better places for code to sit, understands the nuances of a code base better, and does a better job avoiding duplicate functionality.
  it's still very much a work in progress; the thing I'm struggling with most right now is getting claude to even use the capability without directly telling it to.
  there seem to be benefits to the native stack (which lists files and then hopes for the best) relative to this sometimes. Frankly, it seems to be better at understanding the file structure. Where this approach really shines is in understanding the code base.
  - aghilmort 2 hours ago
    one approach that can work is to tell the model to load/read the skill and/or call a shell script that overrides the default. there are a variety of ways to attempt this with any harness; claude specifically has hooks, some of which allow go, no-go, do-this-instead, etc. and ya, agree on grokking the code base, ast integration feels like a natural next step