Show HN: First Claude Code client for Ollama local models

(github.com)

38 points | by SerafimKorablev 10 hours ago

7 comments

oceanplexian 6 hours ago
The Anthropic API was already supported by llama.cpp (The project Ollama ripped off and typically lags in features by 3-6 months), which already works perfectly fine with Claude Code by setting a simple environment variable.
[-]
- xd1936 6 hours ago
  And they reference that announcement and related information in the second line.
  [-]
  - gcr 5 hours ago
    Which announcement are you looking at? I see no references to llama-cpp in either Ollama's blog post or this project's github page.
dsrtslnd23 4 hours ago
What hardware are you running the 30b model on? I guess it needs at least 24GB VRAM for decent inference speeds.
[-]
- thtmnisamnstr 3 hours ago
  The general rule to follow is that you need as much VRAM as the model size. 30b models are usually around 19GB. So, most likely a GPU with 24GB of VRAM.
- ryandrake 4 hours ago
  I'd like to know this, too. I'm just getting started getting my feet wet with ollama and local models using just CPU, and it's obviously terribly slow (even 24 cores, 128GB DRAM. It's hard to gauge how much GPU money I'd need to plonk down to get acceptable performance for coding workflows.
horacemorace 5 hours ago
I was trying to get Claude code to work with llama.cpp but could never figure out anything functional. It always insisted on a phone home login for first time setup. In cline I’m getting better results with glm-4.7-flash than with qwen3-coder:30b
[-]
- g4cg54g54 4 hours ago
  ~/.claude.json with {"hasCompletedOnboarding":true} is the key, then ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN work as expected
eli 6 hours ago
There are already various proxies to translate between OpenAI-style models (local or otherwise) and an Anthropic endpoint that Claude Code can talk to. Is the advantage here just one less piece of infrastructure to worry about?
[-]
- g4cg54g54 6 hours ago
  siderailing here - but got one that _actually_ works?
  in particular i´d like to call claude-models - in openai-schema hosted by a reseller - with some proxy that offers anthropic format to my claude --- but it seems like nothing gets to fully line things up (double-translated tool names for example)
  reseller is abacus.ai - tried BerriAI/litellm, musistudio/claude-code-router, ziozzang/claude2openai-proxy, 1rgs/claude-code-proxy, fuergaosi233/claude-code-proxy,
  [-]
  - kristopolous 4 hours ago
    What probably needs to exist is something like `llsed`.
    The invocation would be like this
```
    llsed --host 0.0.0.0 --port 8080 --map_file claude_to_openai.json --server https://openrouter.ai/api
```
    Where the json has something like
```
    { tag: ... from: ..., to: ..., params: ..., pre: ..., post: ...}
```
    So if one call is two, you can call multiple in the pre or post or rearrange things accordingly.
    This sounds like the proper separation of concerns here... probably
    The pre/post should probably be json-rpc that get lazy loaded.
    Writing that now. Let's do this: https://github.com/day50-dev/llsed
dosinga 6 hours ago
this is cool. not sure it is the first claude code style coding agent that runs against Ollama models though. goose, opencode and others have been able to do that a while no?
d0100 6 hours ago
Does this UI work with Open Code?
[-]
- smissingham 4 hours ago
  [dead]
mchiang 7 hours ago
hey, thanks for sharing. I had to go to the Twitter feed to find the GitHub link:
https://github.com/21st-dev/1code
[-]
- dang 6 hours ago
  Thanks for catching that. I've changed the URL at the top to that from https://twitter.com/serafimcloud/status/2014266928853110862 now.