How to connect Claude Code CLI to a local llama.cpp server
I’ve seen a lot of people struggling to get Claude Code working with a local llama.cpp setup, so here’s a quick guide that worked for me.
1. CLI (Terminal)
Add this to your .bashrc (or .zshrc):
    export ANTHROPIC_AUTH_TOKEN="not_set"
    export ANTHROPIC_API_KEY="not_set_either!"
    export ANTHROPIC_BASE_URL="http://<your-llama.cpp-server>:8080"
Reload your shell:
    source ~/.bashrc
Then run the CLI with the model argument:
    claude --model Qwen3.5-35B-Thinking
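If the CLI fails to connect, it can help to rule out the basics first. Here is a minimal sketch of a pre-flight check, assuming the three variables from step 1 and assuming your server answers on llama.cpp's `/health` endpoint. The function names `check_claude_env` and `check_llama_server` are made up for this example and are not part of Claude Code or llama.cpp.

```shell
# Hypothetical helper: succeed only if the variables from step 1 are all set.
# The body runs in a subshell ( ... ) so its temporary variables do not leak
# into your interactive shell.
check_claude_env() (
  missing=0
  for name in ANTHROPIC_BASE_URL ANTHROPIC_AUTH_TOKEN ANTHROPIC_API_KEY; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  exit "$missing"
)

# Hypothetical helper: ask the server whether it is up.
# Assumes ANTHROPIC_BASE_URL points at a llama.cpp server that serves GET /health.
check_llama_server() {
  curl -fsS --max-time 5 "${ANTHROPIC_BASE_URL}/health" >/dev/null
}
```

Usage would be something like `check_claude_env && check_llama_server && claude --model Qwen3.5-35B-Thinking`, so the CLI only starts when the variables are set and the server responds.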
2. VS Code setup with the Claude Code extension installed
Edit:
$HOME/.config/Code/User/settings.json
Add:
    "claudeCode.environmentVariables": [
      { "name": "ANTHROPIC_BASE_URL", "value": "http://<your-llama.cpp-server>:8080" },
      { "name": "ANTHROPIC_AUTH_TOKEN", "value": "dummy" },
      { "name": "ANTHROPIC_API_KEY", "value": "sk-no-key-required" },
      { "name": "ANTHROPIC_MODEL", "value": "gpt-oss-20b" },
      { "name": "ANTHROPIC_DEFAULT_SONNET_MODEL", "value": "Qwen3.5-35B-Thinking-Coding" },
      { "name": "ANTHROPIC_DEFAULT_OPUS_MODEL", "value": "Qwen3.5-27B-Thinking-Coding" },
      { "name": "ANTHROPIC_DEFAULT_HAIKU_MODEL", "value": "gpt-oss-20b" },
      { "name": "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC", "value": "1" }
    ],
    "claudeCode.disableLoginPrompt": true
Notes
- This setup lets you use llama.cpp’s server (or llama-swap) to dynamically switch models by selecting one of the preconfigured ones in VS Code.
- Make sure the model names you define here exactly match what you configured in your llama-server.ini.
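One way to catch a model name mismatch is to compare against what the server itself reports. The sketch below assumes your server exposes the OpenAI-style `/v1/models` endpoint and that each entry carries an `"id"` field. `has_model` is a hypothetical helper name, and the whitespace stripping is a shortcut that assumes model names contain no spaces.

```shell
# Hypothetical helper: read a /v1/models JSON response on stdin and succeed
# if a model with exactly the given id is listed.
has_model() {
  # Remove all whitespace so "id": "x" and "id":"x" look the same,
  # then do a fixed-string match (-F), so dots in names are not wildcards.
  tr -d '[:space:]' | grep -Fq "\"id\":\"$1\""
}
```

For example, `curl -fsS "$ANTHROPIC_BASE_URL/v1/models" | has_model "gpt-oss-20b" || echo "name not served"` would flag a name the server does not know about.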
Edit: The CLI actually did not perform that well in my local tests, and to be honest I personally prefer other CLIs. But after u/Robos_Basilisk asked how this plays with context length, I think that might have been the reason.
So you most probably want to use a model with a smaller context length, like the HAIKU model, or additionally set the env vars "CLAUDE_CODE_DISABLE_1M_CONTEXT" and "CLAUDE_CODE_MAX_OUTPUT_TOKENS".
For the list of supported env vars consult: https://code.claude.com/docs/en/env-vars
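For the CLI setup, the two context-related variables could be added next to the other exports in your .bashrc (or .zshrc). This is only a sketch: the values below are assumptions I picked for illustration, not recommendations, so check the env var docs for the accepted values and tune the token limit to what your model and hardware can handle.

```shell
# Illustrative values only (assumptions, not tested recommendations).
export CLAUDE_CODE_DISABLE_1M_CONTEXT=1    # assumed: "1" disables the 1M-context variant
export CLAUDE_CODE_MAX_OUTPUT_TOKENS=8192  # assumed cap; pick what your model handles
```

In VS Code, the same two names would go into the "claudeCode.environmentVariables" list as additional name/value entries.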
Edit: u/truthputer pointed out that you most probably also want to set the undocumented env var "CLAUDE_CODE_ATTRIBUTION_HEADER" to "0".