Is that commonly done. Presumably this is from people customizing their installs - is that correct?
claude-local () {
MODEL=$(curl --silent localhost:1234/api/v1/models | jq 'first(.models[].loaded_instances[].id)')
ANTHROPIC_BASE_URL=http://localhost:1234 ANTHROPIC_AUTH_TOKEN='' claude --model $MODEL
}
Fun experiment: run `claude` and `claude-local` side by side and paste the same prompt into both. In my experience, recent open weight models (Qwen, Gemini) are pretty solid on quality, even on moderately difficulty prompts. They get the "right" answer eventually but roughly 10x slower on my M3 mac.
A few environment variables and Claude Code uses DeepSeek. I was actually surprised how easy that is.