HACKER Q&A
📣 vmt-man

Tips for reducing LLM token usage?


I've been using Claude Code with Serena MCP, but for the past few weeks it's been compressing the context more often. I have two Pro accounts, and even that is no longer enough for my daily needs :(

Also, Claude Code tends to make very broad search requests, and I keep getting an error from MCP about exceeding 25,000 characters. It happens quite often.

What would you recommend?


  👤 bigyabai Accepted Answer ✓
> What would you recommend?

Invest in a local inference server and run Qwen3. At this point it will still cost less than two Pro accounts.