Tips for reducing LLM token usage?

Question

I've been using Claude Code with Serena MCP, but for the past few weeks it's been compressing the context more often. I have two Pro accounts, and it's still not enough for my daily needs anymore :(

Also, Claude Code tends to make very broad search requests, and I keep getting an error from MCP about exceeding 25,000 characters. It happens quite often.

What would you recommend?

bigyabai · Accepted Answer

> What would you recommend?

Invest in a local inference server and run Qwen3. At this point it will still cost less than two Pro accounts.