How to avoid LLMs struggling with Lisp parens?

Question

LLMs seem to love certain languages (Python, Bash, etc.), but they all seem to struggle with Lisp (e.g. Racket or Emacs Lisp). I've tried various iterations of Claude, as well as cheaper models like DeepSeekV4, etc. and the pattern is the same: they'll make a few successful edits, but eventually they'll get some parentheses slightly wrong, then spiral into madness as they attempt to fix the syntax errors by counting and matching-up parentheses "manually" in a never-ending loop.

This is frustrating for two reasons:

Firstly, LLMs are famously bad at counting characters (e.g. the number of "r"s in "strawberry"), so it's no wonder this approach of generating and counting characters doesn't work very well.

Secondly, balancing parentheses is trivial for traditional, non-LLM algorithms; so it feels like an entirely avoidable problem (without resorting to larger, more-expensive models).

Is anyone using LLMs successfully on Lispy projects? If so, what workflows, tooling, etc. have you found to work well? I've tried guiding them to use Emacs `check-parens` rather than counting "manually"; but maybe inferring from indentation might work better? Perhaps tree-based generation/tools would avoid introducing such problems in the first place?

backrun · Accepted Answer

I suspect the most reliable approach is to stop treating the model as the syntax checker. Let it propose a small diff, then run the result through `check-parens`, the language parser, or a formatter before accepting the change. If validation fails, feed back the exact parser error and the smallest affected form rather than the entire file.
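
To make the "validate before accepting" step concrete, here is a minimal sketch of a standalone balance check in Python. It is an illustration only: the function name `check_balance` and the error format are made up for this example, and it only understands `;` line comments and double-quoted strings. It does not handle Racket `#|...|#` block comments or character literals such as `#\(` or `?\(`, so in practice the real reader or `check-parens` is the better gate.

```python
from typing import Optional

# Hypothetical helper for illustration; not part of any Lisp tooling.
PAIRS = {")": "(", "]": "["}


def check_balance(source: str) -> Optional[str]:
    """Return None if delimiters balance, else a 'line:col: message' string."""
    stack = []  # (opener, line, col) for every delimiter still open
    line, col = 1, 0
    in_string = in_comment = escaped = False

    for ch in source:
        if ch == "\n":
            line, col = line + 1, 0
            in_comment = escaped = False  # line comments end at the newline
            continue
        col += 1
        if in_comment:
            continue
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == ";":
            in_comment = True
        elif ch == '"':
            in_string = True
        elif ch in "([":
            stack.append((ch, line, col))
        elif ch in ")]":
            if not stack:
                return f"{line}:{col}: unexpected '{ch}'"
            opener, oline, ocol = stack.pop()
            if PAIRS[ch] != opener:
                return (f"{line}:{col}: '{ch}' closes '{opener}' "
                        f"opened at {oline}:{ocol}")

    if in_string:
        return "unterminated string"
    if stack:
        opener, oline, ocol = stack[-1]  # innermost form left open
        return f"{oline}:{ocol}: unclosed '{opener}'"
    return None


if __name__ == "__main__":
    print(check_balance("(defun f (x) (+ x 1))"))  # None
    print(check_balance("(defun f (x) (+ x 1)"))   # 1:1: unclosed '('
```

The point of returning a single `line:col` message is that it can be pasted back to the model verbatim, alongside just the form starting at that position, instead of asking it to recount parentheses across the whole file.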