HACKER Q&A
📣 baijum

Do we need a language designed specifically for AI code generation?


Let's run a thought experiment. If we were to design a new programming language today with the primary goal of it being written by an AI (like Copilot) and reviewed by a human, what would its core features be?

My initial thoughts are that we would trade many of the conveniences we currently value for absolute, unambiguous clarity. For example:

- Would we get rid of most syntactic sugar? If there's only one, explicit way to write a `for` loop, the AI's output becomes more predictable and easier to review.

- Would we enforce extreme explicitness? Imagine a language where you must write `fn foo(none)` if there are no parameters, just to remove the ambiguity of `()`.

- How would we handle safety? Would features like mandatory visibility (`pub`/`priv`) and explicit ownership annotations for FFI calls become central to the language itself, providing guarantees the reviewer can see instantly?

- Would such a language even be usable by humans for day-to-day work, or would it purely be a compilation target for AI prompts?

What trade-offs would you be willing to make for a language that gave you higher confidence in the code an AI generates?


  👤 dtagames Accepted Answer ✓
LLMs don't work the way you think. In order to be useful, a model would have to be trained on large quantities of code written in your new language, which don't exist.

Even after that, it will exhibit all the same problems as existing models do with other languages. The unreliability of LLMs comes from the fact that they predict answers rather than "retrieve" real ones, as a database would. Changing the content and context (your new language) won't change that.
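
To make the predict-versus-retrieve distinction concrete, here is a minimal Python sketch. The `FACTS` table and the `NEXT_TOKEN_PROBS` distribution are made up for illustration and are not taken from any real model; an actual LLM samples from a far larger learned distribution, but the failure mode is the same in kind.

```python
import random

# Retrieval: a lookup either returns the stored answer or fails loudly.
FACTS = {"capital_of_france": "Paris"}  # hypothetical stored fact


def retrieve(key: str) -> str:
    return FACTS[key]  # raises KeyError when the answer is not stored


# Prediction: sample the next token from a probability distribution.
# These probabilities are invented purely for illustration.
NEXT_TOKEN_PROBS = {"Paris": 0.90, "Lyon": 0.07, "Berlin": 0.03}


def predict(rng: random.Random) -> str:
    tokens = list(NEXT_TOKEN_PROBS)
    weights = list(NEXT_TOKEN_PROBS.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [predict(rng) for _ in range(1000)]
wrong = sum(1 for s in samples if s != retrieve("capital_of_france"))
print(f"retrieved: {retrieve('capital_of_france')}, wrong predictions: {wrong}/1000")
```

The lookup is right every time or raises an error, while the sampler is usually right and occasionally, silently, wrong. Nothing about the syntax of the target language changes that.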


👤 theGeatZhopa
What's needed is a formalization, and a model that has been trained on that formalization. I'm not sure a system prompt alone is powerful enough to check and enforce input as definite and exact formalized expression(s).

I don't think it will work out easily like "a programming language for LLM" - but you can always have a discussion with ol' lama


👤 muzani
Generally they work better with words that are more easily readable by humans. They have a lot of trouble with JSON and do YAML much better, for example. Running through more tokens doesn't just increase cost, it lowers quality.

So a language designed for LLMs would likely go the other way. It's like how spoken languages have more redundancies built in.
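
As a rough illustration of the token-count point, here is a small Python sketch that renders the same data as JSON and as YAML-style text, then counts tokens with a crude proxy. Everything in it is an assumption for demonstration: `to_yaml_like` is a hand-rolled emitter (not a conforming YAML serializer), `rough_token_count` is not a real model tokenizer, and `config` is invented sample data. It only shows the size difference; it says nothing by itself about output quality.

```python
import json
import re


def to_yaml_like(mapping: dict, indent: int = 0) -> list[str]:
    """Render a dict of scalars, lists of scalars, and nested dicts as YAML-style lines."""
    pad = "  " * indent
    lines = []
    for key, item in mapping.items():
        if isinstance(item, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_like(item, indent + 1))
        elif isinstance(item, list):
            lines.append(f"{pad}{key}:")
            lines.extend(f"{pad}  - {entry}" for entry in item)
        else:
            lines.append(f"{pad}{key}: {item}")
    return lines


def rough_token_count(text: str) -> int:
    # Crude proxy: every run of word characters and every punctuation mark is one token.
    return len(re.findall(r"\w+|[^\w\s]", text))


# Hypothetical sample data.
config = {
    "name": "demo",
    "version": 2,
    "tags": ["ai", "codegen"],
    "limits": {"max_tokens": 512, "retries": 3},
}

as_json = json.dumps(config, indent=2)
as_yaml = "\n".join(to_yaml_like(config))

print("JSON tokens (rough):", rough_token_count(as_json))
print("YAML tokens (rough):", rough_token_count(as_yaml))
```

Under this proxy the JSON rendering costs more tokens, because every quote, brace, bracket, and comma is counted separately. A real tokenizer will give different numbers.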