HACKER Q&A
📣 subw00f

What would be the impact of a LLM output injection attack?


I'm talking inference layer compromise, someone being able to inject commands that would eventually be executed by agents/tools on the other side.

There is a massive amount of unskilled users letting LLMs decide which commands to run on their computers. I know for things like Cowork you have a sandbox, but many simply use Codex or Claude Code (and some even went above and beyond and learned to use --dangerously-skip-permissions). But what happens if an attacker is successful? What's even preventing it from happening?


  👤 mavdol04 Accepted Answer ✓
The worst that could happen is having your credentials stolen. It’s an LLM architectural flaw, so it has to be at the tools level so the only way to prevent it is still sandboxing in my opinion. Or at least sandboxing the tools themselves