How does GPT-OSS compare to other open-source models?

Question

How does it compare to other open-source LLMs such as DeepSeek, Qwen, and Gemma, especially in terms of reasoning & coding ability? If you've tested it, did anything surprise you (good or bad)? Is it worth switching from an existing OSS model? Looking for real-world impressions, not just benchmarks.

roscas · Accepted Answer

I've only compared it with qwen3-coder, and it did very badly.
The first comparison was a 500-line Python program: 5 minutes later, gpt-oss:20b had still produced nothing, so I canceled.
I gave the same program to qwen3-coder and in about 20 to 30 seconds it produced a summary of what the program does. Just great!
Other examples went badly too. I haven't removed it yet so I can run a few more tests, but I will remove it soon.
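
If anyone wants to repeat this kind of side-by-side test, here is a rough sketch of a timing harness. It is not what I ran, just one way to do it. The names `time_model` and `compare` are made up for illustration, and each `generate` callable is a placeholder you would have to write yourself to send the prompt to whatever runner you use. The 300-second default timeout only mirrors the 5 minutes I waited.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def time_model(generate, prompt, timeout_s=300.0):
    """Call generate(prompt) and return (reply_or_None, seconds_waited).

    `generate` is a hypothetical callable that sends the prompt to a model
    and returns its reply as a string. If no reply arrives within
    `timeout_s` seconds, the reply is reported as None.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    start = time.perf_counter()
    future = pool.submit(generate, prompt)
    try:
        reply = future.result(timeout=timeout_s)
    except FutureTimeout:
        # We stop waiting here; the background call itself is not killed.
        reply = None
    elapsed = time.perf_counter() - start
    pool.shutdown(wait=False)
    return reply, elapsed


def compare(models, prompt, timeout_s=300.0):
    """Send the same prompt to every model and record whether and how fast it answered.

    `models` maps a label (for example "qwen3-coder") to a generate callable.
    """
    results = {}
    for name, generate in models.items():
        reply, elapsed = time_model(generate, prompt, timeout_s)
        results[name] = {
            "answered": reply is not None,
            "seconds": round(elapsed, 2),
        }
    return results
```

You would call it with something like `compare({"gpt-oss:20b": ask_gpt_oss, "qwen3-coder": ask_qwen}, prompt)`, where `ask_gpt_oss` and `ask_qwen` are your own functions. Using the exact same prompt and timeout for both models keeps the comparison fair.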