If you’ve tested it, did anything surprise you (good or bad)? Is it worth switching from an existing OSS model?
Looking for real-world impressions, not just benchmarks.
My first comparison used a 500-line Python program. Five minutes later, gpt-oss:20b was still silent, so I canceled.
I gave the same program to qwen3-coder, and in about 20 to 30 seconds it produced a summary of what the program does. Just great!
But other examples were really bad. I have not removed it yet, so I can run a few more tests, but I will remove it soon.