How would you know if an AI model has been nerfed?

Question

If consumers are just calling a model in the cloud, what's actually enforcing that the model is not just a dumbed-down, cheaper model? What's stopping Anthropic/OpenAI from just running a cheaper LLM based on how difficult the question is?

verdverm · Accepted Answer

To detect, you need evals.
To have guarantees, contracts.
When I use the API, I specify the model. They may use a quant'd version, but they aren't going to change the underlying model (caveat that I'm calling the quant and not the same model in this context).
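As a rough illustration of the "evals" half: a minimal sketch, assuming you keep a fixed prompt set with known answers and rerun it on a schedule. Everything named here is hypothetical (`EVAL_SET`, `run_eval`, `looks_degraded`, and the two stub models), and `call_model` stands in for whatever wrapper you have around your provider's API. The check is a simple one-sided z-test against a previously recorded baseline accuracy, which is only a rough approximation on small eval sets.

```python
import math
from typing import Callable, Sequence, Tuple

# Hypothetical eval set: prompts with one exact expected answer each.
EVAL_SET: Sequence[Tuple[str, str]] = [
    ("What is 17 * 23? Reply with the number only.", "391"),
    ("What is the capital of France? One word.", "Paris"),
    ("Reverse the string 'abc'. Reply with the result only.", "cba"),
    ("What is 2 to the power of 10? Number only.", "1024"),
]


def run_eval(call_model: Callable[[str], str],
             eval_set: Sequence[Tuple[str, str]]) -> float:
    """Return exact-match accuracy of call_model over eval_set."""
    correct = sum(
        1 for prompt, expected in eval_set
        if call_model(prompt).strip() == expected
    )
    return correct / len(eval_set)


def looks_degraded(baseline_acc: float, current_acc: float, n: int,
                   z_threshold: float = 2.0) -> bool:
    """One-sided z-test: is current accuracy significantly below baseline?

    Treats baseline_acc as the known success rate and n as the number of
    eval items in the current run (normal approximation, so rough for small n).
    """
    std_err = math.sqrt(baseline_acc * (1.0 - baseline_acc) / n)
    if std_err == 0.0:
        # Baseline was 0% or 100%: any drop below it counts as a regression.
        return current_acc < baseline_acc
    return (baseline_acc - current_acc) / std_err > z_threshold


# Stub models for demonstration only; replace with a real API wrapper.
_ANSWERS = dict(EVAL_SET)


def good_model(prompt: str) -> str:
    """Stub that answers every eval prompt correctly."""
    return _ANSWERS[prompt]


def cheap_model(prompt: str) -> str:
    """Stub that gives the same answer regardless of the prompt."""
    return "Paris"


if __name__ == "__main__":
    baseline = run_eval(good_model, EVAL_SET)
    current = run_eval(cheap_model, EVAL_SET)
    print("degraded:", looks_degraded(baseline, current, len(EVAL_SET)))
```

In practice you would want far more than four prompts, several samples per prompt since outputs are not deterministic, and a baseline recorded when you first adopted the model. This only tells you that something changed, not whether it was a quant, a swap, or a prompt-side change.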
