But validating whether those tests actually cover meaningful edge cases seems harder.
Curious how teams here handle this in real workflows.
I've found it works better when the AI is only explaining results produced by deterministic metrics, rather than inventing the analysis itself.
Curious how other teams are dealing with that.
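To make that concrete, here's a minimal sketch of the "deterministic metrics first" split: the numbers come from plain code, and an LLM (not called here) would only be asked to narrate them. The result shape and helper name are my own, not a standard tool.

```python
# Deterministic side: compute the metrics ourselves, no model involved.
# results: list of (test_name, passed, covered_branches, total_branches)
def summarize_test_run(results):
    total = len(results)
    passed = sum(1 for _, ok, _, _ in results if ok)
    covered = sum(c for _, _, c, _ in results)
    branches = sum(t for _, _, _, t in results)
    return {
        "pass_rate": passed / total if total else 0.0,
        "branch_coverage": covered / branches if branches else 0.0,
        "failures": [name for name, ok, _, _ in results if not ok],
    }

metrics = summarize_test_run([
    ("test_happy_path", True, 3, 4),
    ("test_invalid_input", False, 2, 4),
])
# Only at this point would the metrics dict be handed to a model as
# context to explain; it never produces the numbers itself.
print(metrics["pass_rate"], metrics["failures"])
```

The point is that the model can be wrong about the narration without being wrong about the data, because the data never passes through it.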
Another counter-measure I use is to simply lock the code before testing, then look over the test files and make sure they're not just following the happy path.