www.mattcollins.net
Perhaps it'll take someone exploiting this at scale (hopefully in a fairly harmless way) to jolt us to our senses a bit.
Perhaps it'll take someone exploiting this at scale (hopefully in a fairly harmless way) to jolt us to our senses a bit.
Like traditional tests, it takes discipline to add and maintain these evals but without them you don't know what you're breaking / making worse.
Like traditional tests, it takes discipline to add and maintain these evals but without them you don't know what you're breaking / making worse.