Instead, treat it as an iterative process: work slowly towards building a portfolio of good outputs, and review them consistently.
ivanleo.com/blog/youre-p...
See my favourite stupid evals at ivanleo.com/blog/write-s...
• can't migrate underlying models safely
• can't add new features with confidence
• can't ship without HITL evals, which take >100x longer
• product development and iteration grind to a halt
• lose customer trust due to poor user experience
www.ivanleo.com/blog/youre-p...