Yeah, everyone needs their own evals, I guess; mine are probably not the same as other people's. Personally, I'm not that fussed about these 'reasoning models'. I'm more excited about better coding and agent/tool-using models.