We’ll never dig ourselves entirely out of this hole until theory starts to catch up with practice.
Paper after paper overreaches and attempts impossible general claims.
Well, we've taken a look and found serious issues in this paper, and shown, once again, that structured generation *improves* evaluation performance!
@willkurt.bsky.social provides a rebuttal to a reasonably well-known paper which concluded that structured generation with LLMs always resulted in worse performance.
We do not find the same thing.
blog.dottxt.co/say-what-you...