ian
mob.bs
🌴
Isaac LOL
March 20, 2025 at 5:25 PM
correction, that would be `↑⏎`
March 11, 2025 at 7:51 PM
‘creating a benchmark’ is the first step for this kind of research! companies that are doing this are considerably ahead of ones that aren’t. now they can use these task lists to synthesize a dataset for training - stopping the research after the first evaluation doesn’t seem useful :)
March 8, 2025 at 9:01 PM