Theory of Deep Learning / Learning of Deep Theory
@sunnytqin.bsky.social, w/ @emalach.bsky.social, Samy Jelassi), we investigate a core question for LLMs: "𝑡𝑜 𝑏𝑎𝑐𝑘𝑡𝑟𝑎𝑐𝑘 𝑜𝑟 𝑛𝑜𝑡 𝑡𝑜 𝑏𝑎𝑐𝑘𝑡𝑟𝑎𝑐𝑘" in two prototypical logic-heavy puzzles: CountDown and Sudoku.
Will be presenting a few papers during the week. Ping me if you want to chat!
go.bsky.app/2qnppia
We propose a methodology to approach these questions by showing that we can predict the performance across datasets and losses with simple shifted power law fits.
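For readers unfamiliar with the term, here is a minimal sketch of what fitting a shifted power law can look like. The post does not state the exact functional form, so the form y = K * (x - x0)^kappa + y0 is an assumption, as are the function names, the grid-search fitting procedure, and the synthetic data; none of it comes from the work itself.

```python
import math

def fit_log_linear(xs, ys):
    """Least-squares fit of log(y) = log(K) + kappa * log(x).

    Returns (K, kappa, sum of squared residuals in log space).
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    kappa = sxy / sxx
    log_k = my - kappa * mx
    sse = sum((log_k + kappa * a - b) ** 2 for a, b in zip(lx, ly))
    return math.exp(log_k), kappa, sse

def fit_shifted_power_law(xs, ys, x_shifts, y_shifts):
    """Fit y = K * (x - x0)**kappa + y0 (assumed form, hypothetical helper).

    Grid-searches the shifts (x0, y0); for each candidate pair the remaining
    parameters K and kappa are fitted by linear regression in log space.
    """
    best = None
    for x0 in x_shifts:
        if x0 >= min(xs):  # log(x - x0) must be defined for every point
            continue
        for y0 in y_shifts:
            if y0 >= min(ys):  # log(y - y0) must be defined for every point
                continue
            k, kappa, sse = fit_log_linear(
                [x - x0 for x in xs], [y - y0 for y in ys]
            )
            if best is None or sse < best["sse"]:
                best = {"K": k, "kappa": kappa, "x0": x0, "y0": y0, "sse": sse}
    return best

def predict(params, x):
    """Evaluate the fitted shifted power law at x."""
    return params["K"] * (x - params["x0"]) ** params["kappa"] + params["y0"]

# Synthetic, noise-free data generated from known parameters (illustrative only).
true_params = {"K": 1.5, "kappa": 0.8, "x0": 2.0, "y0": 1.0}
xs = [2.5, 3.0, 3.5, 4.0, 5.0, 6.0]
ys = [predict(true_params, x) for x in xs]

# Candidate shifts 0.0, 0.1, ..., 2.4 for both x0 and y0.
grid = [i / 10 for i in range(25)]
fit = fit_shifted_power_law(xs, ys, grid, grid)
print(fit)
```

On real, noisy measurements a continuous optimizer over all four parameters would likely be a better choice than a coarse grid; the grid is used here only to keep the sketch dependency-free.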