pjain9.bsky.social
Work led by our super predocs: @pranavn1008.bsky.social, @puranjay1412.bsky.social

Joint work with MatCollaborator @adityakusupati.bsky.social and @jeffdean.bsky.social! @jeffdean.bsky.social's casual remarks are also loaded with amazing deep ideas :)
pranavn1008.bsky.social
Predoctoral Researcher @Google DeepMind https://pranavajitnair.github.io/
February 12, 2025 at 11:36 AM
Along with the flexibility, the surprising part is that the nesting structure ends up regularizing the 2-bit model, leading to a much more accurate 2-bit model compared to standard QAT approaches.