1️⃣ L1/L2/no regularization destroys perfectly generalizing solutions across 6 formal-language tasks
2️⃣ MDL keeps or improves on these same perfect solutions
(6/7)
Tiny weight, full memorization.
(3/7)