• it starts outputting small embeddings
• around epoch 300, it learns an identity function
• it takes 1700 more epochs to unwind the data manifold