arxiv.org/pdf/2410.03972
It started from a question I kept running into:
When do RNNs trained on the same task converge/diverge in their solutions?
🧵⬇️
arxiv.org/pdf/2410.03972
It started from a question I kept running into:
When do RNNs trained on the same task converge/diverge in their solutions?
🧵⬇️
arxiv.org/pdf/2410.03972
It started from a question I kept running into:
When do RNNs trained on the same task converge/diverge in their solutions?
🧵⬇️