With cognitive function, reward learning, and some sort of teacher distillation, this system learns quite a robust forehand within 36,000 steps👍
- 📄 Blog post: research.google/blog/a-retur...
- 🌐 GitHub repo: github.com/google-resea...
- 📑 Walkthrough post: charlieleee.github.io/publication/...
- 🤗 Hugging Face Playground: huggingface.co/spaces/Deren...
#AcademicSky