Ilya says current training methods are hitting a wall because we have used up all the data there was in the public domain. So there is no more data left to keep scaling and improving the models.
But what if we could unearth new sources of data?
Seems like we will slowly move away from “generic models” and get specializations in math, STEM, writing, etc.
The goal is to develop a language model, or a setup of multiple models, that can generate relevant content based on an input prompt.
How would you do it?
#ai #llm
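One way to read the "setup of multiple models" idea: a lightweight router that guesses a prompt's domain and hands it to a specialized generator. A minimal sketch below; the model names and the keyword-based routing are illustrative assumptions, not a prescribed design.

```python
# Minimal sketch: route a prompt to a domain-specialized model.
# The checkpoint names and keyword rules are placeholders, not recommendations.
from transformers import pipeline

# Hypothetical specialized checkpoints; swap in whatever models you actually have.
GENERATORS = {
    "math":    pipeline("text-generation", model="your-org/math-model"),
    "writing": pipeline("text-generation", model="your-org/writing-model"),
    "general": pipeline("text-generation", model="your-org/general-model"),
}

def route(prompt: str) -> str:
    """Very naive domain detection; a small trained classifier would do better."""
    lowered = prompt.lower()
    if any(w in lowered for w in ("prove", "equation", "integral", "solve")):
        return "math"
    if any(w in lowered for w in ("story", "poem", "essay", "rewrite")):
        return "writing"
    return "general"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    generator = GENERATORS[route(prompt)]
    out = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    return out[0]["generated_text"]

print(generate("Write a short poem about scaling laws."))
```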
Adding new domain knowledge to a small 20M-parameter language model.
A fun experiment to understand where LoRA or full finetuning can work vs. where to train models from scratch.
Blog: medium.com/@ankit_94177...
Paper: arxiv.org/abs/2409.17171
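For context, here is a minimal sketch of what the LoRA side of such an experiment typically looks like with Hugging Face `peft`. The base checkpoint name, target modules, dataset, and hyperparameters are assumptions for illustration, not the exact setup from the blog or paper.

```python
# Sketch: attach LoRA adapters to a small causal LM and finetune on domain text.
# Checkpoint name, target modules, and hyperparameters are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "your-org/tiny-20m-lm"  # hypothetical ~20M-parameter checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Low-rank adapters on the attention projections; the rest of the model stays frozen.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # depends on the architecture's layer names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically only a small fraction is trainable

# Toy training loop over a handful of domain sentences.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
domain_texts = ["Example sentence from the new domain.", "Another domain-specific fact."]
model.train()
for epoch in range(3):
    for text in domain_texts:
        batch = tokenizer(text, return_tensors="pt")
        out = model(**batch, labels=batch["input_ids"])
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("lora-domain-adapter")  # saves only the adapter weights
```

Full finetuning would skip the `LoraConfig` step and update all weights, while training from scratch starts from a randomly initialized model; the experiment compares where each option actually manages to absorb the new domain knowledge.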
And great sound mixing. We can finally hear the dialogue over the background music in a Nolan movie (even if it’s just the trailer).