But maybe the pending copyright lawsuits will have major impacts on GenAI.
With this in mind, what will be the next stage of AI development?
Advances have been driven by scaling: bigger compute, bigger data, bigger models. Moreover, larger data also offers a solution to OOD generalization: just increase the training set until everything is in domain!
arxiv.org/abs/2410.03662
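# Notebook syntax; drop the leading "!" when running in a shell
!pip install bioscan-dataset
from bioscan_dataset import BIOSCAN5M
# download=True fetches the data into the given folder
ds = BIOSCAN5M("~/Datasets/bioscan-5m", download=True)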
- multimodal learning
- fine-grained classification
- hierarchical labelling
- open-world classification/clustering
- semi- and self-supervised learning
arxiv.org/abs/2406.127...
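To make "hierarchical labelling" concrete, here is a toy sketch in plain Python. It does not use the bioscan-dataset API: the rank names, the dict layout, and the helper `deepest_matching_rank` are all assumptions made up for illustration. The idea is that a prediction which misses the species can still be right at a coarser rank.

```python
from typing import Optional

# Assumed ordering of ranks, coarse to fine (illustrative, not the package's schema)
RANKS = ("order", "family", "genus", "species")


def deepest_matching_rank(pred: dict, true: dict) -> Optional[str]:
    """Return the finest rank down to which pred agrees with true, or None."""
    deepest = None
    for rank in RANKS:
        # Stop at the first rank that is missing or disagrees
        if pred.get(rank) is None or pred.get(rank) != true.get(rank):
            break
        deepest = rank
    return deepest


# Hypothetical specimen and prediction
truth = {"order": "Diptera", "family": "Culicidae",
         "genus": "Aedes", "species": "Aedes aegypti"}
pred = {"order": "Diptera", "family": "Culicidae",
        "genus": "Culex", "species": "Culex pipiens"}

print(deepest_matching_rank(pred, truth))
```

Here the prediction gets the order and family right but the genus wrong, so it counts as correct down to family level.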