If you're interested in learning more, check out our paper, Data Laundering: arxiv.org/pdf/2412.15255
We also repeated the distillation process multiple times and found that the performance was maintained across generations.
We first trained a model on the GPQA test data, which obviously made it achieve 100% performance. But hey, don’t many LLMs train on test data anyway? 🙈
Then we trained a new model on separate (fair) data, but with a distillation loss from the cheating model.
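Here’s a minimal sketch of what that second stage looks like, assuming a standard knowledge-distillation setup in PyTorch. The names (`cheating_teacher`, `student`, `clean_batch`), the temperature, and the 50/50 loss weighting are illustrative assumptions, not the paper’s exact settings.

```python
import torch
import torch.nn.functional as F

def distillation_step(student, cheating_teacher, clean_batch, optimizer,
                      temperature=2.0, alpha=0.5):
    """One training step on fair data, guided by a teacher trained on test data.

    Hypothetical sketch: the student never sees the benchmark test set, but it
    still inherits the teacher's leaked knowledge through the soft labels.
    """
    inputs, labels = clean_batch

    # The teacher was trained on the benchmark test data; we only read its outputs.
    with torch.no_grad():
        teacher_logits = cheating_teacher(inputs)
    student_logits = student(inputs)

    # Standard cross-entropy on the (fair) training labels.
    ce_loss = F.cross_entropy(student_logits, labels)

    # KL divergence between softened student and teacher distributions.
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    loss = alpha * ce_loss + (1 - alpha) * kd_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

For the repeated-distillation experiment mentioned above, the trained student would simply serve as the teacher for the next round, with a fresh student trained on clean data each time.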