Ishika Agarwal
@wonderingishika.bsky.social
CS PhD @ UIUC | Data Efficiency NLP | Conversational AI | agarwalishika.github.io | same handle on twitter
5/6 Finally, using our influence values, we pick a small subset & fine-tune the model. In our evaluation, we use 4 SOTA influence functions -- NN-CIFT achieves the same performance while using a model 34,000x smaller!
February 17, 2025 at 4:06 AM
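For concreteness, here is a minimal Python sketch of this final step, assuming influence scores have already been estimated for every candidate example (e.g., by the InfluenceNetwork); the top-k rule and the fine-tuning call are illustrative placeholders, not the paper's exact subset-selection procedure.

```python
import numpy as np

def select_top_k(influence_scores: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k candidate examples with the highest estimated
    influence. A simple top-k rule stands in for whatever selection objective
    the paper actually optimizes."""
    return np.argsort(influence_scores)[::-1][:k]

# Hypothetical usage: keep the top 5% of a 100k-example pool, then fine-tune.
scores = np.random.rand(100_000)            # stand-in for estimated influence values
subset_idx = select_top_k(scores, k=5_000)  # indices of the chosen subset
# fine_tune(model, train_pool.select(subset_idx))  # placeholder fine-tuning call
```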
4/6 Second, we train the InfluenceNetwork using basic mini-batch gradient descent, then let it estimate the influence for the remaining data. Its influence estimates have a very low error of 0.067!
February 17, 2025 at 4:06 AM
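Here is a minimal PyTorch sketch of this training step, assuming the InfluenceNetwork is a small MLP that regresses a scalar influence value from an example embedding; the layer sizes and hyperparameters are illustrative guesses, not the paper's 205k-parameter architecture.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class InfluenceNetwork(nn.Module):
    """Tiny MLP mapping an example embedding to a scalar influence estimate
    (illustrative architecture, not the one from the paper)."""
    def __init__(self, embed_dim: int = 256, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def train_influence_network(features, targets, epochs: int = 10, lr: float = 1e-3):
    """Fit the network with plain mini-batch gradient descent on the influence
    values computed for the small seed fraction."""
    model = InfluenceNetwork(embed_dim=features.shape[1])
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    loader = DataLoader(TensorDataset(features, targets), batch_size=64, shuffle=True)
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
    return model

# Hypothetical usage: train on the seed split, then score the remaining data cheaply.
# model = train_influence_network(seed_feats, seed_scores)
# with torch.no_grad():
#     estimated_scores = model(remaining_feats)
```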
3/6 First, the neural network (called the “InfluenceNetwork”) needs to be trained. We compute influence values using existing methods -- but only for a tiny fraction of data (just 0.25%-5%).
February 17, 2025 at 4:06 AM
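A minimal Python sketch of this seeding step; `existing_influence_fn` is a hypothetical placeholder for whichever expensive influence function supplies the ground-truth values (e.g., one of the four evaluated in the paper).

```python
import random

def sample_seed_indices(num_examples: int, fraction: float = 0.01, seed: int = 0):
    """Pick the small seed subset (0.25%-5% of the pool) whose influence values
    will be computed exactly and then used as training targets for the
    InfluenceNetwork."""
    rng = random.Random(seed)
    k = max(1, int(num_examples * fraction))
    return rng.sample(range(num_examples), k)

# Hypothetical usage:
# seed_idx = sample_seed_indices(len(train_pool), fraction=0.01)
# seed_scores = [existing_influence_fn(train_pool[i]) for i in seed_idx]
```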
🚀Very excited about my new paper!

NN-CIFT slashes data valuation costs by 99% using tiny neural nets (205k params, just 0.0027% of 8B LLMs) while maintaining top-tier performance!
February 17, 2025 at 4:06 AM
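A back-of-envelope on the 99% figure, using illustrative numbers rather than the paper's accounting: if the expensive influence function is run on only a small seed fraction and the InfluenceNetwork's inference on the rest is comparatively negligible, the valuation cost shrinks roughly in proportion to that fraction.

```python
# Illustrative arithmetic (assumed numbers, not the paper's measurements):
seed_fraction = 0.01        # expensive influence values computed for ~1% of the data
cost_ratio = seed_fraction  # network inference on the rest assumed ~free by comparison
print(f"approximate cost saved: {1 - cost_ratio:.0%}")  # -> approximate cost saved: 99%
```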
3. Continual fine-tuning: given a fine-tuned model, enabling it to integrate new and complementary information while mitigating catastrophic forgetting. We find that reducing the dataset removes samples that hinder performance, outperforming fine-tuning on the full dataset.
November 17, 2024 at 7:27 PM
2. Task-specific fine-tuning: given an instruction-tuned model, refining the LLM's expertise in specific domains. We find that pruning the dataset removes noise and keeps relevant examples, achieving better performance than fine-tuning on the full dataset.
November 17, 2024 at 7:27 PM
1. Instruction tuning: given a base model, fine-tuning it to follow general instructions. We find that performance drops are minimal when reducing the dataset by 70%.
November 17, 2024 at 7:27 PM
I'm so excited to share my latest paper called DELIFT along with Krishnateja Killamsetty, Lucian Popa, and Marina Danilevsky at IBM Research 🎉

We tackle expensive fine-tuning by selecting a small subset of informative data that targets a model's weaknesses.
November 17, 2024 at 7:27 PM
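To make "targeting a model's weaknesses" concrete, here is a toy Python sketch that is not DELIFT's actual utility metric: it simply ranks candidate examples by the current model's loss on them and keeps the hardest ones (`compute_loss` is a hypothetical per-example scorer).

```python
def select_informative_subset(model, dataset, compute_loss, keep_fraction=0.3):
    """Toy weakness-targeted selection: rank examples by how poorly the current
    model handles them and keep the hardest `keep_fraction` (keeping 30%
    corresponds to the ~70% reduction mentioned above). Not DELIFT's method."""
    ranked = sorted(dataset, key=lambda ex: compute_loss(model, ex), reverse=True)
    k = max(1, int(len(dataset) * keep_fraction))
    return ranked[:k]
```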