Rahul G. Krishnan
@rahulgk.bsky.social
Assistant Professor at the University of Toronto

⚒️ 🏥 Deep learning and causal inference for computational medicine
There's lots more to do to understand CFT better, and to build on it to create better post-training methods for fine-tuning large language models.

Reach out to me or Ethan if you're interested in collaborating on this or pushing this idea to new domains and problems!
April 23, 2025 at 10:44 PM
📖 We’ve also open-sourced OpenMedText, integrating 121K biomedical articles & 29 medical textbooks to push future research in domain-adaptive fine-tuning in biomedicine.
April 23, 2025 at 10:44 PM
🔧 We also test "negative" and "adaptive" prompts, confirming that the semantic content of the prompts matters and impacts fine-tuning effectiveness.
April 23, 2025 at 10:44 PM
📊 Results: On medical benchmarks, CFT improves accuracy by ~2.25% over CPT; in finance, it boosts performance by ~4.32%! Importantly, these gains scale effectively with larger models. 📈

Check out Appendix E.1 for preliminary results on Gemini 1.5 Flash!
April 23, 2025 at 10:44 PM
🏥 We tested this idea in biomedical (using newly curated OpenMedText dataset of journals & textbooks!) and financial data—CFT significantly outperforms continued pretraining (CPT) and instruction fine-tuning (IFT) in zero-shot settings.
April 23, 2025 at 10:44 PM
🎓 Instead of using Q&A as in instruction tuning, CFT uses reflective instructions (e.g., "Reflect on how what you will see changes what you know...") motivated by how humans learn.
April 23, 2025 at 10:44 PM
💡 Contextual fine-tuning (CFT) uses contextual prompts during fine-tuning to adaptively shape the semantic understanding that LLMs leverage while learning new information.
April 23, 2025 at 10:44 PM
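Mechanically, one plausible way to wire up a CFT training example (my reading of the thread, not the authors' code) is to prepend a reflective contextual prompt to each document and mask the prompt tokens out of the next-token loss, so the model is trained to predict the new material while conditioning on the reflective instruction. The tokenizer, prompt text, and helper names below are toy placeholders.

```python
# Sketch of preparing one contextual fine-tuning (CFT) example.
# Assumptions (not from the paper): whitespace tokenization stands in
# for the model's tokenizer, and -100 is the loss ignore index
# (as in common deep learning frameworks).

CONTEXT_PROMPT = "Reflect on how what you will see changes what you know. "

def toy_tokenize(text):
    """Stand-in tokenizer; a real setup would use the model's tokenizer."""
    return text.split()

def build_cft_example(document, ignore_index=-100):
    """Return (inputs, labels): prompt tokens are masked out of the loss."""
    prompt_tokens = toy_tokenize(CONTEXT_PROMPT)
    doc_tokens = toy_tokenize(document)
    inputs = prompt_tokens + doc_tokens
    # Only document positions contribute to the loss; the contextual
    # prompt conditions the model but is not itself a training target.
    labels = [ignore_index] * len(prompt_tokens) + doc_tokens
    return inputs, labels

inputs, labels = build_cft_example("Aspirin inhibits platelet aggregation.")
```

Swapping CONTEXT_PROMPT for a "negative" or "adaptive" variant is then a one-line change, which is what makes the prompt-ablation experiments above easy to run.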
🚀 Problem: Language models struggle with rapidly evolving info and context in fields like medicine & finance. We need ways to teach LLMs new information and control how they absorb this knowledge.

🔍 Insight: Why not explain and teach LLMs how to learn?
April 23, 2025 at 10:44 PM
If it helps, I usually learn something new (either directly or from further digging) about the behavior of markets.
April 21, 2025 at 9:23 PM
Rocking that @ Gmail address!
January 31, 2025 at 3:48 PM
I thought about this a bit. I think helping PhD students close the translational gap from research to deployment (in industry or via their own startups), particularly if they don't want to go into academia, is one way forward.
December 21, 2024 at 9:07 PM
Are you around at Neurips? Would love to say hi and catch up!
December 12, 2024 at 6:10 PM
Finally, if you're interested in understanding how to leverage energy-based normalizing flows, check out Lance's work on Meow (chienfeng-hub.github.io/meow/)

He'll be presenting on Dec. 12, 11:00 AM–2:00 PM at West Ballroom A-D #6403

🧵(7/7)
Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow
December 11, 2024 at 12:20 AM
@nikitadhawan.bsky.social developed NATURAL (www.cs.toronto.edu/~nikita/natu...) with @cottascience.bsky.social, Karen & @cmaddis.bsky.social. It's an end-to-end pipeline that starts from raw-text data and ends with a causal (**) effect associated with an intervention.

(**) conditions apply
🧵(6/7)
December 11, 2024 at 12:20 AM
b] Billions of dollars are spent each year on trials to assess interventions.

Can we use crowdsourced data to know which intervention is likely to work ahead of time?

Doing so requires answering a causal question!

But the data to answer this question is locked in unstructured text.

🧵(5/7)
December 11, 2024 at 12:20 AM
Find Vahid to learn more about in-context causal inference and lots of other cool problems that he spends his time thinking about!

🧵(4/7)
December 11, 2024 at 12:20 AM
In arxiv.org/abs/2404.07266, Vahid shows how to use offline expert data with unobserved confounding to guide decision making using a nonparametric prior to guide learning policies for bandits, MDPs, and POMDPs.

Thu, Dec 12, 4:30–7:30 PM PST, West Ballroom A-D, Poster #6708

🧵(3/7)
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
We study the problem of online sequential decision-making given auxiliary demonstrations from experts who made their decisions based on unobserved contextual information. These demonstrations can be v...
arxiv.org
December 11, 2024 at 12:20 AM
a] Today, we learn from data and treat it as ground truth -- should we?

A doctor often knows more about their patient than is represented in electronic medical records.

A teacher knows more about their students than what their grades suggest.

🧵(2/7)
December 11, 2024 at 12:20 AM