Faton
@lauler.bsky.social
Semi-related question:

Have you experimented with using LLMs to a) synthetically generate QA pairs based on source documents, b) generate hard negatives, and c) generate paraphrases?

Does it help with performance? I'm mostly considering this for languages other than English.
March 29, 2025 at 4:48 PM
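A minimal sketch of what (a) and (b) could look like, assuming sentence-transformers for the retrieval side; `llm_generate` is a hypothetical stand-in for whatever LLM API is used, and the model name is only an assumption:

```python
from sentence_transformers import SentenceTransformer, util

def llm_generate(prompt: str) -> str:
    """Hypothetical LLM call; swap in your provider's client."""
    raise NotImplementedError

documents = ["Glaciers form when snow accumulates faster than it melts ..."]

# (a) Synthetic QA pairs: ask the LLM for a question each passage answers,
# keeping the question in the passage's own language.
qa_pairs = []
for doc in documents:
    question = llm_generate(
        "Write one question, in the same language as the passage, "
        f"that the following passage answers:\n\n{doc}"
    )
    qa_pairs.append((question, doc))

# (b) Hard negatives: passages ranked highly for the query that are
# not the actual positive.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
corpus_emb = model.encode(documents, convert_to_tensor=True)
for question, positive in qa_pairs:
    q_emb = model.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, corpus_emb, top_k=10)[0]
    hard_negatives = [documents[h["corpus_id"]] for h in hits
                      if documents[h["corpus_id"]] != positive]
```

Point (c), paraphrases, follows the same pattern as (a) with a different prompt.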
Any chance your README could include a more detailed guide on how to set up pretraining? I.e., being explicit about the expected data format, data prep steps, launch commands for distributed training, etc.

Would be very helpful and appreciated.
December 20, 2024 at 12:09 AM
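For concreteness, the kind of "expected data format" detail being requested might look like the sketch below, assuming the common one-JSON-object-per-line (JSONL) convention. This is purely illustrative and not the repo's actual format, which is unknown:

```python
import json

# Hypothetical pretraining corpus: one document per line, under a "text" key.
docs = ["First training document ...", "Second training document ..."]
with open("train.jsonl", "w", encoding="utf-8") as f:
    for d in docs:
        f.write(json.dumps({"text": d}, ensure_ascii=False) + "\n")
```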
Some questions:

1. What's your intuition regarding the possibility of doing sbert model distillation on models trained with query prefixes? Can we just use regular parallel texts, or would we need prefixed parallel texts?
2. Matryoshka-aware model distillation should be possible, right?
December 5, 2024 at 2:59 PM
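A minimal sketch of both ideas in plain PyTorch: for (1), the prefix can live only on the teacher side, so ordinary parallel texts suffice and the student never needs a prefix; for (2), a Matryoshka-aware distillation loss can match teacher and student at several truncated dimensionalities. Model names, the "query: " prefix, and the dimension list are assumptions (both models assumed 768-dim):

```python
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

teacher = SentenceTransformer("intfloat/multilingual-e5-base")        # assumed prefix-trained teacher
student = SentenceTransformer("distilbert-base-multilingual-cased")   # hypothetical student

def matryoshka_mse(student_emb, teacher_emb, dims=(768, 512, 256, 128, 64)):
    """MSE between L2-normalized truncations of student and teacher embeddings."""
    loss = 0.0
    for d in dims:
        s = F.normalize(student_emb[:, :d], dim=-1)
        t = F.normalize(teacher_emb[:, :d], dim=-1)
        loss = loss + F.mse_loss(s, t)
    return loss / len(dims)

texts = ["how do glaciers form?", "hur bildas glaciärer?"]  # regular parallel texts

# The prefix is applied only when embedding with the teacher.
with torch.no_grad():
    teacher_emb = teacher.encode(["query: " + t for t in texts], convert_to_tensor=True)

# The student forward must keep gradients, so tokenize and call the model
# directly instead of using encode(), which disables them.
features = student.tokenize(texts)
student_emb = student(features)["sentence_embedding"]

loss = matryoshka_mse(student_emb, teacher_emb)
loss.backward()
```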