Katia Schwerzmann
@katschwerzmann.bsky.social
Philosophy, Technology, and the Body—Toward Justice
www.katiaschwerzmann.net
Love your music rec!
August 29, 2025 at 8:47 AM
I'd be interested. Is the ticket still available?
May 22, 2025 at 12:45 PM
Thank you for building this reading list on AI and education!
May 19, 2025 at 9:12 AM
Thank you for sharing your reading @bildoperationen.bsky.social. I am glad it resonated. Working on a critique of generative AI in the context of research and education is currently a somewhat lonely endeavor. I hope more researchers will join. We need to tackle this from a plurality of approaches.
May 19, 2025 at 7:00 AM
Exciting Alex!!!!
February 24, 2025 at 11:07 AM
Reposted by Katia Schwerzmann
The workshop is organized by our Thyssen@KWI Fellow
@katschwerzmann.bsky.social
with talks by @floriansprenger.bsky.social | @alexcampolo.bsky.social |
@rainermuehlhoff.bsky.social | @moritzhiller.bsky.social
Further Information here: www.kulturwissenschaften.de/veranstaltun...
Kulturwissenschaftliches Institut Essen (KWI)
The Kulturwissenschaftliches Institut Essen (KWI) is an interdisciplinary research college for the humanities and cultural studies in the tradition of international Institutes for Advanced Study. As...
www.kulturwissenschaften.de
February 12, 2025 at 11:30 AM
This language signals the ML community's tendency—or rather desire—to make the human factor, in particular researchers' judgement, evaluation, and labor, disappear from view and from model training. The rule-based component of the reward model is interesting, though.
February 5, 2025 at 2:25 PM
One also notices once again the type of naturalizing language that @alexcampolo.bsky.social and I critically analyze in our work. For instance: "During training, DeepSeek-R1-Zero **naturally** emerged with numerous powerful and interesting reasoning behaviors" (p. 3).
February 5, 2025 at 2:25 PM
So what exactly is new? That LLMs can do well in math reasoning and coding without relying on supervised learning, yet still not well enough in natural language tasks to forgo supervised fine-tuning in the end?
February 5, 2025 at 2:25 PM
The pure RL phase concerns math and coding problems only. The reward model assesses the base model's solution to "deterministic" math problems through "rule-based verification of correctness," while for coding problems "a compiler can be used to generate feedback based on predefined test cases" (p. 6).
February 5, 2025 at 2:25 PM
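To picture the kind of "rule-based verification" the paper gestures at, here is a minimal Python sketch of such a reward signal. The <answer> tag format, the function names, and the use of a Python interpreter in place of a compiler are assumptions made for illustration, not DeepSeek's implementation.

# Illustrative sketch only, not DeepSeek's code. The <answer> tag format,
# the function names, and the Python interpreter standing in for a compiler
# are assumptions for the sake of the example.
import re
import subprocess
import sys
import tempfile

def math_reward(model_output: str, reference_answer: str) -> float:
    """Rule-based check: reward 1.0 if the tagged answer matches the reference, else 0.0."""
    match = re.search(r"<answer>(.*?)</answer>", model_output, re.DOTALL)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == reference_answer.strip() else 0.0

def code_reward(generated_code: str, test_cases: str) -> float:
    """Execution-based check: reward 1.0 if the generated code passes predefined test cases."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(generated_code + "\n\n" + test_cases)
        path = f.name
    try:
        result = subprocess.run([sys.executable, path], capture_output=True, timeout=10)
    except subprocess.TimeoutExpired:
        return 0.0
    return 1.0 if result.returncode == 0 else 0.0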
"We create new SFT data through rejection sampling on the RL checkpoint, combined with supervised data from DeepSeek-V3 in domains such as writing, factual QA, and self-cognition, and then retrain the DeepSeek-V3-Base model" (p.2). This seems to be a quite classical fine-tuning process.
February 5, 2025 at 2:25 PM
"We create new SFT data through rejection sampling on the RL checkpoint, combined with supervised data from DeepSeek-V3 in domains such as writing, factual QA, and self-cognition, and then retrain the DeepSeek-V3-Base model" (p.2). This seems to be a quite classical fine-tuning process.
To tackle this issue, the "DeepSeek-V3-Base model" is fine-tuned using the kind of data commonly used in supervised fine-tuning (SFT):
February 5, 2025 at 2:25 PM
"Pure RL" means here RL that doesn't rely on supervised learning with annotated data. But then, one reads on p. 2 that "DeepSeek-R1-Zero encounters challenges such as poor readability, and language mixing," which is indeed quite a problem for a large language model.
February 5, 2025 at 2:25 PM
"Pure RL" means here RL that doesn't rely on supervised learning with annotated data. But then, one reads on p. 2 that "DeepSeek-R1-Zero encounters challenges such as poor readability, and language mixing," which is indeed quite a problem for a large language model.