Lightnews — Scholar-powered news

Reposted by Karolus Sariola

Rafael Pinto

@rcpinto.bsky.social

Amazingly informative.

July 13, 2025 at 2:16 AM

Karolus Sariola

@ksariola.bsky.social

“Comprehensive” is the new “delve”

June 19, 2025 at 1:57 PM

Reposted by Karolus Sariola

José Valim

@josevalim.bsky.social

Learn how Remote became a unicorn in two years and grew from zero to a team of more than 100 Elixir engineers: elixir-lang.org/blog/2025/01...

Remote: growing from zero to unicorn with Elixir

A case study of how Elixir is being used at Remote.

elixir-lang.org

January 21, 2025 at 4:16 PM

Reposted by Karolus Sariola

Ted Underwood

@tedunderwood.com

Did you know that attention across the whole input span was inspired by the time-negating alien language in Arrival? Crazy anecdote from the latest Hard Fork podcast (by @kevinroose.com and @caseynewton.bsky.social). HT nwbrownboi on Threads for the lead.

Transcript of Hard Fork ep 111: Yeah. And I could talk for an hour about transformers and why they are so important.
But I think it's important to say that they were inspired by the alien language in the film Arrival, which had just recently come out.
And a group of researchers at Google, one researcher in particular, who was part of that original team, was inspired by watching Arrival and seeing that the aliens in the movie had this language which represented entire sentences with a single symbol. And they thought, hey, what if we did that inside of a neural network? So rather than processing all of the inputs that you would give to one of these systems one word at a time, you could have this thing called an attention mechanism, which paid attention to all of it simultaneously.
That would allow you to process much more information much faster. And that insight sparked the creation of the transformer, which led to all the stuff we see in AI today.

December 1, 2024 at 2:50 PM

Karolus Sariola

@ksariola.bsky.social

November 29, 2024 at 10:01 PM

Karolus Sariola

@ksariola.bsky.social

how cool is this cover

November 29, 2024 at 6:57 PM

Karolus Sariola

@ksariola.bsky.social

Two Apes Incapable of Understanding the Mystery of the Monolith

[Fischli/Weiss, Fondazione Prada]

November 27, 2024 at 9:00 PM

Reposted by Karolus Sariola

Zachary Lipton

@zacharylipton.bsky.social

Medically adapted foundation models (think Med-*) turn out to be more hot air than hot stuff. Correcting for fatal flaws in evaluation, the current crop are no better on balance than generic foundation models, even on the very tasks for which benefits are claimed.
arxiv.org/abs/2411.04118

Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress?

Several recent works seek to develop foundation models specifically for medical applications, adapting general-purpose large language models (LLMs) and vision-language models (VLMs) via continued pret...

arxiv.org

November 26, 2024 at 6:12 PM

Reposted by Karolus Sariola

Daniel van Strien

@danielvanstrien.bsky.social

Knowledge about what works for creating data pipelines for #LLM pretraining datasets is increasingly being shared more openly.

This paper goes a step further by focusing on reducing the compute required to build a dataset and train an LLM for a low-resource language.
huggingface.co/papers/2411....

Paper page - UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages

Join the discussion on this paper page

huggingface.co

November 22, 2024 at 11:03 AM

Reposted by Karolus Sariola

Sung Kim

@sungkim.bsky.social

Let's Build a Simple Database

Writing a sqlite clone from scratch in C

cstack.github.io/db_tutorial/

November 20, 2024 at 5:54 AM

Reposted by Karolus Sariola

Shaily

@shaily99.bsky.social

I will be at #EMNLP2024 presenting our work on "Extrinsic Evaluation of Cultural Competence in Large Language Models" in Poster Session 12 on Thursday 2-3:30 PM.

In this work we take the first steps towards asking whether LLMs can cater to diverse cultures in *user-facing generative* tasks.

[1/7]

Paper titled “Extrinsic Evaluation of Cultural Competence in Large Language Models” by Shaily Bhatt and Fernando Diaz. Along with a figure showing an example from our data. We have two tasks: Question Answering and Story Generation. We collected outputs for 345 QA and 35 story topics, 2 temperatures, 6 LLMs and 193 nationalities. The image shows two example outputs, one from India and one from the USA. The QA example shows outputs for the topic of “legislature”, in the US output words like “United States”, “Senate”, and “House of Representatives” are highlighted. The India output has “India”, “Lok Sabha (House of the People)” and “Rajya Sabha (Council of States)” highlighted. In the case of the story, outputs for India and the US for the topic of “honesty” are shown. For the US, the words “America”, “Tommy”, “park”, and “shiny red apple” are highlighted, while for India, the words “India”, “Raj”, and “mango tree” are highlighted.

November 9, 2024 at 5:24 PM

Reposted by Karolus Sariola

Sung Kim

@sungkim.bsky.social

A finding from Text REtrieval Conference (TREC) 2024, a gold standard in information retrieval.

"LLM-as-a-judge" can replace fully manual judgments to accurately capture run-level effectiveness. Adding LLM assistance to human assessment also does not appear to increase correlation with fully manual assessments.

November 14, 2024 at 6:24 AM

Karolus Sariola

@ksariola.bsky.social

Note to self to make tteokbokki 🇰🇷🌶️ more often!

November 12, 2024 at 5:03 AM

Karolus Sariola

@ksariola.bsky.social

Do it right

November 12, 2024 at 4:58 AM

Karolus Sariola

@ksariola.bsky.social

Pretty dark in Helsinki pre-snow on a November.. but in my mind's eye I am back at our Portugal trip

November 10, 2024 at 5:18 PM

Karolus Sariola

@ksariola.bsky.social

November 10, 2024 at 5:14 PM

Karolus Sariola

@ksariola.bsky.social

Hi! I'm Karolus. Nice to meet you! I create evaluation strategies for LM systems for a living. I am a co-founder in a skilled group of 5 engineers known as Flow AI 🇫🇮 🇪🇸 🇵🇰. Besides customer work, we regularly release small and capable open-source evaluator models for the public to advance the field.

November 10, 2024 at 5:00 PM

Karolus Sariola

@ksariola.bsky.social

Summer throwback

November 10, 2024 at 4:30 PM

Karolus Sariola

@ksariola.bsky.social

Applies not only to engineers

November 10, 2024 at 4:25 PM

Reposted by Karolus Sariola

Sung Kim

@sungkim.bsky.social

LLM Prompt Tuning Playbook

This document is for anyone who would like to get better at prompting post-trained LLMs. We assume that readers have had some basic interactions with some sort of LLM (e.g. Gemini), but we do not assume a rigorous technical understanding.

github.com/varungodbole...

November 9, 2024 at 3:22 PM

Reposted by Karolus Sariola

M A Osborne

@maosbot.bsky.social

New here? Interested in AI/ML? Check out these great starter packs!

AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS

You can also search all starter packs here: blueskydirectory.com/starter-pack...

November 9, 2024 at 9:13 AM

Karolus Sariola

@ksariola.bsky.social

Peak Nokia era Finland was a banger. Convince me otherwise.

October 28, 2024 at 4:25 PM

Reposted by Karolus Sariola

Nathan Lambert

@natolambert.bsky.social

This is a pipeline we're seeing again and again for curating synthetic data for specific domains. You need:
1. Diversity,
2. Quality responses, and
3. Verification.

AI-Assisted Generation of Difficult Math Questions
Shah et al.

When you do this stuff, plz release the data ;) - "plan to release"...

October 21, 2024 at 1:16 PM
