Gilsinia Lopez
@lopezgg.bsky.social
Catholic, Indian & Scientist

MSFT: I help Phi understand longer context and fine-tune LLMs for domain-specific knowledge

(background yellow-rumped warbler from @carlbergstrom.com)
Reposted by Gilsinia Lopez
This article impressed me and gave me great hope for the papacy of Leo XIV.
"The quiet sixty-nine-year-old American, Robert Francis Prevost, friar of the Order of St. Augustine, slipped past the bookmakers and the pundits."

@austeni.bsky.social explains how Robert Prevost became Leo XIV:
www.commonwealmagazine.org/ivereigh-pre...
Bridge Builder
In the final years of Francis's papacy, Robert Prevost became one of the pontiff's closest collaborators. What clues does their bond hold about Leo's pontificate?
www.commonwealmagazine.org
May 22, 2025 at 7:26 PM
Reposted by Gilsinia Lopez
Crack for Latin nerds like me.
So I don't have the energy to redo the entire thread about the papal conclave, but I will redo the part about the Latin nerdery: the announcement of a new pope is always in Latin, and uses a very traditional form.
May 8, 2025 at 10:50 PM
Reposted by Gilsinia Lopez
Policy Gradients chapter of RLHF Book is MUCH improved after all the wonderful GRPO discussions in the last few weeks 🥰
(still open to bug reports)
Policy Gradient Algorithms | RLHF Book by Nathan Lambert
The Reinforcement Learning from Human Feedback Book
buff.ly
March 28, 2025 at 2:49 AM
Reposted by Gilsinia Lopez
If you've ever wanted to learn how the transformer architecture in general or multi-head latent attention works, here's an excellent visual explainer: www.youtube.com/watch?v=0VLA...
The Genius of DeepSeek’s 57X Efficiency Boost [MLA]
YouTube video by Welch Labs
www.youtube.com
March 9, 2025 at 5:46 PM
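For anyone who prefers code to video, here is a rough sketch of the core trick behind multi-head latent attention (MLA): project keys and values through a small shared latent and cache only that latent. The sizes below are made up for illustration, and the real architecture's decoupled RoPE path is omitted.

```python
# Illustrative sketch of MLA's core idea: compress K/V into a small shared
# latent so the KV cache stores only the latent, not per-head keys/values.
import torch
import torch.nn as nn

d_model, n_heads, d_head, d_latent = 512, 8, 64, 64  # illustrative sizes

W_dkv = nn.Linear(d_model, d_latent, bias=False)          # down-projection (output is cached)
W_uk = nn.Linear(d_latent, n_heads * d_head, bias=False)  # up-projection to per-head keys
W_uv = nn.Linear(d_latent, n_heads * d_head, bias=False)  # up-projection to per-head values

x = torch.randn(1, 10, d_model)  # (batch, seq_len, d_model)
latent = W_dkv(x)                # only this (seq_len, d_latent) tensor enters the KV cache
k = W_uk(latent).view(1, 10, n_heads, d_head)
v = W_uv(latent).view(1, 10, n_heads, d_head)

# Standard KV cache: 2 * n_heads * d_head = 1024 values per token.
# Latent cache: d_latent = 64 values per token, a 16x reduction at these sizes.
```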
Reposted by Gilsinia Lopez
A Catholic nun was the first U.S. woman to earn a Ph.D. in computer science. — History Facts historyfacts.com/science-indu...
A Catholic nun was the first U.S. woman to earn a Ph.D. in computer science.
Although computer science goes back to the 19th century, the academic field really came into its own in the early 1960s. The first United States graduates with advanced computer science degrees emerge...
historyfacts.com
March 1, 2025 at 5:34 PM
Reposted by Gilsinia Lopez
What are GGUF, Safetensors, PyTorch, ONNX?

In this blog post, let's discover common formats for storing an AI model.

huggingface.co/blog/ngxson/...
Common AI Model Formats
A Blog post by Xuan-Son Nguyen on Hugging Face
huggingface.co
February 27, 2025 at 5:11 PM
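As a quick taste of what that post covers, here is a hedged sketch contrasting two of those formats: PyTorch's pickle-based checkpoints versus safetensors, which stores raw tensor bytes behind a JSON header and runs no code on load. The tensor names are made up for illustration; this is not the blog's own code.

```python
# A rough sketch: PyTorch's pickle-based checkpoint vs. the safetensors format.
import torch
from safetensors.torch import save_file, load_file

weights = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.zeros(4)}

torch.save(weights, "model.pt")          # PyTorch format: pickle under the hood
save_file(weights, "model.safetensors")  # safetensors: safe to load from untrusted sources

restored = load_file("model.safetensors")  # returns a dict of tensors
assert torch.equal(weights["linear.bias"], restored["linear.bias"])
```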
Reposted by Gilsinia Lopez
If you ARE an AI, here's a free PDF—have at it www.probabilistic-numerics.org/textbooks/
Probabilistic Numerics | Textbooks
Quantifying Uncertainty in Computation.
www.probabilistic-numerics.org
February 23, 2025 at 4:39 PM
Reposted by Gilsinia Lopez
After 6+ months in the making and over a year of GPU compute, we're excited to release the "Ultra-Scale Playbook": hf.co/spaces/nanot...

A book to learn all about 5D parallelism, ZeRO, CUDA kernels, and how/why to overlap compute & comms, with theory, motivation, interactive plots and 4000+ experiments!
The Ultra-Scale Playbook - a Hugging Face Space by nanotron
The ultimate guide to training LLMs on large GPU clusters
hf.co
February 19, 2025 at 6:10 PM
Reposted by Gilsinia Lopez
This excellent interactive tutorial on misleading data visualizations explores the idea of a "counter chart" — the graph you draw in response to refute a misleading claim

flowingdata.com/projects/dis...
Defense Against Dishonest Charts
This is a guide to protect ourselves and to preserve what is good about turning data into visual things.
flowingdata.com
February 15, 2025 at 6:48 AM
Reposted by Gilsinia Lopez
The best part of this, which Luca isn't highlighting to start, is that we trained a way better OLMoE for this too.

All from better annealing and post-training. Didn't need to redo pre-training. Goes to show how much potential these models have!

new instruct model: huggingface.co/allenai/OLMo...
February 11, 2025 at 3:16 PM
Reposted by Gilsinia Lopez
"The Tears of Things" by Richard Rohr invites us to explore the wisdom of the Hebrew prophets.

He writes “Power distorts truth, so God plants and develops it at the edge, where the power-hungry least expect it,” inviting us to the “edge of the inside.” tinyurl.com/46z9574r
January 29, 2025 at 3:19 PM
Tutorial on scaling LMs with JAX
jax-ml.github.io/scaling-book/
February 7, 2025 at 4:53 AM
Reposted by Gilsinia Lopez
This paper is wild - a Stanford team shows the simplest way to make an open LLM into a reasoning model

They used just 1,000 carefully curated reasoning examples & a trick where, if the model tries to stop thinking, they append "Wait" to force it to continue. Near o1 performance at math. arxiv.org/pdf/2501.19393
February 7, 2025 at 2:53 AM
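A minimal sketch of that "Wait" trick (the paper calls it budget forcing), assuming a Hugging Face causal LM. The model name, prompt, and token budget below are illustrative, not the paper's exact setup.

```python
# Hedged sketch of budget forcing: if the model emits EOS before spending its
# thinking budget, strip the EOS, append "Wait," and keep generating.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen2.5-7B-Instruct"  # assumption: any open instruct model works here
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)

ids = tok("Think step by step: what is 17 * 24?", return_tensors="pt").input_ids
budget, spent = 1024, 0
while spent < budget:
    out = model.generate(ids, max_new_tokens=budget - spent)
    spent += out.shape[1] - ids.shape[1]
    if spent >= budget or out[0, -1] != tok.eos_token_id:
        ids = out  # budget exhausted, or the model stopped for another reason
        break
    # The model tried to stop thinking: replace EOS with "Wait," and continue.
    wait = tok(" Wait,", add_special_tokens=False, return_tensors="pt").input_ids
    ids = torch.cat([out[:, :-1], wait], dim=1)
print(tok.decode(ids[0], skip_special_tokens=True))
```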
Reposted by Gilsinia Lopez
o3-mini is really good at writing internal documentation - feed it a codebase, get back a detailed explanation of how specific aspects of it work https://simonwillison.net/2025/Feb/5/o3-mini-documentation/
o3-mini is really good at writing internal documentation
I wanted to refresh my knowledge of how the Datasette permissions system works today. I already have extensive hand-written documentation (https://docs.datasette.io/en/latest/authentication.html) for that, but I thought it would be interesting to …
simonwillison.net
February 5, 2025 at 6:09 AM
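A hedged sketch of the workflow described above: gather a codebase's source files, send them to o3-mini, and print the generated documentation. The "datasette" path and the prompt are illustrative assumptions, and this is not Simon's exact tooling.

```python
# Minimal sketch: concatenate source files and ask a reasoning model for docs.
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Concatenate the codebase into one prompt-friendly string.
codebase = "\n\n".join(
    f"# {path}\n{path.read_text()}" for path in Path("datasette").rglob("*.py")
)

response = client.chat.completions.create(
    model="o3-mini",
    messages=[{
        "role": "user",
        "content": "Write detailed internal documentation explaining how the "
                   "permissions system in this codebase works:\n\n" + codebase,
    }],
)
print(response.choices[0].message.content)
```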
Reposted by Gilsinia Lopez
The main foundation-model-training companies spend a lot on curating their data these days. Whereas it used to be some simple quality filters, it's now a complex multi-stage pipeline. But yeah, no one usually shares statistical bias and variance analyses with their benchmarks.
January 29, 2025 at 2:52 PM
Reposted by Gilsinia Lopez
Aaaaah good timing, published today!

"we introduce Mini Worldlit, a manually curated dataset of 1,192 works of contemporary fiction from 13 countries, representing nine languages"

By @andrewpiper.bsky.social, @dbamman.bsky.social, Christina Han, Jens Bjerring-Hansen, @hoytlong.bsky.social, et al.
January 27, 2025 at 3:40 PM
Reposted by Gilsinia Lopez
If you want to quickly catch up on all the open modeling things (DeepSeek, ModernBERT, etc.), this was a great overview, by @natolambert.bsky.social.

I somehow got into an argument last week with someone who was insisting that all models are industrial black boxes... and I wish I'd had this on hand.
The latest open artifacts (#6): Reasoning models, China's lead in open-source, and a growing multimodal space
Artifacts log 6. The open LM ecosystem yet again accelerates.
www.interconnects.ai
January 27, 2025 at 3:05 PM
Reposted by Gilsinia Lopez
Including a model with a context length of 4M tokens!
Everything that was released this past week in open AI 🤠

> Link to all models, datasets, demos huggingface.co/collections/...
> Text-readable version is here huggingface.co/posts/merve/...
January 17, 2025 at 4:31 PM
Reposted by Gilsinia Lopez
The blog post by the late Felix Hill is powerful. Stress for AI researchers today is real.

I did not know Felix Hill, and I am sorry for those who did.
This story is perhaps a reminder for students, postdocs, founders and researchers to take care of their well-being.

medium.com/@felixhill/2...
200bn Weights of Responsibility
The Stress of Working in Modern AI
medium.com
January 4, 2025 at 10:44 AM
Reposted by Gilsinia Lopez
Free course on Agents by Hugging Face. We just added a chapter on agents to the smol course. Naturally, using smolagents! The course covers these topics:

- Code agents
- Retrieval agents
- Custom functional agents

If you're building agent applications, this course should help.
January 13, 2025 at 10:00 AM
Visited the exposition of St. Francis Xavier
www.soultravelling.in/blog/know-al...

So, here you go
“For those who believe, no explanation is necessary. For those who do not believe, no explanation is possible.”
January 9, 2025 at 2:08 PM
Oh, and another quote I saw there that is very dear to me

"The outward adornment of the body should be a reflection of the inner virtue of the soul."
St. Thomas Aquinas' teachings in the Summa Theologica
January 9, 2025 at 2:05 PM
Reposted by Gilsinia Lopez
🚀 With Meta's recent paper replacing tokenization in LLMs with patches 🩹, I figured that it's a great time to revisit how tokenization has evolved over the years using everyone's favourite medium - memes!

Let's take a trip down memory lane!

[1/N]
December 16, 2024 at 5:31 PM