Federico Dominguez
federicodm.bsky.social
MSCAPP @UChicago | Economics @ITAM_mx 🇲🇽

Data Scientist / Engineer @ Energy and Environment Lab | Hip Hop Dancer
Reposted by Federico Dominguez
New paper from Vargas and colleagues theorizes & measures "Academic Copaganda": "studies contesting social movement claims by authors (1) masking their conflicts of interest, or (2) espousing police epistemology."

www.cambridge.org/core/journal...
February 3, 2025 at 4:23 PM
Nice 😅
#databs
April 25, 2025 at 6:12 PM
Every once in a while I bump into some good old RegEx #databs
March 18, 2025 at 9:18 PM
Reposted by Federico Dominguez
and that’s why I’m working on a Saturday morning 🫠
February 22, 2025 at 7:21 PM
My life at work right now...
#databs
February 16, 2025 at 8:17 PM
Reposted by Federico Dominguez
You have now completed January 2025.
February 1, 2025 at 5:13 PM
One more day of data cleaning in this politically troubled world #databs
January 29, 2025 at 8:17 PM
Reposted by Federico Dominguez
I got laid off suddenly and I'm looking for a new gig — remote or Denver based data scientist/ML engineer roles

I'm good enough at coding to not break prod and good enough at humans that I used to be a therapist

Reach out via the email on the resume I've linked below or DM with leads, thank you!!
January 21, 2025 at 5:29 PM
C'mon Spotify, let me sort the album by number of plays (!!!) #databs
January 21, 2025 at 4:33 PM
I used to know several Pandas and SQL functions by heart... now they're just gone from my brain #databs
January 14, 2025 at 8:19 PM
Reposted by Federico Dominguez
Good list, would be good to hear additions from #dataBS, I would add "what are embeddings" by @vickiboykis.com -> www.latent.space/p/2025-papers
The 2025 AI Engineering Reading List
We picked 50 papers/models/blogs across 10 fields in AI Eng: LLMs, Benchmarks, Prompting, RAG, Agents, CodeGen, Vision, Voice, Diffusion, Finetuning. If you're starting from scratch, start here.
www.latent.space
January 2, 2025 at 7:42 PM
Data To-do list for 2025:

- Learn an orchestration tool (currently choosing between dbt / dagster)
- Learn RAG
- Learn LLM fine-tuning
- Learn another language (I’m choosing between Rust and Go)
- Learn basic Computer Vision

Sounds pretty unrealistic, but oh well 😆

#databs
December 31, 2024 at 8:02 PM
Reposted by Federico Dominguez
I learned A TON from running End to End Machine Learning for 6 years. I've tried to capture some of it here.

More than anything I'm grateful to all of you who were a part of it. You were the heart.

www.brandonrohrer.com/e2eml_lessons
December 31, 2024 at 3:43 AM
My reaction when I open Twitter and see nonsense immigration discussions
ALT: a man in a striped shirt is holding a cup
media.tenor.com
December 28, 2024 at 7:15 PM
Reposted by Federico Dominguez
Ecosystem of 350k models and datasets on @hf.co's hub, connected to other models or datasets they were derived from (where that information exists, which is not the case for another 1M entities) #databs public.graphext.com/d9275e17cc44...
Color, filter and search within each of the 30 variables...
December 22, 2024 at 1:57 PM
Me reading AWS' documentation whenever I try to do something with boto3 #databs 🫠
ALT: a close up of a piece of paper that says ' i ' on it
media.tenor.com
December 13, 2024 at 3:32 AM
Reposted by Federico Dominguez
I feel seen.

(h/t to @forrestbrazeal)
December 11, 2024 at 12:00 AM
Reposted by Federico Dominguez
When my PR reviewer tells me I should use this architecture or that software pattern

#dataBS
December 8, 2024 at 10:36 PM
Beginning my transition from pandas to polars… 🐼 —> 🐻‍❄️

Will likely be a pain, but the faster execution times may be worth it #databs
December 9, 2024 at 6:10 PM
Discovering that there's a Dark Theme in AWS made my day brighter 🤩 #databs
December 7, 2024 at 3:27 AM
Reposted by Federico Dominguez
Retrieval-Augmented Generation: Clearly explained 🔥

#LLMs #datasky #mlsky
December 6, 2024 at 9:42 PM
Currently refactoring a pretty ugly codebase (which I wrote myself 😅)
December 2, 2024 at 5:43 PM
Reposted by Federico Dominguez
Data is the most fundamental requirement to build a good machine learning system.

Common challenges:

- Not having enough of it
- Poor-quality
- Nonrepresentative training data
- Not enough relevant features
- Too many irrelevant features
- Low diversity

Garbage in, garbage out.
November 30, 2024 at 2:00 PM