Ryan Rosario
datajunkie.bsky.social
Software Engineer at Google (Kubernetes for AI/ML)
Lecturer at UCLA Computer Science
Statistics Ph.D., UCLA

Machine learning, natural language processing, psychometrics, database systems.

Opinions my own.
Most, if not all, of us who teach and/or do research feel a certain way about what’s going on right now. It was surreal to see UCOP explicitly call it out in a recent (public) document. It made my heart skip a beat.
May 4, 2025 at 6:59 AM
If any of you are thinking of upgrading to Claude Max, don't. Save your money. It has the same ridiculous limits on input and conversation length.
April 27, 2025 at 11:13 PM
Whenever I introduce TCP or other network connections, I introduce the concept with two bros, Connor and Logan. Why? Because not much data is exchanged, yet the handshake is important.
April 8, 2025 at 4:04 AM
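A minimal sketch of the Connor-and-Logan exchange over a loopback TCP connection, using only Python's standard library. The handshake itself is done by the operating system inside `connect` and `accept`; the function name `run_exchange` and the three-byte greeting are illustrative assumptions, not anything from the lecture.

```python
import socket
import threading


def run_exchange(greeting: bytes = b"sup") -> bytes:
    """Open a TCP connection on localhost, send a tiny greeting, return the echo."""
    # "Logan" listens on an ephemeral port chosen by the OS.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        # accept() returns once the three-way handshake has completed.
        conn, _ = server.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(data)  # echo the greeting back

    worker = threading.Thread(target=serve)
    worker.start()

    # "Connor" connects; the SYN, SYN-ACK, ACK exchange happens inside this call.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(greeting)
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply
```

The setup and teardown take far more packets than the three bytes of payload, which is the point of the joke.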
It's getting to the point that I need to consider canceling my subscription to Claude. Has anyone else noticed a drastic decrease in quality with coding prompts in addition to system reliability issues?
March 25, 2025 at 7:59 PM
Another earthquake! This is getting to be a bit much. All of them have been near Conejo Valley.
March 17, 2025 at 3:18 AM
What baffles me about statistics education is lack of discussion on non-ML importance to computer science:

(1) use for indexing where the keys follow a distribution: arxiv.org/abs/1712.01208
(2) use in evaluating cost of query plans
(3) probabilistic data structures
The Case for Learned Index Structures
Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array, a Hash-Index as a model to map a key to a position of a record within an unsor...
arxiv.org
March 15, 2025 at 9:35 PM
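To make point (1) concrete, here is a rough sketch of the "index as a model" idea: fit a straight line from key to position in a sorted array, record the worst prediction error, and then search only inside that error window. This is a toy single-model version under my own assumptions (the class name `LinearLearnedIndex` and the least-squares fit are illustrative); it is not the architecture from the linked paper.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: a linear model predicts a key's position in a sorted list."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        if n == 0:
            raise ValueError("need at least one key")
        # Ordinary least squares fit of position ~ slope * key + intercept.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error over the stored keys bounds the search window.
        self.max_err = max(abs(i - self._predict(k)) for i, k in enumerate(self.keys))

    def _predict(self, key):
        return round(self.slope * key + self.intercept)

    def lookup(self, key):
        """Return a position of key in the sorted array, or None if it is absent."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        # Binary search restricted to the window the model points at.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None
```

The closer the keys follow the fitted distribution, the smaller `max_err` gets and the less searching is left to do, which is where the statistics comes in.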
Tonight I took all of my slides and passed them to NotebookLM. The podcast adds more context, some analogies and other examples. With the exception of some minor hallucination, and the host making strange noises, this is mind-blowing. I'm using this for my classes moving forward.
March 15, 2025 at 9:10 AM
THAT was a big earthquake. Damn.
March 9, 2025 at 8:04 PM
Why do I even pay for Claude? It is horribly rate-limited, expensive, and offline more than it is online.
March 6, 2025 at 9:56 AM
Hot take? Tableau is hot garbage.

Believe it or not, today was my first time ever using Tableau as a data scientist. And after today, it is also my last time.
March 3, 2025 at 8:37 AM
MongoDB has the most bizarre authentication model.
March 2, 2025 at 7:42 AM
I am going to have to switch away from Neo4j to another graph database as my choice to teach the graph model. It's too much of a money grab for simple things like read-only access for a user, and it's a pain to set up HTTPS and a reverse proxy. Any suggestions for worthy alternatives?
March 2, 2025 at 2:42 AM
This quarter my data management students are constructing an ETL pipeline as their final project. We are hamstrung by AWS' free tier and so we are using #DuckDB as our serving layer, rather than Snowflake or Redshift, to power a Tableau dashboard. I enjoy it more and more each time.
February 26, 2025 at 9:21 PM
Ultimate nirvana when teaching. This happened for the first time since maybe week 1. Average response time has gone up over the years, but still pretty good if there were an SLA...
February 19, 2025 at 10:52 PM
Brought to you by the parametric equations,
x = 16 (sin t)^3
y = 13 cos t - 5 cos 2t - 2 cos 3t - cos 4t
February 14, 2025 at 10:27 PM
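A small sketch that samples the curve, with the angle parameter named t here to keep it separate from the x coordinate. The function name `heart_points` and the default sample count are illustrative assumptions.

```python
import math


def heart_points(n: int = 200) -> list[tuple[float, float]]:
    """Sample n points of the heart curve for t evenly spaced in [0, 2*pi)."""
    points = []
    for i in range(n):
        t = 2 * math.pi * i / n
        x = 16 * math.sin(t) ** 3
        y = (13 * math.cos(t) - 5 * math.cos(2 * t)
             - 2 * math.cos(3 * t) - math.cos(4 * t))
        points.append((x, y))
    return points
```

Passing the resulting x and y lists to any plotting tool should trace out the heart, with the notch at the top (t = 0) and the point at the bottom (t = pi).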
This tweet (I removed the rest) came up for discussion today in my databases class. What are some other reasons why SSNs might appear duplicated? A wrong join condition, looking at a table where SSN is a foreign key, SSN being part of a composite key (before 2011, SSNs were in rare cases not unique), etc.
February 12, 2025 at 4:24 AM
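Two of those reasons are easy to reproduce in an in-memory SQLite database. The sketch below uses made-up tables (`person`, `claim`) and obviously fake SSNs, all of which are my own assumptions for illustration: SSN is unique in `person`, repeats in `claim` where it is a foreign key, and fans out when the join is written on `zip` instead of `ssn`.

```python
import sqlite3
from collections import Counter


def build_demo() -> sqlite3.Connection:
    """Create a tiny hypothetical schema where SSN is unique only in person."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (ssn TEXT PRIMARY KEY, name TEXT, zip TEXT);
        CREATE TABLE claim  (claim_id INTEGER PRIMARY KEY,
                             ssn TEXT REFERENCES person(ssn), zip TEXT);
        INSERT INTO person VALUES ('000-00-0001', 'Ann', '90024'),
                                  ('000-00-0002', 'Bob', '90024');
        INSERT INTO claim  VALUES (1, '000-00-0001', '90024'),
                                  (2, '000-00-0001', '90024'),
                                  (3, '000-00-0002', '90024');
    """)
    return conn


def ssn_counts(conn: sqlite3.Connection, sql: str) -> dict:
    """Run a query whose first column is an SSN and count rows per SSN."""
    return dict(Counter(row[0] for row in conn.execute(sql)))


conn = build_demo()

# The entity table itself: every SSN appears exactly once.
in_person = ssn_counts(conn, "SELECT ssn FROM person")

# A table where SSN is a foreign key: repeats are expected, not an error.
in_claim = ssn_counts(conn, "SELECT ssn FROM claim")

# Wrong join condition (zip instead of ssn): every person matches every claim.
wrong_join = ssn_counts(
    conn, "SELECT p.ssn FROM person p JOIN claim c ON p.zip = c.zip"
)
```

In all three queries the underlying people are the same two rows; only the shape of the query changes how many times each SSN shows up.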
Never use bare metal Mac instances on AWS EC2. They make you reserve a dedicated machine in order to run an instance. Then, you must wait 24 hours before releasing the machine. Why? Because most people forget to release it and then get billed $15/day. That's intentional.
February 5, 2025 at 8:38 AM
I am truly amazed at how much ChatGPT and Claude hallucinate with DuckDB-related questions. They have created their own entire reality about the syntax.
February 3, 2025 at 7:52 AM
I've only dipped my toes into #DuckDB but the output modes alone are amazing!
February 2, 2025 at 8:13 AM
I feel like this could be a big year for me. There is so much I want to do but am being held back by my pattern of life. I just need to find the proper channel to make it happen.
January 25, 2025 at 7:59 AM
Reposted by Ryan Rosario
#rstats and #datascience people from Twitter (or new people here) please follow and I will reciprocate! 📉📊📈😬👍
April 29, 2023 at 5:56 PM
This is quite an interesting move. With Perplexity taking on search, and Gen Z being said to use TikTok as their main method of search, could this be a brilliant move?

www.cnbc.com/2025/01/18/p...
Perplexity AI makes a bid to merge with TikTok U.S.
Perplexity AI submitted a bid on Saturday to TikTok parent ByteDance, proposing that Perplexity merge with TikTok U.S., CNBC has learned.
www.cnbc.com
January 18, 2025 at 11:56 PM
Lithium batteries are going to kill us all.
January 17, 2025 at 6:45 AM
While the circumstances differ, the feeling is the same. When COVID started I felt disillusioned and tired. I feel the same now with all of the destruction from the fires, and the power constantly going on and off. And I am staying inside because signals are out on this highway.
January 15, 2025 at 11:54 PM
Last holiday party of the season. That moment when you put some party music on and the first song is Nickelback. Blasphemy!
January 12, 2025 at 3:21 AM