🐘
pkydrm.bsky.social
research scientist @MosaicML x @Databricks re: rlhf, humans in the loop, and figuring out what it means to have a good model 🤖🧑‍🎨✨
Reposted by 🐘
What are your favorite recent papers on using LMs for annotation (especially in a loop with human annotators), synthetic data for task-specific prediction, active learning, and similar?

Looking for practical methods for settings where human annotations are costly.

A few examples in thread ↴
July 23, 2025 at 8:10 AM
July 21, 2025 at 2:20 PM
Reposted by 🐘
I am once again pitching my romantic comedy:

- two academics start dating
- discover they are each other's terrible reviewer
- hijinks ensue

Working title: Love is Double-Blind
June 18, 2025 at 10:55 AM
i wish i could shout this from the rooftops. relatedly, there's no need for robots to be limited by the human form.

similar/tangential thing came up in the 2010s with respect to self-driving: just because people only sense using their eyes doesn't mean cars have to only use cameras!
A sensible perspective on humanoids in manufacturing (TLDR: if you can make humanoids, you can probably make better, more manufacturing specific things)
blog.spec.tech/p/humanoid-r...
Humanoid Robots in Manufacturing
Or, there's a reason we don't pull cars with mechanical horses
blog.spec.tech
April 9, 2025 at 3:47 PM
Reposted by 🐘
The Wikimedia Foundation, which owns Wikipedia, says its bandwidth costs have gone up 50% since Jan 2024 — a rise they attribute to AI crawlers.

AI companies are killing the open web by stealing visitors from the sources of information and making them pay for the privilege
April 2, 2025 at 9:12 AM
we are living in an empirical world and we are empirical girls
A more technical white paper is coming but I learned lots during this process not least of which is that the vast vast majority of RLXF papers over the last couple years are useless. Many assumptions made esp at small scales are simply wrong at larger scales
March 25, 2025 at 8:39 PM
No labels, no problem! I am so excited for this release. We have been working on it for many months, and it's motivated by a common customer roadblock: insufficient labeled examples.
The hardest part about finetuning is that people don't have labeled data. Today, @databricks.bsky.social introduced TAO, a new finetuning method that only needs inputs, no labels necessary. Best of all, it actually beats supervised finetuning on labeled data. www.databricks.com/blog/tao-usi...
TAO: Using test-time compute to train efficient LLMs without labeled data
LIFT fine-tunes LLMs without labels using reinforcement learning, boosting performance on enterprise tasks.
www.databricks.com
March 25, 2025 at 8:39 PM
has anyone successfully gotten very involved with their local library system and, if so, how does one do so?

i know there are volunteer opportunities and it is my dream to one day organize a crafting circle, but i'm talking about how the library actually organizes / functions / prioritizes things!
January 22, 2025 at 8:42 PM
🧵 Super proud to finally share this work I led last quarter - the @databricks.bsky.social Domain Intelligence Benchmark Suite (DIBS)! TL;DR: Academic benchmarks ≠ real performance and domain intelligence > general capabilities for enterprise tasks. 1/3
December 19, 2024 at 4:25 PM
very demure, very mindful, very 2019-era mujoco humanoid learning to walk
here's a Sora generated video of gymnastics
December 12, 2024 at 2:00 PM
"technology built to address people's needs" is the north star.

side note: it would be amazing to see this attitude in the physical, embodied world as well. it's amazing to see how older adults in dense, walkable areas have such different lifestyles than those in car-centric suburbs.
Would love a focus on systems that help older people!!
December 12, 2024 at 1:33 PM
this is incredible research, and beautiful. would love to know more about what it's like to meaningfully interact with genie 2, or similar models, e.g. to modify the outputs of such a model in the service of a design vision.
Genie 2 can also turbocharge environment design for humans, making it possible to step in and play from concept art 🎨, such as the beautiful work below from one of our rockstar designers.
December 5, 2024 at 7:31 PM
i often talk about the importance of aligning both the magnitude AND direction of a workstream vector. 1/5
November 26, 2024 at 2:09 PM
i do not study this, but i did just finish reading the anxious generation and so i'm very grateful that there are so many people who do indeed study such important things!
A start on who to follow for the science on social media and adolescent mental health! Who else is here?

go.bsky.app/2PqckAy
November 22, 2024 at 12:51 AM
Reposted by 🐘
When you fail to parse your data that’s a jsonl
November 22, 2024 at 12:47 AM
😙🤌
November 19, 2024 at 4:51 PM
that's just a growth mindset!
Tired: imposter syndrome
Wired: embracing incompetence

#academicSky #shitacademicssay
November 18, 2024 at 12:23 AM
svg claude is giving ms paint in the best way
November 15, 2024 at 3:03 AM