Robert Nishihara
@robertnishihara.bsky.social
Co-founder of Anyscale. Co-creator of Ray. Previously PhD ML at Berkeley.
Speak at Ray Summit!
have you ever wanted to speak at an AI conference? :) 📢

The CfP for Ray Summit 2025 San Francisco is open, and we'd love to have you submit a talk on anything and everything ML + AI + distributed computing!

first time speakers are welcome 🫶

www.anyscale.com/blog/ray-sum...
Ray Summit 2025: Call for Proposals Closes on July 14th, 2025 | Anyscale
The Ray Summit 2025 CfP is now open for submissions. The premier AI conference will be held November 3-5 in San Francisco, and proposals are due July 14th, 2025.
www.anyscale.com
July 10, 2025 at 6:20 AM
DeepSeek released smallpond, a big data processing framework built on top of Ray.
- Smallpond targets high-performance data processing.
- It provides a high-level dataframe API (see the sketch below).
- It aims for petabyte-level scaling.

The challenges around training data prep only grow when you include multimodal data.
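To give a feel for the dataframe API, here's a minimal sketch adapted from smallpond's quickstart (the file paths and SQL are placeholder examples, so treat the details as illustrative rather than definitive):

```python
import smallpond

# Initialize the smallpond session (which backs onto Ray for distributed execution).
sp = smallpond.init()

# Load a Parquet dataset as a dataframe.
df = sp.read_parquet("prices.parquet")

# Hash-partition the data across workers by the "ticker" column.
df = sp.repartition(df, hash_by="ticker")

# Run a SQL transformation over each partition; {0} refers to the input dataframe.
df = sp.map("SELECT ticker, min(price), max(price) FROM {0} GROUP BY ticker")

# Write the result back out as Parquet.
df.write_parquet("output/")
```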
March 4, 2025 at 6:34 AM
Amazon published this only 4 months ago, but it feels like an eternity. It's one of the most impressive large-scale data processing migration efforts. Rare to see companies truly achieve order-of-magnitude cost improvements (while simultaneously increasing scale).

aws.amazon.com/blogs/openso...
Amazon’s Exabyte-Scale Migration from Apache Spark to Ray on Amazon EC2 | Amazon Web Services
Large-scale, distributed compute framework migrations are not for the faint of heart. There are backwards-compatibility constraints to maintain, performance expectations to meet, scalability limits to...
aws.amazon.com
November 22, 2024 at 3:01 AM
Talked with John Schulman last year about the ChatGPT backstory and scaling laws 😍 John co-founded OpenAI and created ChatGPT. www.youtube.com/watch?v=6Ctv...
ChatGPT Creator John Schulman on OpenAI | Ray Summit 2023
YouTube video by Anyscale
www.youtube.com
November 22, 2024 at 12:50 AM
A good overview of the fundamentals of extending context windows for LLMs (if you care about RAG, you probably care about context lengths).
Fine-tuning LLMs for longer context and better RAG systems
Based on the popular “Needle In a Haystack” benchmark and RAG, we share our process of creating a problem-specific fine-tuning dataset to extend the context of models to build better RAG systems.
www.anyscale.com
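The core recipe behind "Needle In a Haystack"-style fine-tuning data is simple to sketch: hide a known fact at a random depth inside long filler text, then train the model to answer a question about that fact. Below is a minimal, hypothetical illustration of that idea (not Anyscale's actual pipeline; the needle, question, and lengths are made up):

```python
import random

def make_niah_sample(haystack_paragraphs, needle, question, answer, target_len=8000):
    """Build one long-context training sample: filler text with a 'needle'
    fact inserted at a random depth, plus a question answered by the needle."""
    filler = []
    # Accumulate filler paragraphs until we reach the target context length (in characters).
    while sum(len(p) for p in filler) < target_len:
        filler.append(random.choice(haystack_paragraphs))
    # Insert the needle at a random depth in the context.
    filler.insert(random.randint(0, len(filler)), needle)
    context = "\n\n".join(filler)
    return {
        "prompt": f"{context}\n\nQuestion: {question}",
        "completion": answer,
    }

# Hypothetical example data.
sample = make_niah_sample(
    haystack_paragraphs=["Lorem ipsum dolor sit amet. " * 20],
    needle="The vault's access code is 7-3-9.",
    question="What is the vault's access code?",
    answer="7-3-9",
)
```

Varying both the total context length and the needle's depth across samples is what makes the resulting dataset exercise the full context window.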
February 26, 2024 at 6:35 AM