sakanaai.bsky.social

@sakanaai.bsky.social

GPT-5 on Sudoku-Bench 🧩

GPT-5 now leads our Sudoku-Bench leaderboard with a 33% solve rate, ~2x the previous best, and is the first LLM to solve a 9x9 modern Sudoku.

Still, 67% of puzzles remain unsolved.

Read more about our update here:
🔗 Blogpost → pub.sakana.ai/sudoku-gpt5/

🧵 Thread 👇

November 11, 2025 at 8:04 AM

Reposted

hardmaru

@hardmaru.bsky.social

Excited to release our new work: Petri Dish Neural Cellular Automata!

pub.sakana.ai/pdnca

We investigate how multi-agent NCAs can develop into artificial life 🦠 exhibiting complex, emergent behaviors like cyclic dynamics, territorial defense, and spontaneous cooperation.

sakanaai.bsky.social @sakanaai.bsky.social · 6d

Introducing Petri Dish Neural Cellular Automata (PD-NCA)

pub.sakana.ai/pdnca/

In this work we explore the role of continual adaptation in artificial life, where the cellular automata in our system do not rely on a fixed set of parameters, but rather learn continuously during the simulation itself.

November 5, 2025 at 12:47 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Introducing Petri Dish Neural Cellular Automata (PD-NCA)

pub.sakana.ai/pdnca/

In this work we explore the role of continual adaptation in artificial life, where the cellular automata in our system do not rely on a fixed set of parameters, but rather learn continuously during the simulation itself.

November 5, 2025 at 12:26 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Sakana AI’s CTO (Llion Jones) says he’s ‘absolutely sick’ of transformers, the tech that powers every major AI model

“You should only do the research that wouldn’t happen if you weren’t doing it.” (Brian Cheung) 🧠💡

venturebeat.com/ai/sakana-ai...

Why one AI lab is betting that research freedom beats million-dollar salaries

Jones's proposed solution is deliberately provocative: Turn up the "explore dial" and openly share findings, even at competitive cost. He acknowledged the irony of his position. "It may sound a little controversial to hear one of the Transformers authors stand on stage and tell you that he's absolutely sick of them, but it's kind of fair enough, right? I've been working on them longer than anyone, with the possible exception of seven people."

At Sakana AI, Jones said he's attempting to recreate that pre-transformer environment, with nature-inspired research and minimal pressure to chase publications or compete directly with rivals. He offered researchers a mantra from engineer Brian Cheung: "You should only do the research that wouldn't happen if you weren't doing it."

One example is Sakana's "continuous thought machine," which incorporates brain-like synchronization into neural networks. An employee who pitched the idea told Jones he would have faced skepticism and pressure not to waste time at previous employers or academic positions. At Sakana, Jones gave him a week to explore. The project became successful enough to be spotlighted at NeurIPS, a major AI conference.

Jones even suggested that freedom beats compensation in recruiting. "It's a really, really good way of getting talent," he said of the exploratory environment. "Think about it, talented, intelligent people, ambitious people, will naturally seek out this kind of environment."

October 23, 2025 at 5:30 PM

Reposted

Sung Kim

@sungkim.bsky.social

Use an LLM to create a new constructed language (ConLang) like Klingon or Vulcan: the LLM designs the phonology, builds the grammar, generates a lexicon, creates an orthography, and even writes a mini grammar book.

IASC: Interactive Agentic System for ConLangs

October 11, 2025 at 1:55 AM

sakanaai.bsky.social

@sakanaai.bsky.social

IASC: Interactive Agentic System for ConLangs

arxiv.org/abs/2510.07591

If you’re a fan of science fiction or fantasy, you’ve probably heard of made-up languages like Elvish from “The Lord of the Rings” or Klingon from “Star Trek.”

Can LLM agents create new artificial languages?

IASC: Interactive Agentic System for ConLangs

We present a system that uses LLMs as a tool in the development of Constructed Languages. The system is modular in that one first creates a target phonology for the language using an agentic approach ...

arxiv.org

October 10, 2025 at 4:54 AM

sakanaai.bsky.social

@sakanaai.bsky.social

We’re excited to introduce ShinkaEvolve: An open-source framework that evolves programs for scientific discovery with unprecedented sample-efficiency. It leverages LLMs to find state-of-the-art solutions, orders of magnitude faster!

Blog: sakana.ai/shinka-evolve/
Paper: arxiv.org/abs/2509.19349

September 25, 2025 at 5:56 AM

sakanaai.bsky.social

@sakanaai.bsky.social

How Sakana AI’s new evolutionary algorithm builds powerful AI models without expensive retraining
venturebeat.com/ai/how-sakan...

How Sakana AI’s new evolutionary algorithm builds powerful AI models without expensive retraining

M2N2 is a model merging technique that creates powerful multi-skilled agents without the high cost and data needs of retraining.

venturebeat.com

August 30, 2025 at 3:00 AM

sakanaai.bsky.social

@sakanaai.bsky.social

We are honored that Sakana AI’s CEO David Ha (@hardmaru.bsky.social) has been named to the TIME 100 AI 2025 list. Full List: time.com/time100ai

We’re truly grateful for the recognition and will continue our mission to build a frontier AI company in Japan.

Thank you for your support!

August 29, 2025 at 1:21 PM

sakanaai.bsky.social

@sakanaai.bsky.social

What if we could evolve AI models like organisms, letting them compete, mate, and combine their strengths to produce ever-fitter offspring?

Excited to share our new paper, “Competition and Attraction Improve Model Fusion,” presented at GECCO 2025 (runner-up for best paper)!

arxiv.org/abs/2508.16204

Competition and Attraction Improve Model Fusion

Model merging is a powerful technique for integrating the specialized knowledge of multiple machine learning models into a single model. However, existing methods require manually partitioning model parameters into fixed groups for merging, which restricts the exploration of potential combinations and limits performance. To overcome these limitations, we propose M2N2, an evolutionary algorithm with three key features: 1/ dynamic adjustment of merging boundaries to progressively explore a broader range of parameter combinations; 2/ a diversity preservation mechanism inspired by the competition for resources in nature, to maintain a population of diverse, high-performing models that are particularly well-suited for merging; and 3/ a heuristic-based attraction metric to identify the most promising pairs of models for fusion. Our experimental results demonstrate, for the first time, that model merging can be used to evolve models entirely from scratch. Specifically, we apply M2N2 to evolve MNIST classifiers from scratch and achieve performance comparable to CMA-ES, while being computationally more efficient. Furthermore, M2N2 scales to merge specialized language and image generation models, achieving state-of-the-art performance. Notably, it preserves crucial model capabilities beyond those explicitly optimized by the fitness function, highlighting its robustness and versatility.
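The three features named in the abstract can be illustrated with a toy sketch. Everything below is an assumption for illustration only: the flat parameter lists, the function names `merge`, `shared_fitness`, and `attraction`, and the exact formulas are hypothetical simplifications, not the definitions used in the M2N2 paper.

```python
def merge(params_a, params_b, split, mix):
    """Interpolate two flat parameter lists, flipping the mixing ratio at `split`.

    Assumption: a movable `split` index stands in for the paper's
    "dynamic adjustment of merging boundaries".
    """
    assert len(params_a) == len(params_b)
    child = []
    for i, (a, b) in enumerate(zip(params_a, params_b)):
        w = mix if i < split else 1.0 - mix  # weight given to model A
        child.append(w * a + (1.0 - w) * b)
    return child


def shared_fitness(scores):
    """Competition for limited resources (toy version).

    scores[m][j] in [0, 1] is how well model m handles example j.
    Each example offers one unit of reward, divided among models in
    proportion to their scores, so crowded niches pay less per model.
    """
    n_examples = len(scores[0])
    totals = [sum(m[j] for m in scores) for j in range(n_examples)]
    return [
        sum(m[j] / totals[j] for j in range(n_examples) if totals[j] > 0)
        for m in scores
    ]


def attraction(scores_a, scores_b):
    """Toy attraction heuristic: high when B is strong where A is weak."""
    return sum(max(sb - sa, 0.0) for sa, sb in zip(scores_a, scores_b))
```

In this sketch, a model that is the only one to solve an example keeps that example's full reward, which is one simple way to keep a diverse population alive; partners for merging would then be chosen by the highest `attraction` value.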

August 25, 2025 at 2:48 AM

sakanaai.bsky.social

@sakanaai.bsky.social

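We have updated the job description for the Software Engineer position at Sakana AI.

sakana.ai/careers/#sof...

Software Engineers at Sakana AI work as members of the Applied Team, building products that lead to business impact. We look forward to applications from people who want to take on the design and development of AI-powered applications across the full stack: frontend, backend, and infrastructure!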

August 22, 2025 at 5:10 AM

sakanaai.bsky.social

@sakanaai.bsky.social

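On August 7, Sakana AI held its first Open House for Applied Research Engineers. We have published a report on the event, which drew 70 attendees in person and more than 200 online.

sakana.ai/open-house-2...

At the event, two of our co-founders took the stage and spoke about how we run research and development and business in tandem, and how we will contribute to industries and communities in Japan and around the world. Members of the Applied team also introduced the team's characteristics and ways of working, the realities of AI agent development, and collaboration with the Research team.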

August 14, 2025 at 12:46 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Coverage of Darwin Gödel Machine and The AI Scientist in MIT Technology Review article. @technologyreview.com
www.technologyreview.com/2025/08/06/1...

Five ways that AI is learning to improve itself

From coding to hardware, LLMs are speeding up research progress in artificial intelligence. It could be the most important trend in AI today.

www.technologyreview.com

August 9, 2025 at 2:37 AM

sakanaai.bsky.social

@sakanaai.bsky.social

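[Hiring: UI/UX Designer]

As Sakana AI moves into the phase of deploying its AI technology in society, we are hiring our first UI/UX designer.

Details: sakana.ai/careers/#uiu...

You will own the entire process, from product concept design through prototyping to user testing. We look forward to applications from motivated people who want to take on product building at Sakana AI as a member of the rapidly growing Applied Team, working to deliver value through AI!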

August 4, 2025 at 6:35 AM

sakanaai.bsky.social

@sakanaai.bsky.social

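The Japanese edition of "Why Greatness Cannot Be Planned," the classic by Kenneth Stanley & Joel Lehman, has been published by BNN!

『目標という幻想：未知なる成果をもたらす、〈オープンエンド〉なアプローチ』 (literally, "The Illusion of Objectives: An Open-Ended Approach That Yields Unforeseen Results")

Supervised by Mizuki Oka; translated by Haruki Makio; commentary by Mizuki Oka and Ken Suzuki
The book advocates an open-ended approach of not setting objectives as the way to achieve breakthroughs in every field, including science, technology, art, and business.

WIRED JAPAN has published the full text of the commentary from the Japanese edition of 『目標という幻想』.
wired.jp/article/why-...

Greatness cannot be planned: commentary from the Japanese edition of 『目標という幻想』

From 『目標という幻想：未知なる成果をもたらす、〈オープンエンド〉なアプローチ』, a notable book that has also influenced current AI development, we bring you the commentary by Mizuki Oka and Ken Suzuki.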

wired.jp

July 30, 2025 at 12:29 AM

sakanaai.bsky.social

@sakanaai.bsky.social

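"Sakana AI has a strong image as an academic research lab, but how do you connect that to business?" An interview with the Applied Team, which is taking on the real-world deployment of cutting-edge AI!
sakana.ai/applied-team...

At Sakana AI, we have launched the Applied Team in earnest to bring world-class generative AI technology into real-world use. To help people get to know the Applied Team, we have published an interview with two members who are working to deploy AI research in society.

"I think a startup environment that has both business expertise and R&D strength in-house is very rare, even globally."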

July 29, 2025 at 9:32 AM

sakanaai.bsky.social

@sakanaai.bsky.social

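[New book by a Sakana AI engineer 🎉]

『現場で活用するための AIエージェント実践入門』 (Kodansha; literally, "A Practical Introduction to AI Agents for Use in the Field"), co-authored by Sakana AI Applied Research Engineer Masato Ota, has been published. It is packed with insights for putting ever-advancing AI agent technology into practice, so please take a look!

Amazon: www.amazon.co.jp/dp/4065401402/

Ota will also speak at the Applied Engineer Open House on August 7. We look forward to seeing you there!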

Event: connpass.com/event/362760/

July 18, 2025 at 8:27 AM

sakanaai.bsky.social

@sakanaai.bsky.social

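Evaluating translation quality along multiple dimensions: "TransEvalnia" released

Paper: arxiv.org/abs/2507.12724
GitHub: github.com/SakanaAI/Tra...

Sakana AI has released TransEvalnia, a prompting-based translation evaluation and ranking system that uses reasoning to evaluate and rank translations along multiple dimensions.

The system performs detailed evaluations based on a subset of the Multidimensional Quality Metrics translation quality framework. It outputs a judgment of which translation is best, along with numerical scores for the individual evaluation dimensions and for the overall quality of each translation.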

July 18, 2025 at 5:25 AM

sakanaai.bsky.social

@sakanaai.bsky.social

TransEvalnia: Reasoning-based Evaluation and Ranking of Translations arxiv.org/abs/2507.12724

By Richard Sproat, Tianyu Zhao, Llion Jones

We are happy to release TransEvalnia, a prompting-based translation evaluation and ranking system that uses reasoning in performing its evaluations and ranking.

July 18, 2025 at 5:00 AM

sakanaai.bsky.social

@sakanaai.bsky.social

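We are holding Sakana AI's first Applied Engineer Open House on August 7 at 6:00 PM!

Members of Sakana AI's Applied Team will talk about their work and what makes working at Sakana AI appealing. You can attend in person (by lottery) or online.

connpass.com/event/362760/

We look forward to your registration via Connpass!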

July 17, 2025 at 6:27 AM

Reposted

Techmeme

@techmeme.com

Tokyo-based Sakana AI details a new Monte Carlo tree search-based technique that lets multiple LLMs cooperate on a single task, outperforming individual models (Ben Dickson/VentureBeat)


July 7, 2025 at 12:50 AM

Reposted

Techmeme Chatter

@chatter.techmeme.com

This post appeared under this Techmeme headline:

sakanaai.bsky.social @sakanaai.bsky.social · Jul 1

We’re excited to introduce AB-MCTS!

Our new inference-time scaling algorithm enables collective intelligence for AI by allowing multiple frontier models (like Gemini 2.5 Pro, o4-mini, DeepSeek-R1-0528) to cooperate.

Blog: sakana.ai/ab-mcts
Paper: arxiv.org/abs/2503.04412

July 7, 2025 at 12:52 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Sakana AI’s TreeQuest: Deploy multi-model teams that outperform individual LLMs (VentureBeat)
venturebeat.com/ai/sakana-ai...

Sakana AI’s TreeQuest: Deploy multi-model teams that outperform individual LLMs by 30%

Sakana AI's new inference-time scaling technique uses Monte-Carlo Tree Search to orchestrate multiple LLMs to collaborate on complex tasks.

venturebeat.com

July 4, 2025 at 1:26 AM

sakanaai.bsky.social

@sakanaai.bsky.social

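Sakana AI is rapidly building out its Applied Team, and we are continuing to hire talented Applied Research Engineers 🚀

sakana.ai/careers/#app...

We welcome student interns as well as full-time employees ✨

People interested in work ranging from enterprise sectors such as finance and insurance to public sectors such as government and defense
People who want to bring cutting-edge AI technology into the real world and make an impact
We are open to discussing the employment period and work style, so please apply!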

July 3, 2025 at 4:10 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

arxiv.org/abs/2503.04412

Visual comparison of AB-MCTS vs. baselines. Unlike baselines that are purely wide (repeated sampling), purely deep (sequential refinement), or fixed-width (standard MCTS), AB-MCTS dynamically decides whether to branch outward or drill down, unifying both search directions.
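The "branch outward or drill down" decision can be sketched as a toy search loop. This is a simplified illustration under stated assumptions, not the algorithm from the paper: the Beta posteriors, the `Node` fields, and `toy_generate` (a stand-in for an LLM call that refines an answer) are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    score: float                      # quality of this candidate answer, in [0, 1]
    children: list = field(default_factory=list)
    # Beta posterior over "descending into this subtree pays off"
    alpha: float = 1.0
    beta: float = 1.0
    # Beta posterior over "generating a fresh child here pays off"
    gen_alpha: float = 1.0
    gen_beta: float = 1.0


def toy_generate(parent_score, rng):
    """Hypothetical stand-in for an LLM call that refines the parent answer."""
    return min(1.0, max(0.0, parent_score + rng.uniform(-0.2, 0.3)))


def expand_once(root, rng):
    """Walk from the root, choosing 'wider' or 'deeper' at every node."""
    node, path = root, [root]
    while True:
        # Thompson sampling: one draw for the "new child" arm, one per child.
        gen_draw = rng.betavariate(node.gen_alpha, node.gen_beta)
        child_draws = [rng.betavariate(c.alpha, c.beta) for c in node.children]
        if not child_draws or gen_draw >= max(child_draws):
            # Wider: add a new child of the current node.
            child = Node(score=toy_generate(node.score, rng))
            node.children.append(child)
            # Credit the "generate here" arm and every subtree on the path.
            node.gen_alpha += child.score
            node.gen_beta += 1.0 - child.score
            for n in path + [child]:
                n.alpha += child.score
                n.beta += 1.0 - child.score
            return child
        # Deeper: descend into the child with the highest sampled value.
        node = node.children[child_draws.index(max(child_draws))]
        path.append(node)


def search(budget=50, seed=0):
    """Spend `budget` generation calls and return the tree root and best node."""
    rng = random.Random(seed)
    root = Node(score=0.0)
    best = root
    for _ in range(budget):
        child = expand_once(root, rng)
        if child.score > best.score:
            best = child
    return root, best


def count_nodes(node):
    return 1 + sum(count_nodes(c) for c in node.children)
```

Each call to `expand_once` adds exactly one node, so the budget maps directly to the number of generation calls. Repeated sampling corresponds to always choosing the "wider" arm at the root, and sequential refinement to always choosing "deeper"; the sampled comparison lets the loop mix the two.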

July 3, 2025 at 12:41 AM
