#SimpleStories
Exploring SimpleStories: A New Milestone in Synthetic Text Datasets for Building Tiny Language Models

https://qian.cx/posts/43E55E29-A015-4616-8B84-AE757793E819
May 28, 2025 at 4:55 AM
arXiv:2504.09184v1 Announce Type: new
Abstract: We present SimpleStories, a large synthetic story dataset in simple language, consisting of 2 million stories each in English and Japanese. Our method employs parametrization of prompts with features at [1/3 of https://arxiv.org/abs/2504.09184v1]
April 15, 2025 at 5:58 AM
SimpleStories: The Art of Opening Up a Wonderful Life Through Simple Storytelling

https://qian.cx/posts/864E90CD-9EA3-4AAB-8E3D-20E25762BCA4
April 30, 2025 at 8:30 AM
SimpleStories: The Synthetic Text Dataset for Training Small Language Models

https://dasgeld.co/posts/4C17A432-581C-4950-868F-FF837A39DD79
May 28, 2025 at 4:54 AM
Lennart Finke, Thomas Dooms, Mat Allen, Juan Diego Rodriguez, Noa Nabeshima, Dan Braun
Parameterized Synthetic Text Generation with SimpleStories
https://arxiv.org/abs/2504.09184
April 15, 2025 at 11:42 AM
Lennart Finke, Thomas Dooms, Mat Allen, Juan Diego Rodriguez, Noa Nabeshima, Dan Braun: Parameterized Synthetic Text Generation with SimpleStories https://arxiv.org/abs/2504.09184 https://arxiv.org/pdf/2504.09184 https://arxiv.org/html/2504.09184
April 15, 2025 at 5:58 AM
✨ Attending the ICLR SynthData workshop today (April 27)? synthetic-data-iclr.github.io

Lennart Finke (finke.dev) will be presenting our paper "Parameterized Synthetic Text Generation with SimpleStories" at the 11:30-12:30PM poster session!
openreview.net/forum?id=JO8...

More details below!👇
SynthData-ICLR2025
Will Synthetic Data Finally Solve the Data Access Problem? Workshop at ICLR 2025.
synthetic-data-iclr.github.io
April 27, 2025 at 1:13 AM
😂 Me & my dog both turned when mom said “come here!” — coincidence? I think not! 🐶

#MadrasMatinee now streaming on #Tentkotta 🎬✨

👉 Subscribe ▶️ tentkotta.com
✅ Go Legal. Say NO to Piracy 🚫

#FamilyFeels #SimpleStories #MadrasMatineeOnTentkotta #FeelGoodCinema #TentkottaVibes
October 7, 2025 at 10:30 AM
SimpleStories: How to Create Engaging Content Easily and Effectively

https://kripta.biz/posts/7FA9415E-3848-458E-A308-6D628A9A43AB
April 30, 2025 at 8:31 AM
Introducing SimpleStories: A synthetic story dataset and model suite designed for understanding the internals and learning dynamics of LMs. It's an evolution of TinyStories that leverages better LMs for data generation and offers more data diversity.
🧵
May 21, 2025 at 8:42 PM
We present SimpleStories, a new synthetic dataset of children's stories with fine-grained labels (e.g., theme, style, persona) which retains the simplicity of TinyStories while being more diverse.

Explore the data here: fi-le.net/simplestories/

Very excited to see what people do with this!
Lennart Finke
Lennart's Homepage
finke.dev
April 27, 2025 at 1:13 AM
Sometimes all it takes is one person saying, “This worked for me.”

And that’s where video changes everything.
🎬 trustvid.co.uk

#TrustVid #ServiceProviders #SimpleStories #CustomerVoices
Home - Authentic, Results-Driven Video Production for Service Providers | TrustVid®
Trust-building video specialists for service providers. Boost your online marketing with our authentic, results-driven video production.
trustvid.co.uk
June 12, 2025 at 9:41 AM