Aranym
@aranym.bsky.social
Released my newest dataset: 50 million Bluesky posts!

I'll be going from this straight to 100M, so stay tuned!

huggingface.co/datasets/Ara...

#ai #ml #NLP

(P.S. My future datasets will respect Bluesky's AI consent signal. In the meantime, check the dataset for my own data removal contact.)
Aranym/50-million-bluesky-posts · Datasets at Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
December 19, 2024 at 3:11 PM
Releasing a dataset of 40 million Bluesky posts!

It was collected using the Firehose API, and I hope people do some cool ML with it.

It's anonymized, comes with a data removal mechanism, and includes text, language predictions, and image data.

#ai #ml #NLP

huggingface.co/datasets/Ara...
Aranym/40-million-bluesky-posts · Datasets at Hugging Face
December 17, 2024 at 3:25 PM