#text-as-data
Buck fucking wild that 8 Democrats voted for this.
November 10, 2025 at 10:05 PM
A new California law could change how all Americans control their data when using the internet, according to experts. Here’s what it does:

1/2
November 7, 2025 at 10:35 PM
Interestingly, younger Americans are significantly more likely to dislike data centers than older ones

and dislike is bipartisan

This from a Heatmap story (paywalled): heatmap.news/energy/data-...
November 10, 2025 at 10:16 PM
US Weekly COVID update: Nov 10, 2025

🔸1 in 162 Actively Infectious
🔸301,000 New Daily Infections
🔸1,990,000 Infections In The Past Week
🔸199,000,000 Infections in 2025
🔸100,000 to 400,000 Weekly Long COVID Cases
🔸600 to 900 Weekly Deaths

Source: pmc19.com/data/
November 9, 2025 at 11:46 PM
David Fincher's THE SOCIAL NETWORK, and Trent Reznor / Atticus Ross's Oscar-winning score, turned 15 years old this fall. Here's some of my artwork for the album, created by manually corrupting images from the film by editing their raw hex data in a text editor, damaging them from the inside. 1/
November 7, 2025 at 8:17 PM
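A rough programmatic analogue of the hand-editing described in the post above, offered only as a sketch: the artwork was made manually in a text editor, so everything here (the function name `corrupt_bytes`, the `skip_header` offset, the number of edits) is an assumption for illustration, not the artist's process. The idea is to leave the start of the file alone so it still opens, then overwrite a few bytes further in.

```python
import random


def corrupt_bytes(data: bytes, n_edits: int = 20,
                  skip_header: int = 1024, seed: int = 0) -> bytes:
    """Overwrite n_edits randomly chosen bytes located after the first skip_header bytes.

    skip_header is a hypothetical safety margin: damaging the file header
    usually makes an image unreadable rather than visibly glitched.
    """
    rng = random.Random(seed)          # seeded so a given glitch is reproducible
    buf = bytearray(data)
    if len(buf) <= skip_header:
        return bytes(buf)              # nothing safe to edit; return unchanged
    for _ in range(n_edits):
        pos = rng.randrange(skip_header, len(buf))
        buf[pos] = rng.randrange(256)  # replace with an arbitrary byte value
    return bytes(buf)
```

To try it, read an image with `open(path, "rb").read()`, pass the bytes through `corrupt_bytes`, and write the result to a new file; which formats survive this kind of damage varies.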
Bluesky developer blog on ‘group-private data’ (ie, private accounts/circles/friendslock):

docs.bsky.app/blog
November 7, 2025 at 5:54 PM
🚀 SynthTextEval, our open-source toolkit for generating and evaluating synthetic text data for high-stakes domains, will be featured at EMNLP 2025 as a system demonstration!

GitHub: github.com/kr-ramesh/sy...
Paper 📝: aclanthology.org/2025.emnlp-d...

#EMNLP2025 #EMNLP #SyntheticData
November 7, 2025 at 12:53 AM
Call for Papers for Current26 in Bengaluru and London is now open!

Share your talks on Data Streaming, Kafka, the future of AI in production, etc. with the world's most innovative data community.

Deadline December 22nd 2025

Bengaluru ➡️ https://cnfl.io/49L8V2J
London ➡️ https://cnfl.io/47IBj2Y
November 10, 2025 at 8:00 AM
Can an LLM create a paper on LLM-usage by newspapers as analyzed by @schenior.bsky.social and myself here on BlueSky based on our BlueSky threads and the YouTube video transcripts?

Yes, yes it can. In 15 minutes. Let's submit it and see if it gets accepted.
November 7, 2025 at 7:40 PM
Our Fall Free Store is coming up fast! Shop for clothes, shoes, accessories, menstrual products, and diapers completely free, Sat., Nov 22, 10AM-2PM, at the playground at East Somerville Community School.

We also need volunteers! Sign up link in bio 🥰

Español/Português/Kreyòl Ayisyen 🧵⬇️
November 10, 2025 at 9:16 PM
So what does the DataRater learn? It automatically identifies and down-weights data that aligns with human intuitions of low quality, such as incorrect text encodings, OCR errors, and irrelevant content.
November 6, 2025 at 11:29 AM
Adding the vision checks meant that the old "store in texture" version needed to be upgraded.

The data itself wasn't bad, but storing it on disk took a couple of versions. Best of all, it compresses into basically nothing!

#gamedev #indiegame #unity3d #solodev
November 8, 2025 at 2:22 AM
and, the reason she thinks it works, is, in the end, it's just data. if that data is transmitted audibly, that's great (and hot), but, if that data is transmitted through text, as long as she's able to process it, it works just as well
don't have that problem as much with auditory hypno
November 9, 2025 at 9:36 PM
alt text of review continued
November 9, 2025 at 4:37 AM
Wrestling with unstructured data and need to query it in a structured form? Discover how to bring historical texts to life by using LLMs to structure the text for storing and querying in SurrealDB as knowledge graphs. 👉 sdb.li/4nuv74M
November 5, 2025 at 2:02 PM
N-gram novelty is widely used as a measure of creativity and generalization. But if LLMs produce highly n-gram novel expressions that don’t make sense or sound awkward, should they still be called creative? In a new paper, we investigate how n-gram novelty relates to creativity.
November 4, 2025 at 3:08 PM
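For readers unfamiliar with the measure named in the post above, here is a minimal sketch of one common way to compute n-gram novelty: the fraction of a text's n-grams that never occur in a reference corpus. The whitespace tokenization, the function names, and the default `n=3` are assumptions for illustration; the paper's actual definition may differ.

```python
from typing import Iterable, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> List[NGram]:
    """Return every contiguous run of n tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_novelty(generated: str, reference_corpus: Iterable[str], n: int = 3) -> float:
    """Fraction of n-grams in `generated` that are absent from the reference corpus."""
    seen: Set[NGram] = set()
    for doc in reference_corpus:
        seen.update(ngrams(doc.lower().split(), n))
    candidate = ngrams(generated.lower().split(), n)
    if not candidate:
        return 0.0                     # text shorter than n tokens
    novel = sum(1 for gram in candidate if gram not in seen)
    return novel / len(candidate)
```

With the reference corpus `["the cat sat on the mat"]`, the text "the cat sat on the sofa" scores 0.25, while an unrelated string such as "colorless green ideas sleep furiously" scores 1.0. That second case is the post's concern: a maximal novelty score says nothing about whether the text makes sense.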
if not Ponzi, why Ponzi-shaped?

www.theatlantic.com/technology/2...
November 3, 2025 at 8:14 PM
guardian (UK case):

The judge ruled: “An AI model such as Stable Diffusion which does not store or reproduce any copyright works (and has never done so) is not an ‘infringing copy’.”

www.theguardian.com/media/2025/n...
November 11, 2025 at 2:58 AM
Joe Armstrong spoke very highly of tiddlywiki:

tiddlywiki.com

You can run it as a local file, or as an application.

It serves a very similar niche to OneNote, and I prefer it.

You can run it as a local file, an application, or, if you really want to, a cloud thing.
November 6, 2025 at 8:24 PM
This is what you will see first about Alli Jackson when you hit her website.

This is not someone talking about public-private partnerships and paying homage to small business. She is prioritizing immigrants, unionization, the environment.
November 5, 2025 at 5:44 PM
@mattseybold.bsky.social's proposal to (among other things) reestablish faculty and students' control over their data & work here is...brilliant, actually?

(Tho it would strip upper admin of their power to decide WHO gets to buy that data...😢)

theamericanvandal.substack.com/p/mamdani-wi...
November 5, 2025 at 9:11 PM
I had SO MUCH FUN making the data science trivia game for positconf Virtual Day this yr, & all I've wanted since then is to do it again! But it was a big lift to make it happen. What if I did daily trivia instead? I could do it as a poll on the data science community discord! #databs #python #rstats
November 4, 2025 at 2:55 PM
US Weekly COVID update: Nov 3, 2025

🔸1 in 209 Actively Infectious
🔸234,000 New Daily Infections
🔸1,680,000 Infections In The Past Week
🔸196,000,000 Infections in 2025
🔸84,000 to 340,000 Weekly Long COVID Cases
🔸500 to 800 Weekly Deaths

Source: pmc19.com/data/
November 3, 2025 at 11:52 PM
This Cell article also deals specifically with the very paper whose method Mr. Guenzel builds on.

I would have expected a German like Mr. Guenzel to show a bit more sensitivity in this regard. But apparently that gets driven out of you in economics programs.
November 7, 2025 at 8:57 AM
yes we do recover them 😅 We remain coastal, so I think the mobile network would be enough (another Q is position accuracy). My doubt is also whether cheap luggage tags give the position as text, or only provide a map, which would be of no use since we want the students to work with the data afterwards...
November 7, 2025 at 12:10 PM