c0v1d.bsky.social
@c0v1d.bsky.social
Quant analyst. You'll find me in baseball or aviation threads usually. I have a PhD in computer science (for some reason)

🏳️‍🌈 rights are human rights.
LLMs are pretty terrible judges for many tasks because of their low fidelity. Take 100 pieces of text and ask an LLM to rate them from one to ten, and it will often only assign scores of 9, 8, 2, and 1, whereas human scores would follow a bell curve. See for yourself in Google Sheets w/ Gemini!
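The clustering claim is easy to check mechanically: tally which points of the scale the judge actually uses. A minimal sketch with made-up ratings (the score list is illustrative, not real data):

```python
from collections import Counter

# Hypothetical 1-10 ratings from an LLM judge (illustrative, not real data):
# clustered at the extremes instead of forming a bell curve.
llm_scores = [9, 9, 8, 9, 1, 2, 8, 9, 2, 8, 1, 9, 8, 2, 9, 8]

distribution = Counter(llm_scores)

# Which points of the 1-10 scale the judge actually used.
used = sorted(distribution)
print(used)            # [1, 2, 8, 9]
print(len(used) / 10)  # 0.4 -- fraction of the scale exercised
```

If the same tally on human ratings uses most of the scale while the LLM's collapses to a few extreme values, that's the low-fidelity effect in one number.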
November 21, 2025 at 5:25 PM
Nope your argument clearly resonates with a ton of people. Thanks for sharing!
November 20, 2025 at 4:39 AM
Fascinating! Forgive my ignorance, but could you help me understand the difference between generating text and writing text? I always thought that since the final product is the same, they are the same. Is it because the process of writing helps the writer learn?
November 20, 2025 at 4:38 AM
If it isn't, then we can refute their observations. But the setup looks pretty OK to me (as an LLM researcher, not a novelist). And across the board, we're finding out that LLMs are pretty ok at wild tasks. This is the crisis. While I hate reading AI slop, the evidence isn't agreeing with me.
November 19, 2025 at 8:10 PM
Specifically, how do you evaluate the writing quality of a language model? As a scientific experiment, their "parlor trick" is: ask a human and an AI to write something, have it reviewed by an expert, do this N times to establish statistical proof. Now, the question is: is their setup accurate?
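For what "statistical proof" means here, a minimal sketch: treat each blind expert comparison as a coin flip and run a two-sided sign test against the null of no preference. The numbers below are hypothetical, purely to show the shape of the argument:

```python
from math import comb

def sign_test_p(n, k):
    """Two-sided p-value that an expert prefers one source in k of n
    blind paired comparisons, under the null of no preference (p = 0.5)."""
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical: expert prefers the human text in 41 of 50 pairs -> tiny p,
# strong evidence of a real quality gap.
print(sign_test_p(50, 41))

# Hypothetical: 25 of 50 -> p = 1.0, no evidence either way.
print(sign_test_p(50, 25))
```

That's the whole epistemic backing: enough paired judgments N that the preference rate can't be chance.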
November 19, 2025 at 7:58 PM
I agree with all your points. However, just to provide some context for the "parlor tricks," computer science research requires an epistemic backing for any hypothesis: if your ML model categorizes images, you must collect 10M images and establish statistical proof. We have a crisis in LM research.
November 19, 2025 at 7:52 PM
Dumb question: how does trim work in a fly-by-wire ship? Is it never needed since you're commanding a delta change rather than physically manipulating the surfaces?
November 18, 2025 at 1:37 PM
Idiot who only flies gliders here. What's the difference between the C172, 182, 152, etc. in terms of handling characteristics? I'm too afraid to ask anyone I know offline.
November 9, 2025 at 2:15 AM
Caturday*
November 8, 2025 at 9:32 PM
Cute. One thing: plane wings generally have a bit of an upward tilt to help with stability (dihedral). It makes the drawings cuter tho -- the plane looks happier with its wings perked up rather than drooping.
November 8, 2025 at 9:17 PM
One of the issues with aviation museums is that they need a ton of space for each exhibit. Maybe they're planning on putting something else there with the comet?
November 8, 2025 at 9:08 PM
So, one of the articles says
> AI investment strives to minimize the necessary labour time of control labour

What is "control labor" here? This is a genuinely fascinating perspective.
November 8, 2025 at 9:04 PM
That's an interesting point, actually. The techbro argument is that this store wouldn't have entered the art-on-bag market in the first place... so them entering the market is a gain for artists. But the A.I. art is no doubt UGLIER, and it's morally inexcusable that the store made the world UGLIER.
November 8, 2025 at 8:54 PM
My first doodlings were definitely influenced by the art books I used to flip through as a child. Will kids now start drawing human doodlings of A.I. art?
November 8, 2025 at 8:37 PM
Nah, it makes it easier to enumerate data structures. If we have a list of length 10,
`for i in range(len(foos)): print(i)` should print 10 times. But if ranges were inclusive on both sides, it'd print 11 times, which is incorrect.
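A minimal sketch of the half-open convention (`foos` here is just a placeholder list):

```python
foos = list(range(10))  # a list of length 10

# Half-open range: indices 0..9, exactly len(foos) = 10 iterations.
half_open = [i for i in range(len(foos))]
print(len(half_open))  # 10

# A both-sides-inclusive range (0 through len) would run 11 times
# and end on an index that's out of bounds.
inclusive = [i for i in range(0, len(foos) + 1)]
print(len(inclusive))  # 11
print(inclusive[-1])   # 10 -- foos[10] would raise IndexError
```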
November 6, 2025 at 12:52 PM
LLMs are essentially stochastic parrots pretrained on large corpora of books and Reddit posts.

The optimist in me likes to believe that these people fell in love with humanity and human curiosity. :)
November 6, 2025 at 12:41 PM
The real problem isn't the heat management (well-ish understood), but rather the radiation hardening needed for robustness, which curtails processing power.
November 6, 2025 at 12:38 PM
Works great until the evolution factor cranks up and you're suddenly looking at behemoth spitters while still struggling to process a single belt of copper ore
November 6, 2025 at 12:20 PM
Also the greatest performance of the post-pitch-clock era.
November 2, 2025 at 1:03 PM
Yoshi, Sasaki, and Ohtani's camaraderie throughout this entire season has been a joy to watch. Extremely wholesome.
November 2, 2025 at 1:00 PM
I agree that blindly trusting LLM outputs is definitely harmful to society, but that really isn't the correct way to use the tool in the first place. Would a different "mode" of operation (e.g. verified coding, information retrieval + post-hoc verification, etc.) alleviate some of the concerns?
October 27, 2025 at 7:26 PM
Ok, after skimming through the linked paper, the argument revolves around the detrimental effects of using LLMs to produce UNVERIFIED text. What about other frameworks for using LLMs, i.e. if the LLM outputs are verified by humans? Seems like it'd be a fun and educational class exercise for students.
October 27, 2025 at 7:14 PM
Naylor won all our hearts.
October 22, 2025 at 4:12 PM