Lightnews — Scholar-powered news

Kristian Muñiz

@krismuniz.com

140 followers 490 following 110 posts

Software developer, designer, and open source enthusiast. 🇵🇷

Technologist turned software engineer by necessity. Building software products

https://krismuniz.com

Posts Replies Media Videos

Kristian Muñiz

@krismuniz.com

But "The P-LLM cannot write a plan based on data it can’t read" is a substantial impact to the utility of LLMs and central to the prompt injection challenge, no?

If the P-LLM is detached from the data it needs to plan from aren't we back to using an LLM for generating a program that can run LLM(s)?

April 12, 2025 at 4:25 AM

Kristian Muñiz

@krismuniz.com

Absurd decision making, disconnected from reality.

I've followed you for years and know that Google was extremely lucky to have you, any company would be (perhaps your own?).

Regardless of what you do next, I'm sure that as a community we'll continue to follow your work. Please take care!

April 12, 2025 at 2:30 AM

Kristian Muñiz

@krismuniz.com

You should make a business out of that, sounds lucrative 💰

March 30, 2025 at 7:29 PM

Kristian Muñiz

@krismuniz.com

Metaphors are fun though

March 30, 2025 at 5:52 PM

Kristian Muñiz

@krismuniz.com

I found a modern version of this dropoverapp.com

Dropover - Easier Drag and Drop on your Mac.

Dropover is a drag and drop utility that makes it simple to collect, organize, share, and process files with floating shelves.

dropoverapp.com

March 30, 2025 at 5:15 PM

Kristian Muñiz

@krismuniz.com

Yeah drag-and-drop with trackpads can be painful

March 30, 2025 at 5:10 PM

Kristian Muñiz

@krismuniz.com

hahahah I *just* posted a half-baked idea that resembles this in this very thread. Should've read the full conversation

March 30, 2025 at 5:09 PM

Kristian Muñiz

@krismuniz.com

I would argue that there's no right way to do this interaction. It feels unnatural and counterintuitive. I wish I could have a "shelf" I could put dragged items on temporarily while I scroll 😆

March 30, 2025 at 5:08 PM

Kristian Muñiz

@krismuniz.com

Brilliant. Yes!

March 29, 2025 at 4:44 AM

Kristian Muñiz

@krismuniz.com

In your defense, you can't land a pilot either

March 29, 2025 at 1:25 AM

Kristian Muñiz

@krismuniz.com

Ah, hint from Greg Brockman himself. Seems like the "powerful decoder" here is a diffusion model.

March 28, 2025 at 2:05 AM

Kristian Muñiz

@krismuniz.com

Yeah, I read the System Card. It can still be autoregressive sampling. From my observations it still makes mistakes that a diffusion model would make, like omitting details, failing to count, producing garbled text, etc.

March 28, 2025 at 1:50 AM

Kristian Muñiz

@krismuniz.com

Increasingly, large multimodal models are becoming more and more powerful and one of the first ways we can optimize them is by simplifying their I/O and writing powerful, thick encoders/decoders.

March 28, 2025 at 1:09 AM

Kristian Muñiz

@krismuniz.com

*of sampling the next token.

Had to cut some characters.

March 26, 2025 at 6:28 AM

Kristian Muñiz

@krismuniz.com

And it's not structural or semantic consistency, but some information gets lost in the process. Perhaps it's safety mechanisms preventing certain behaviors like using people's likeness.

March 26, 2025 at 6:25 AM

Kristian Muñiz

@krismuniz.com

Could that be a plausible solution? Using GPT-4o to generate initial image representations and passing these representations to a diffusion model component that specializes in creating high-quality, high-resolution visual outputs?

March 26, 2025 at 4:18 AM

Kristian Muñiz

@krismuniz.com

What I know so far, autoregressive models are more expensive to run than diffusion models – of course slower too, latency correlated with cost.

I'm still surprised that resolution is so good. It's almost too good. Could it be a hybrid Transformer + Diffusion approach?

March 26, 2025 at 4:12 AM

Kristian Muñiz

@krismuniz.com

Wow, this is just so much better than what's out there, especially for prompt adherence. Aesthetically, I'm seeing a bit of a bias, but it could very well be deliberate.

March 25, 2025 at 9:52 PM

Kristian Muñiz

@krismuniz.com

Goddammit 🤦🏻‍♂️ right, that's the whole point of this update

March 25, 2025 at 9:47 PM

Kristian Muñiz

@krismuniz.com

By image output I mean sampling tokens that get decoded into rasterised bitmaps. There's some vectorial quality to the generated images.

March 25, 2025 at 9:42 PM

Kristian Muñiz

@krismuniz.com

I have a feeling, completely unproven, that this is more than just image output. The infographics are so crisp, it feels like there's some sort of very powerful generative layout engine powering this. Either that or I completely had the wrong intuition about diffusion models.

March 25, 2025 at 9:32 PM

Kristian Muñiz

@krismuniz.com

lmao

March 22, 2025 at 3:57 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news