Hailey Collet
haileystorm.bsky.social
Hailey Collet
@haileystorm.bsky.social
Mother. Ex Controls Engineer. Software dev. AI enthusiast & tinkerer.

Please stand by... (Migrating from X??)
**Using o3 image understanding as a key piece in computer control isn't cost effective though; would want to improve smaller-model perf w/ that (and give o3 screenshots in certain situations).

One more caveat: humans require job training, early AGI will require some too
December 20, 2024 at 10:20 PM
You know what, you're right, and I'm sorry.

I know I've been annoyed by your stances, but this was obviously me being pretty dumb and unkind, and I'll delete my comment in a bit (make sure you don't miss this one by way of me orphaning it too soon)
December 10, 2024 at 6:37 AM
Ah, well, I just copy-pasted yours 😆 obviously the replacements for calculating Sigma will be an improvement anyway though.
December 8, 2024 at 6:53 PM
Definitely worth trying other sizes, but on my machine w/ torch 2.4 (ROCm, 7900 XTX), yep!
December 8, 2024 at 6:50 PM
Updated gist with Eugene's O1-pro solution (which is similar to, but not quite as fast as, my solution #2, the fastest for the tensor sizes I tested).
December 8, 2024 at 6:31 PM
I updated my gist to include your solution (the one visible in the shared chat): gist.github.com/HaileyStorm/...
Looks like it is an improvement but slightly beaten out (at least for my test tensor sizes) by one of the O1 solutions I got... with a lot more effort.
December 8, 2024 at 6:30 PM
Wow, this was a challenge! With some (OK, a painful hour of) guidance, I was able to get a couple of good solutions from O1 and QwQ, largely down to improving the calculation of Sigma. Here's a gist with the three solutions, test run times, etc. Roughly 2.9x faster :)
gist.github.com/HaileyStorm/...
December 8, 2024 at 6:24 PM
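The gist above compares run times of several implementations; its actual contents aren't reproduced here, so as a minimal sketch, the `sigma_baseline`/`sigma_fast` functions and data sizes below are hypothetical stand-ins for the "calculating Sigma" variants, showing only the shape of a best-of-N wall-clock comparison:

```python
# Illustrative timing harness (stdlib only); the real gist benchmarks
# torch tensor code, which isn't shown in these posts.
import math
import random
import timeit

random.seed(0)
data = [random.random() for _ in range(100_000)]

def sigma_baseline(xs):
    # Naive loop accumulation (hypothetical stand-in for the original calc).
    total = 0.0
    for x in xs:
        total += x * x
    return math.sqrt(total / len(xs))

def sigma_fast(xs):
    # Rewrite using builtins (hypothetical stand-in for an optimized calc).
    return math.sqrt(math.fsum(x * x for x in xs) / len(xs))

def bench(fn, repeats=5, number=10):
    # Best-of-N wall-clock timing, the usual way such comparisons are reported.
    return min(timeit.repeat(lambda: fn(data), repeat=repeats, number=number))

if __name__ == "__main__":
    # Sanity check: both variants must agree before comparing speed.
    assert abs(sigma_baseline(data) - sigma_fast(data)) < 1e-9
    for fn in (sigma_baseline, sigma_fast):
        print(f"{fn.__name__}: {bench(fn):.4f}s")
```

The same pattern applies to torch code (with `torch.cuda.synchronize()` or `torch.utils.benchmark.Timer` for GPU kernels), which would matter for the ROCm 7900 XTX runs mentioned below.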
Afraid I have to disagree. MMLU is a general-knowledge benchmark, for example, and disagrees with you, as do my personal vibes (Llama 3.1 8B > Mistral 7B in almost every way).
Factual knowledge ofc has a density limit, as does intelligence, but I don't agree we've reached either, esp back at Mistral 7B.
December 8, 2024 at 4:36 PM
You bet! Appreciate your videos :)
December 8, 2024 at 4:24 PM
I believe they've removed per message limits, so it's down to context length. Currently 32k tokens for Plus and 128k for Pro.
December 8, 2024 at 8:07 AM
I like Kyle Kabasares pretty well for physics & math. @academisfit.bsky.social
December 8, 2024 at 7:55 AM
DETH, lulz
December 8, 2024 at 3:27 AM
Sonnet is my go-to, my all-around. Especially for most coding problems.

O1-preview (and, from what I've seen so far, even more so full o1) handles certain challenging tasks Sonnet can't dream of solving.

I use it maybe 10% as much as Sonnet (but, 4o would be fine for 85% of what I do with Sonnet).
December 7, 2024 at 5:48 AM
Will be very interested to see how multimodal o1 handles these
December 2, 2024 at 3:40 AM
Of course, if something is *really* bothering me I talk to both, and if timely my therapist too (she's available for messages, but I largely stick to talking in person)
November 29, 2024 at 6:15 PM
It kinda depends. By default Claude, but cgpt for personal but more, er, technical things, like how something might be interpreted, and ofc there's advanced voice, which is nice for some things. Also cgpt for non-therapy-type medical stuff.

Claude+o1 for wheel work but that's all code.
November 29, 2024 at 6:13 PM
There are things I discuss with AI I don't discuss with my therapist 😆
November 29, 2024 at 6:01 PM
I *know* he's brilliant, but there's not a single person in the AI sphere that rubs me the wrong way more.
November 29, 2024 at 5:59 PM
I've verified it a little (music generation, expected token pattern error rate & output quality after context len increase during training)
November 26, 2024 at 10:48 PM
I meant wall clock to same loss, since you have to change your model config anyway
November 26, 2024 at 10:08 PM
It's definitely slower wall clock. But while important that's of course not the only metric :)
November 26, 2024 at 9:36 PM
Genuinely awesome
November 26, 2024 at 5:52 PM