Building attention-killer AI (RWKV) from scratch @ http://wiki.rwkv.com
Also built uilicious & GPU.js (http://gpu.rocks)
Gives more context than I expected. 👍
the dataset space
We plan to do more training on this new line of QRWKV and LLaMA-RWKV models over larger context lengths, so that they can be true transformer killers
If you're @ NeurIPS, you can find me with an RWKV7 Goose
We are scheduled to do conversion runs for the 32B and 70B class models as well
x.com/BlinkDL_AI/s...
For that, refer to our other models going live today, including a 37B MoE model and an updated 7B
substack.recursal.ai/p/qrwkv6-and...
- take Qwen 32B
- freeze feedforward
- replace QKV attention layers with RWKV6 layers
- train RWKV layers
- unfreeze feedforward & train all layers (rough sketch below)
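A minimal PyTorch sketch of the recipe above, under stated assumptions: `RWKV6TimeMix` is a hypothetical stand-in for the real RWKV6 time-mix block, and the `model.model.layers` / `.self_attn` layout assumes a HuggingFace-style Qwen model. This is an illustration of the staging, not the actual conversion code.

```python
# Hedged sketch of the conversion recipe, assuming a HuggingFace-style Qwen
# model whose decoder layers expose `.self_attn` (attention) and `.mlp` (FFN).
# RWKV6TimeMix below is a hypothetical stand-in, NOT the real RWKV6 kernel.
import torch.nn as nn

class RWKV6TimeMix(nn.Module):
    """Placeholder with the same I/O shape as the attention it replaces."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)  # stand-in params

    def forward(self, hidden_states, **kwargs):
        # The real block mixes tokens recurrently (linear in sequence length);
        # this stub only keeps tensor shapes consistent.
        return self.proj(hidden_states)

def convert_to_rwkv(model: nn.Module, hidden_size: int) -> nn.Module:
    # Steps 1-2: start from the pretrained model, freeze everything
    # (in particular the feedforward/MLP weights).
    for p in model.parameters():
        p.requires_grad = False
    # Step 3: swap each QKV attention layer for an RWKV6 layer.
    for layer in model.model.layers:  # Qwen-style decoder layer list
        layer.self_attn = RWKV6TimeMix(hidden_size)
        # Step 4: only the freshly inserted RWKV layers train at first.
        for p in layer.self_attn.parameters():
            p.requires_grad = True
    return model

def unfreeze_all(model: nn.Module) -> nn.Module:
    # Step 5: unfreeze feedforward & train all layers end-to-end.
    for p in model.parameters():
        p.requires_grad = True
    return model
```

The point of the staged freeze/unfreeze is that the pretrained feedforward knowledge is preserved while the new recurrent layers first learn to stand in for attention, before the final full fine-tune.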
You can try the models on featherless AI today:
featherless.ai/models/recur...
Our 72B model is already "on its way", due before the end of the month.
Once we cross that line, RWKV will be at a scale that meets most enterprise needs
Release Details:
substack.recursal.ai/p/q-rwkv-6-3...
With the move to inference-time thinking (o1-style reasoning, chain-of-thought, etc.), there is an increasing need for scalable inference over larger context lengths
The quadratic inference cost scaling of transformer models is ill-suited for such long contexts
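To put rough numbers on that (my own back-of-the-envelope illustration, not from the thread): attention compares every token pair, O(n²), while an RWKV-style recurrent layer does one state update per token, O(n).

```python
# Assumed toy cost model, for intuition only.
def attention_ops(n: int) -> int:
    return n * n  # pairwise token interactions: quadratic in context length

def recurrent_ops(n: int) -> int:
    return n      # one state update per token: linear in context length

for n in (1_000, 10_000, 100_000):
    ratio = attention_ops(n) // recurrent_ops(n)
    print(f"context {n:>7,}: attention does {ratio:,}x the work")
```

At a 100k-token context the gap is five orders of magnitude, which is why linear-cost architectures matter for long chain-of-thought inference.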
substack.recursal.ai/p/q-rwkv-6-3...
Considering it's 3AM where I am, I will be napping first before adding more details and doing a more formal tweet thread
Then the posters themselves: so that's fair haha 😅
Poutine and beer at a bar
Discord Quebec gang: Toronto Poutine ain’t real Poutine 🤣
(Will be back in SF tomorrow)