Building attention-killer AI (RWKV) from scratch @ http://wiki.rwkv.com
Also built uilicious & GPU.js (http://gpu.rocks)
the dataset space
We plan to do more training on this new line of QRWKV and LLaMA-RWKV models, over larger context lengths, so that they can be true transformer killers
If you're @ NeurIPS, you can find me with an RWKV7 Goose
We are also scheduled to do conversion runs for 32B and 70B class models
x.com/BlinkDL_AI/s...
With the move to inference-time thinking (o1-style reasoning, chain-of-thought, etc.), there is an increasing need for scalable inference over larger context lengths
The quadratic inference cost scaling of transformer models is ill-suited for such long contexts
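
A rough way to see the scaling argument is a toy cost model. The sketch below is an illustration under simplifying assumptions, not a benchmark of RWKV or of any transformer: it counts one "step" per past token attended to during generation (attention with a KV cache) versus one fixed-size state update per token (a linear recurrent model), and ignores model width, constant factors, and hardware effects. The function names are hypothetical.

```python
def quadratic_attention_steps(n_tokens: int) -> int:
    """Toy count: generating token t attends to all t tokens so far,
    so the total is 1 + 2 + ... + n = n * (n + 1) / 2."""
    return n_tokens * (n_tokens + 1) // 2


def linear_state_steps(n_tokens: int) -> int:
    """Toy count: one fixed-size state update per generated token."""
    return n_tokens


def cost_ratio(n_tokens: int) -> float:
    """How many times more toy steps the quadratic model needs."""
    return quadratic_attention_steps(n_tokens) / linear_state_steps(n_tokens)


if __name__ == "__main__":
    # The gap widens as the context grows, which is the point of the post above.
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} tokens: ratio ~ {cost_ratio(n):,.1f}x")
```

In this toy model the ratio works out to (n + 1) / 2, so it grows in proportion to context length. Real-world gaps depend on many factors the sketch leaves out.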
We release QRWKV6-32B-Instruct preview, a model converted from Qwen-32B instruct, trained for several hours on 2 MI300 nodes.
Surpassing all previously known open linear models (StateSpace, Hybrid, etc.)
Unlocking 1000x+ lower inference cost
Matching transformer-level performance despite the lack of "Quadratic Attention", using RWKV Attention instead
Proving Attention is **not** all you need
Poutine and beer at a bar
Discord Quebec gang: Toronto Poutine ain’t real Poutine 🤣
(Will be back in SF tomorrow)
Me as South East Asian: oooo… snow ☃️
My Canadian friends: that’s barely any snow ❄️
The true Canadian experience needs to have at least knee high snow I guess 🤣
( the circle of internet life )