Also cooking (Pâté en croûte maker) and slowly learning guitar.
Alth.fr
@althcuisine on Instagram
@AlthCuisine on YouTube
FR/EN
I saw the same ads on Instagram, and given the price I admit I was a bit put off… I thought it was a scam
While there is a point in fixing the generated tokens, we do squash an enormous amount of information by actually looking at whether the cat is dead or alive.
AFAIK, the issue with diff is the fixed context size tho
It's currently generic enough to use any generation length in the GRPO output generation step, but I guess it would be much more efficient to generate only a context-size chunk and use the fact that you have the full logits available...
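For readers unfamiliar with the GRPO step mentioned here, a minimal sketch of the two pieces being discussed: keeping only the last context-size tokens when generating, and scoring a group of sampled completions relative to each other. The actual code isn't shown in this thread, so the function names, the `eps` term, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def crop_context(tokens, context_size):
    """Keep only the last `context_size` tokens (hypothetical helper).

    A fixed-context model can still generate sequences of any length
    if each step only conditions on the most recent window.
    """
    return tokens[-context_size:]


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for the same prompt.

    Each completion's advantage is its reward minus the group mean,
    divided by the group standard deviation (population std assumed here).
    `eps` avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four completions sampled for one prompt, with toy rewards.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
    # A 10-token history cropped to a context window of 4 tokens.
    print(crop_context(list(range(10)), 4))
```

Completions that beat the group average get a positive advantage and the rest a negative one, so no separate value model is needed for the baseline.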
I hope it's a reasonable implementation...
Tokenizer and Transformer models are very naive, based on Karpathy's transformer from scratch video. Data is also based on Karpathy's video.
(Left is base transformer, right is post GRPO)
I also liked this passage: "These values are 10^8 times lower than levels authorized by EU (55) (3.10−3 mSv day−1)"
This conclusion comes from several types of combined analyses (geochemistry, grain-size analysis, clay mineralogy, radionuclide activities and their isotopic signature, air-mass back-trajectories...)
Source: @cnrs.bsky.social INSU: www.insu.cnrs.fr/fr/cnrsinfo/...
And meanwhile, obviously, the context changes, competitors move ahead, etc…
It has been known for quite some time that training data quality is one of the most important factors when working with supervised algorithms, even though real-world data might be noisy.
Isn't it the same, just in the RL environment?
In my view, it's more a manifestation of the economic incentive: if you are first, you don't disclose, to keep your advantage. If you are not, then open sourcing can hurt the top player.