Pete Werner
pete.penumbra.software
Founder at Penumbra AI
Previously Head of AI at Leonardo, acquired by Canva 2024.
MSc Mathematical and Statistical modeling. AWS Certified Architect.
AI • Art • Music • Yoga • Cycling
Arguably, RL has learnt something more general, i.e. what to do when it encounters the plus operator, which can be applied or extrapolated to instances outside its training data.
October 2, 2025 at 2:18 AM
I'm not familiar with the source that sparked this, but take the case of SFT vs RL trying to learn the plus operator. SFT can conceivably rote-learn every a + b = c, while RL could learn a rule: if a and b are numeric, put the sum after the = symbol.
October 2, 2025 at 2:17 AM
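The contrast in the post above can be sketched with a toy example. This is only an illustration under loud assumptions: `RoteLearner` and `rule_learner` are hypothetical names, and neither is a real SFT or RL training loop. One stands in for memorizing every seen pair, the other for having picked up a general rule.

```python
# Toy illustration only: hypothetical stand-ins, not real SFT or RL training.

class RoteLearner:
    """Memorizes every (a, b) -> c it was shown; knows nothing else."""

    def __init__(self, examples):
        self.table = {(a, b): c for a, b, c in examples}

    def answer(self, a, b):
        # Returns None for any pair that was not in the training examples.
        return self.table.get((a, b))


def rule_learner(a, b):
    """Stand-in for a learned rule: if a and b are numeric, emit the sum."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b
    return None


# Training data covers single-digit operands only.
train = [(a, b, a + b) for a in range(10) for b in range(10)]
rote = RoteLearner(train)

seen = rote.answer(3, 4)           # pair was in the training data
unseen = rote.answer(123, 456)     # pair was never shown
extrapolated = rule_learner(123, 456)
```

Here the lookup table answers only pairs it has seen, while the rule extends to operands outside the training range, which is the kind of extrapolation the posts describe.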
Is it not about being able to validate a candidate response independently of any initial training data?
October 1, 2025 at 11:32 PM
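One way to picture validating a candidate response without reference to training data is a programmatic checker. The sketch below is an assumption for illustration: `verify_sum` and its prompt format (`"a + b ="`) are hypothetical, not taken from any particular RL framework.

```python
import re


def verify_sum(prompt: str, candidate: str) -> float:
    """Return 1.0 if candidate is the correct sum for a prompt like '12 + 30 =', else 0.0.

    The check recomputes the answer from the prompt itself, so it needs no
    stored examples or labels.
    """
    match = re.fullmatch(r"\s*(-?\d+)\s*\+\s*(-?\d+)\s*=\s*", prompt)
    if match is None:
        return 0.0  # prompt is not in the assumed format
    try:
        value = int(candidate.strip())
    except ValueError:
        return 0.0  # candidate is not an integer
    expected = int(match.group(1)) + int(match.group(2))
    return 1.0 if value == expected else 0.0
```

A checker like this can score any candidate for any operands, including ones never seen before, because correctness is computed rather than looked up.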
No, I hope you talk to someone if you think it might help, and that you're feeling better soon either way.
September 19, 2025 at 4:30 AM
It actually sounds like you may be depressed
September 19, 2025 at 2:13 AM
I block a lot of words, like prominent names etc. It’s just not a conversation I can meaningfully contribute to or engage with.
September 11, 2025 at 5:44 AM
I don’t mind Gemini but they never listen to their customers.
August 26, 2025 at 11:58 PM
Dang that looks good
July 20, 2025 at 10:36 PM
Meme guy dropping truths
July 18, 2025 at 10:39 PM
Tom Clancy
June 29, 2025 at 5:39 AM
Nice!
June 17, 2025 at 8:39 AM