AgFunder VC Head of Eng, Ex-MSFT, Waterloo computer engineering
Sunshine Coast BC Canada
A few thoughts with @jakeabeck.bsky.social, @alexgoldie.bsky.social and @corneliusbraun.bsky.social
after Rich Sutton's fascinating lecture on his OaK architecture at UofA
@ox.ac.uk neuroscientist Prof Akam on RL in brains vs machines, dopamine as TD-error (or not), hippocampal replay & Dyna, model-free vs model-based myths, and why ML experts should consider neuroscience careers.
Stefano Albrecht shares the story behind his multi-agent RL textbook, and how DeepFlow AI turns these ideas into action with LLM-powered agents for business automation.
Recorded at
@rldmdublin2025.bsky.social
Professor Satinder Singh of Google DeepMind and U of Michigan is co-founder of @rldmdublin2025.bsky.social
Here he narrates the origin story of the Reinforcement Learning and Decision Making meeting (not conference).
Fees will increase by €100 on April 1st.
Looking forward to seeing you all in June! #RLDM2025
- Claire Bizon Monroc of Inria: WFCRL for Wind Farm Control
- Andrew Wagenmaker of @ucberkeleyofficial.bsky.social: Leveraging Simulation to Bridge Sim-to-Real Gap
- @harwiltz.bsky.social of @mila-quebec.bsky.social: Multivariate Distributional RL
(cont)
E64: NeurIPS 2024 – Posters and Hallways 2
- Jonathan Cook of Oxford: Cultural Accumulation in Reinforcement Learning
- Yifei Zhou of Berkeley AI Research: DigiRL for In-The-Wild Device-Control Agents
- Rory Young of U Glasgow: A Lyapunov Exponent Approach to RL Robustness
(cont'd)
Jiaheng Hu of UTexas on Unsupervised Skill Discovery for HRL
@skandermoalla.bsky.social of EPFL: Representation and Trust in PPO
Adil Zouitine of IRT Saint Exupery/Hugging Face: Time-Constrained Robust MDPs
More cool guests incoming...
How should RL handle non-episodic tasks, like Mars rovers or HPC scheduling? @abhisheknaik96.bsky.social shares insights from his PhD with Rich Sutton, plus how almost every discounted-reward algorithm can be improved by reward centering.
It is SOTA on every planning benchmark we tried.
In self-play, it goes 20 years between collisions.
30 min talks on a deeptech topic incl:
- Mat Sci
- LLMs, RL, Agents, AI h/w
- Robotics
- Nanotech
- Energy
- Space
Seeking engaging speakers, ideally from Cali or nearby.
San Jose Convention Center Mon May 5 morning.
Please DM!
Usual suspects: training brittleness (over-reliance on hyperparameter tuning), bad and slow sims, overemphasis on generality, LLMs dominating the discourse, and tabula rasa RL being hard.
What do RL researchers complain about after hours at the bar? In this "Hot takes" episode, we find out!
Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of NeurIPS 2024.
podcasts.apple.com/ca/podcast/r...