#deeprl
Do you have specific instructions/workflows for Claude? I just tried this out for the first time last night on some DeepRL material (lecture slides, starter project code) and was pleasantly surprised with how much it helped me learn
August 6, 2025 at 10:40 PM
This is a good research question. It is not entirely clear, but this may be part of why DeepRL struggles to perform well in more interesting and diverse environments (Minecraft, the real world): the similarity between the state distributions during training can be more extensive.
January 10, 2025 at 4:04 PM
Our review on representational spaces in OFC/vmPFC and deepRL is out in Trends in Neurosciences. Was great working with Shany Grossman and @nicoschuck.bsky.social on this!
Happy to share our review on OFC/vmPFC representations in Trends in Neurosciences, written with @nirmoneta.bsky.social and Shany Grossman
www.cell.com/trends/neuro...
Very short thread below to summarize our review
#neuroscience #neuroskyence #compneurosky #PsychSciSky
November 15, 2024 at 4:37 PM
We tackled this challenge with behavioral experiments in mice, Bayesian theory, and #DeepRL. Using a novel change-detection task, we show how mice and networks adapt on the first trial from a context change by inferring both context and meaning—without trial and error.
February 17, 2025 at 1:43 PM
I am accepting new students to the lab to work on scaling DeepRL algorithms, making generalist robotics models, and ML 4 scientific discovery. Deadline @mila-quebec.bsky.social is Dec 1! Make sure to talk about why you are passionate about these topics.
Tips here: neo-x.github.io/blog/2023/09...
| Glen Berseth
neo-x.github.io
November 28, 2024 at 3:02 PM
previous work on SpikeRL, which is a scalable and energy efficient framework for DeepRL-based SNNs for continuous control. In our initial implementation of SpikeRL framework, we depended on the population encoding from the [5/9 of https://arxiv.org/abs/2502.17496v1]
February 26, 2025 at 5:58 AM
If you are curious about deep reinforcement learning find the compact highlights of my recent papers in this new short piece:

#NeurIPS2024 @neuripsconf.bsky.social #NeurIPS24
#reinforcementlearning #AIsafety #AISecurity #ResponsibleAI #TrustworthyAI #RobustAI #DeepRL

bsky.app/profile/ezgi...
The paper on adversarial non-robustness is now online! This paper highlights what you should know about Robust Reinforcement Learning.

Adversarial Robust Deep Reinforcement Learning is Neither Robust Nor Safe
Link: openreview.net/pdf?id=EPa0u...

#NeurIPS2024
neuripsconf.bsky.social
#NeurIPS24
December 7, 2024 at 12:18 PM
Now that #CreativeProblemSolving is in the limelight, our AIGenC model (🖋️@corinacatarau1 @EAlonso20) may interest you.
Compatible with heat.
https://arxiv.org/abs/2205.09738

#creativity #generalisation #deeprl #ai #reinforcementlearning #HierarchicalRepresentations, #graphs...
April 9, 2025 at 6:01 AM
The AI revolution is already here: #ReinforcementLearning from scratch all the way to #AGI.
Q-learning, SARSA, LLMs, and agents that learn to think 🤖✨

📺 Watch the full video 👉 youtu.be/R6MvIB7DHLU

#InteligenciaArtificial #MachineLearning #DeepRL
La revolución de la IA: Aprendizaje por refuerzo desde cero hasta la Inteligencia Artificial General
YouTube video by En la mente de la máquina, Inteligencia Artificial
youtu.be
August 20, 2025 at 4:11 PM
XQC adds batch-norm, weight-norm and a distributional loss to Soft Actor-Critic, cutting critic condition numbers and improving sample efficiency on benchmarks. Submitted September 2025. Read more: https://getnews.me/xqc-improves-sample-efficiency-in-deep-reinforcement-learning/ #deeprl #xqc #rl
October 1, 2025 at 3:53 AM
MAMC adds multiple actors and critics to deep deterministic RL, surpassing prior models on MuJoCo with higher rewards and faster convergence; the code is released on GitHub. Read more: https://getnews.me/multi-actor-multi-critic-deep-rl-outperforms-state-of-the-art-on-mujoco/ #deeprl #mujoco #mamac
October 3, 2025 at 3:29 AM
Rewardless Learning: Human Proxy-Based Reinforcement (#DeepRL) in Human Environments / @lexfridman.bsky.social

bryantmcgill.blogspot.com/2025/07/rewa...

This investigation was inspired by Lex's (@LexFridman) @MIT 6.S091: Introduction to Deep RL.

Soundcloud:
soundcloud.com/bryantmcgill...
Rewardless Learning: Human Proxy-Based Reinforcement (DeepRL) in Human Environments
Bryant McGill · Rewardless Learning: Human Proxy-Based Reinforcement (DeepRL) in Human Environments This investigation was originally...
bryantmcgill.blogspot.com
July 6, 2025 at 2:54 PM
optimizations that traditional artificial neural networks have. Researchers have addressed this by combining SNNs with Deep Reinforcement Learning (DeepRL), yet scalability remains unexplored. In this paper, we extend our [4/9 of https://arxiv.org/abs/2502.17496v1]
February 26, 2025 at 5:58 AM
Researchers propose attention‑based deep RL (ADRL‑RE) and scenario planning for after‑sales slot scheduling; ADRL‑RE outperforms rule‑based baselines, while SBP needs less compute. Read more: https://getnews.me/deep-rl-improves-dynamic-after-sales-time-slot-scheduling/ #deeprl #scheduling
September 25, 2025 at 3:22 AM
Remarkably, DeepRL networks converged on near-optimal strategies and exhibited the same nontrivial Bayesian-like belief-updating dynamics—despite never being trained on these computations directly. This suggests that inference mechanisms can emerge naturally through reinforcement learning.
February 17, 2025 at 1:43 PM
Survey classifies reinforcement learning for bipedal robots into end-to-end and hierarchical frameworks, urging unified, efficient designs. Updated Sep 27 2025, 17 pages. Read more: https://getnews.me/survey-of-deep-reinforcement-learning-approaches-for-bipedal-robots/ #deeprl #bipedalrobots
October 1, 2025 at 7:45 AM
Painting Peptides with Antimicrobial Potency through Deep Reinforcement Learning [new]
DeepRL unifies AMP optimization and generation, enhances known AMPs, evolves active ones from SPs, and designs new ones de novo.
January 15, 2025 at 9:59 AM
Mehdi Heydari Shahna, Seyed Adel Alizadeh Kolagar, Jouni Mattila
Integrating DeepRL with Robust Low-Level Control in Robotic Manipulators for Non-Repetitive Reaching Tasks
https://arxiv.org/abs/2402.02551
May 16, 2024 at 4:04 AM

This paper provides the compact highlights of my recent work on generalization, adversarial perspective, robustness and safety in deep reinforcement learning! #NeurIPS2024
@neuripsconf.bsky.social

#ReinforcementLearning #SafeAI #AISafety #TrustworthyAI #ML #DeepRL

bsky.app/profile/ezgi...
The paper on adversarial non-robustness is now online! This paper highlights what you should know about Robust Reinforcement Learning.

Adversarial Robust Deep Reinforcement Learning is Neither Robust Nor Safe
Link: openreview.net/pdf?id=EPa0u...

#NeurIPS2024
neuripsconf.bsky.social
#NeurIPS24
December 13, 2024 at 6:35 PM
I am teaching a class on #FoundationalModels for #robotics and Scaling #DeepRL algorithms. This class expands on last year's class and my generalist robotics policies tutorial and code. I plan to share the lectures and code assignments. Starting with the first lectures below.
January 19, 2025 at 7:14 PM
A hybrid controller blending deep reinforcement learning with bounded extremum seeking showed faster set‑point convergence and resilience in a time‑varying particle‑accelerator component. https://getnews.me/hybrid-drl-and-bounded-es-improves-control-of-variable-systems/ #hybridcontrol #deeprl
October 6, 2025 at 5:31 AM
Researchers introduced UD3RL, a dual‑decoder reinforcement‑learning model for the close‑enough traveling salesman problem, beating classic heuristics in tour quality and speed. Read more: https://getnews.me/deep-rl-tackles-close-enough-traveling-salesman-problem/ #closeenoughtsp #deeprl
October 6, 2025 at 9:58 AM
A deep‑reinforcement‑learning controller that directly drives the gate of a buck converter showed faster settling and reduced overshoot in simulations versus traditional PWM control. https://getnews.me/deep-rl-direct-gate-control-improves-buck-converter-speed/ #buckconverter #deeprl
September 20, 2025 at 12:32 PM
🔍 The History of Reinforcement Learning (Updated for 2025)

From Thorndike’s cat puzzle box 🐱📦 to DeepMind’s AlphaGo 🤖🏆 to DeepSeek-R1 —how did RL become a key AI breakthrough?

📖 Read the full history:
👉 researchdatapod.com/history-rein...

#AI #ReinforcementLearning #DeepSeek #DeepRL #history
January 31, 2025 at 11:27 PM
Discovered a gem of a paper, which merges two things I am very excited about: DeepRL + (dependently) type-directed program search.

Highly recommended read.

https://arxiv.org/abs/2407.00695
February 21, 2025 at 5:18 PM