#deeprl
A new Stigmergic Multi‑Agent Deep RL framework (S‑MADRL) uses virtual pheromone fields for coordination. Simulations with up to eight robots showed better efficiency than MADDPG and MAPPO. Read more: https://getnews.me/stigmergy-inspired-deep-rl-boosts-multi-robot-coordination/ #stigmergy #deeprl
October 7, 2025 at 5:18 PM
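For readers new to stigmergy, here is a minimal sketch of what a virtual pheromone field could look like. It is not the S‑MADRL implementation: the `PheromoneField` class, the grid layout, and the evaporation rate are all hypothetical, chosen only to show agents coordinating by writing to and reading from a shared field.

```python
class PheromoneField:
    """Hypothetical 2D grid of virtual pheromone shared by all agents."""

    def __init__(self, width, height, evaporation=0.1):
        self.width = width
        self.height = height
        self.evaporation = evaporation  # fraction lost per step (assumed value)
        self.grid = [[0.0] * width for _ in range(height)]

    def deposit(self, x, y, amount=1.0):
        """An agent marks the cell it currently occupies."""
        self.grid[y][x] += amount

    def step(self):
        """Evaporate every cell so that old trails fade over time."""
        keep = 1.0 - self.evaporation
        for row in self.grid:
            for i in range(len(row)):
                row[i] *= keep

    def sense(self, x, y):
        """Return {(nx, ny): level} for the in-bounds 4-neighbours of (x, y)."""
        readings = {}
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < self.width and 0 <= ny < self.height:
                readings[(nx, ny)] = self.grid[ny][nx]
        return readings


field = PheromoneField(5, 5, evaporation=0.5)
field.deposit(2, 2, amount=2.0)   # one robot leaves a mark
field.step()                      # the mark decays
print(field.sense(1, 2))          # a neighbouring robot reads nearby levels
```

In a learning setup, the values returned by `sense` would typically be appended to each agent's observation, so the policy can react to other agents' traces without direct communication.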
Researchers introduced UD3RL, a dual‑decoder reinforcement‑learning model for the close‑enough traveling salesman problem, beating classic heuristics in tour quality and speed. Read more: https://getnews.me/deep-rl-tackles-close-enough-traveling-salesman-problem/ #closeenoughtsp #deeprl
October 6, 2025 at 9:58 AM
A hybrid controller blending deep reinforcement learning with bounded extremum seeking showed faster set‑point convergence and resilience in a time‑varying particle‑accelerator component. https://getnews.me/hybrid-drl-and-bounded-es-improves-control-of-variable-systems/ #hybridcontrol #deeprl
October 6, 2025 at 5:31 AM
MAMC adds multiple actors and critics to deep deterministic RL, surpassing prior models on MuJoCo with higher rewards and faster convergence; the code is released on GitHub. Read more: https://getnews.me/multi-actor-multi-critic-deep-rl-outperforms-state-of-the-art-on-mujoco/ #deeprl #mujoco #mamac
October 3, 2025 at 3:29 AM
Survey classifies reinforcement learning for bipedal robots into end-to-end and hierarchical frameworks, urging unified, efficient designs. Updated Sep 27 2025, 17 pages. Read more: https://getnews.me/survey-of-deep-reinforcement-learning-approaches-for-bipedal-robots/ #deeprl #bipedalrobots
October 1, 2025 at 7:45 AM
XQC adds batch-norm, weight-norm and a distributional loss to Soft Actor-Critic, cutting critic condition numbers and improving sample efficiency on benchmarks. Submitted September 2025. Read more: https://getnews.me/xqc-improves-sample-efficiency-in-deep-reinforcement-learning/ #deeprl #xqc #rl
October 1, 2025 at 3:53 AM
Researchers propose attention‑based deep RL (ADRL‑RE) and scenario planning for after‑sales slot scheduling; ADRL‑RE outperforms rule‑based baselines, while SBP needs less compute. Read more: https://getnews.me/deep-rl-improves-dynamic-after-sales-time-slot-scheduling/ #deeprl #scheduling
September 25, 2025 at 3:22 AM
A new study expands the projected Bellman error framework with multistep λ‑return eligibility traces, showing gradient‑based methods outperform PPO on MuJoCo and MinAtar tasks. Read more: https://getnews.me/gradient-eligibility-traces-boost-deep-reinforcement-learning/ #deeprl #eligibilitytraces
September 23, 2025 at 12:30 AM
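As background for the λ‑return mentioned in the post above, this is a small sketch of the standard recursive λ‑return target. It is not the study's projected-Bellman-error method; the function name `lambda_returns` and its argument layout are assumptions made for illustration.

```python
def lambda_returns(rewards, next_values, gamma=0.99, lam=0.9):
    """Compute lambda-returns for one trajectory segment.

    rewards[t]     : reward received after step t
    next_values[t] : value estimate V(s_{t+1}) of the state reached by step t
    Recursion: G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}),
    with the final step bootstrapping from the last value estimate.
    """
    if len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must have the same length")
    returns = [0.0] * len(rewards)
    g = next_values[-1] if next_values else 0.0  # bootstrap beyond the segment
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g)
        returns[t] = g
    return returns


# lam=0 reduces to one-step TD targets; lam=1 gives the bootstrapped Monte Carlo return.
print(lambda_returns([1.0, 1.0], [0.5, 0.0], gamma=0.9, lam=0.0))
print(lambda_returns([1.0, 1.0], [0.5, 0.0], gamma=0.9, lam=1.0))
```

Intermediate values of `lam` interpolate between those two extremes, which is the bias-variance trade-off that eligibility traces are usually introduced to control.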
A deep‑reinforcement‑learning controller that directly drives the gate of a buck converter showed faster settling and reduced overshoot in simulations versus traditional PWM control. https://getnews.me/deep-rl-direct-gate-control-improves-buck-converter-speed/ #buckconverter #deeprl
September 20, 2025 at 12:32 PM
Deep reinforcement learning control raised lift by 79% and cut drag by 65% on an SD7003 wing at Re 60 000, boosting aerodynamic efficiency by about 408%, in a high‑Reynolds‑number test. Read more: https://getnews.me/deep-reinforcement-learning-boosts-lift-and-cuts-drag-on-3d-wing/ #deeprl #aerodynamics
September 17, 2025 at 4:32 AM
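A quick sanity check on how the lift, drag, and efficiency figures in the post above relate, assuming "aerodynamic efficiency" means the lift-to-drag ratio (the post does not say so explicitly). The helper name `lift_to_drag_gain` is hypothetical.

```python
def lift_to_drag_gain(lift_increase, drag_reduction):
    """Percent change in lift-to-drag ratio for fractional lift/drag changes."""
    if not 0.0 <= drag_reduction < 1.0:
        raise ValueError("drag_reduction must be in [0, 1)")
    ratio = (1.0 + lift_increase) / (1.0 - drag_reduction)
    return (ratio - 1.0) * 100.0


# Rounded figures from the post: +79% lift, -65% drag.
gain = lift_to_drag_gain(0.79, 0.65)
print(f"{gain:.0f}%")  # prints "411%"
```

With the rounded inputs the ratio improves by roughly 411%, which is close to the reported figure of about 408%; the small gap is plausibly due to rounding in the quoted lift and drag percentages.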
The AI Revolution is already here: from #ReinforcementLearning from scratch to #AGI.
Q-learning, SARSA, LLMs, and agents that learn to think 🤖✨

📺 Watch the full video 👉 youtu.be/R6MvIB7DHLU

#InteligenciaArtificial #MachineLearning #DeepRL
The AI revolution: Reinforcement learning from scratch to Artificial General Intelligence
YouTube video by En la mente de la máquina, Inteligencia Artificial
youtu.be
August 20, 2025 at 4:11 PM
Do you have specific instructions/workflows for Claude? I just tried this out for the first time last night on some DeepRL material (lecture slides, starter project code) and was pleasantly surprised with how much it helped me learn
August 6, 2025 at 10:40 PM
Rewardless Learning: Human Proxy-Based Reinforcement (#DeepRL) in Human Environments / @lexfridman.bsky.social

bryantmcgill.blogspot.com/2025/07/rewa...

This investigation was inspired by Lex's (@LexFridman) @MIT 6.S091: Introduction to Deep RL.

Soundcloud:
soundcloud.com/bryantmcgill...
July 6, 2025 at 2:54 PM
Being unable to scale #DeepRL to solve diverse, complex tasks with large distribution changes has been holding back the #RL community. In this work, we demonstrate that with the right architecture and optimization adjustments, agents can maintain plasticity for large networks.
🚨 Excited to share our new work: "Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning"! 📈

We propose gradient interventions that enable stable, scalable learning, unlocking significant performance gains across agents and environments!

Details below 👇
June 24, 2025 at 1:01 AM
while still optimizing with mixed precision training. Our new SpikeRL implementation is 4.26X faster and 2.25X more energy efficient than state-of-the-art DeepRL-SNN methods. Our proposed SpikeRL framework demonstrates a [8/9 of https://arxiv.org/abs/2502.17496v1]
February 26, 2025 at 5:58 AM
mixed-precision for parameter updates. In our new SpikeRL framework, we have implemented our own DeepRL-SNN component with population encoding, and distributed training with PyTorch Distributed package with NCCL backend [7/9 of https://arxiv.org/abs/2502.17496v1]
February 26, 2025 at 5:58 AM
previous work on SpikeRL, which is a scalable and energy efficient framework for DeepRL-based SNNs for continuous control. In our initial implementation of SpikeRL framework, we depended on the population encoding from the [5/9 of https://arxiv.org/abs/2502.17496v1]
February 26, 2025 at 5:58 AM
optimizations that traditional artificial neural networks have. Researchers have addressed this by combining SNNs with Deep Reinforcement Learning (DeepRL), yet scalability remains unexplored. In this paper, we extend our [4/9 of https://arxiv.org/abs/2502.17496v1]
February 26, 2025 at 5:58 AM
new toy: DeepRL
February 25, 2025 at 6:33 PM
Discovered a gem of a paper, which merges two things I am very excited about: DeepRL + (dependently) type-directed program search.

Highly recommended read.

https://arxiv.org/abs/2407.00695
February 21, 2025 at 5:18 PM
Remarkably, DeepRL networks converged on near-optimal strategies and exhibited the same nontrivial Bayesian-like belief-updating dynamics—despite never being trained on these computations directly. This suggests that inference mechanisms can emerge naturally through reinforcement learning.
February 17, 2025 at 1:43 PM
We tackled this challenge with behavioral experiments in mice, Bayesian theory, and #DeepRL. Using a novel change-detection task, we show how mice and networks adapt on the first trial from a context change by inferring both context and meaning—without trial and error.
February 17, 2025 at 1:43 PM
If you are interested in large language models see my paper below on how we can uncover the biases learned by these models.

Link: neurips2023-enlsp.github.io/papers/paper...

#ReinforcementLearning #FoundationModels #DeepRL #DeepReinforcementLearning #ResponsibleAI #AIBias #LLMs #LanguageModels
February 11, 2025 at 5:56 PM
🔍 The History of Reinforcement Learning (Updated for 2025)

From Thorndike’s cat puzzle box 🐱📦 to DeepMind’s AlphaGo 🤖🏆 to DeepSeek-R1 —how did RL become a key AI breakthrough?

📖 Read the full history:
👉 researchdatapod.com/history-rein...

#AI #ReinforcementLearning #DeepSeek #DeepRL #history
ALT: a person is playing a game of go on a table
January 31, 2025 at 11:27 PM
youtube.com/playlist?lis...

Sergey Levine's DeepRL course is often recommended too. There may be more recent videos somewhere; this playlist is a mix of videos that are about 4 years old and some that are 1-2 years old:
CS 285: Deep RL, 2023 - YouTube
Playlist for videos for the UC Berkeley CS 285: Deep Reinforcement Learning course, fall 2023.
youtube.com
January 31, 2025 at 1:12 PM