Mattie Fellows
@mattieml.bsky.social
Reinforcement Learning Postdoc at FLAIR, University of Oxford @universityofoxford.bsky.social

All opinions are my own.
Pinned
1/2 Offline RL has always bothered me. It promises that, by exploiting offline data, an agent can learn to behave near-optimally once deployed. In real life, it breaks this promise, requiring large amounts of online samples for tuning and offering no guarantees of behaving safely to achieve desired goals.
FLAIR WINTER/SPRING INTERNSHIP!
We're looking for two exceptional students to join us on research projects in Oxford from January! Please share with anyone who would be interested. Details below :)
Internship - Winter/Spring 2026
We are looking for two talented students to join us for an internship working in FLAIR for 6 months. Students will get the chance to work on current FLAIR projects at the University of Oxford, gaining...
foersterlab.com
September 8, 2025 at 10:22 PM
Reposted by Mattie Fellows
PQN, a recently introduced value-based method (bsky.app/profile/matt...), has a similar data-collection scheme to PPO. We see a similar trend as with PPO, although it is much less pronounced. It is possible our findings are more correlated with policy-based methods.
9/
June 5, 2025 at 2:31 PM
1/2 Offline RL has always bothered me. It promises that, by exploiting offline data, an agent can learn to behave near-optimally once deployed. In real life, it breaks this promise, requiring large amounts of online samples for tuning and offering no guarantees of behaving safely to achieve desired goals.
May 30, 2025 at 8:39 AM
If you're struggling with the bs Overleaf outage, you can try going to www.overleaf.com/project/[PROJECTID]/download/zip to download the zip. It seems to sometimes work after a few minutes.
May 14, 2025 at 9:03 AM
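For anyone who wants to script this workaround, here is a minimal Python sketch (my own hypothetical helper, not an official Overleaf API; it assumes you substitute your own project ID and paste the Cookie header from a logged-in browser session):

```python
import requests

# Hypothetical helper for the workaround above: fetch the project zip directly,
# bypassing the Overleaf editor UI. Replace PROJECT_ID with your project's ID
# (visible in the editor URL) and SESSION_COOKIE with the Cookie header copied
# from a browser session where you are logged in.
PROJECT_ID = "your-project-id-here"
SESSION_COOKIE = "paste-your-browser-cookie-header-here"

url = f"https://www.overleaf.com/project/{PROJECT_ID}/download/zip"
response = requests.get(url, headers={"Cookie": SESSION_COOKIE}, timeout=30)
response.raise_for_status()

with open("project.zip", "wb") as f:
    f.write(response.content)
print(f"Saved {len(response.content)} bytes to project.zip")
```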
Excited to be presenting our ICLR spotlight paper, Simplifying Deep Temporal Difference Learning, today! Join us in Hall 3 + Hall 2B, Poster #123, from 3pm :)
arxiv.org
April 25, 2025 at 10:56 PM
Reposted by Mattie Fellows
PQN puts Q-learning back on the map and now comes with a blog post + Colab demo! Also, congrats to the team for the spotlight at #ICLR2025
PQN blog 3/3 👉 Take a look at Matteo's 5-minute blog covering PQN's key features, plus a Colab demo with JAX & PyTorch implementations: mttga.github.io/posts/pqn/

🔎 For a deeper dive into the theory:
blog.foersterlab.com/fixing-td-pa...
blog.foersterlab.com/fixing-td-pa...

See you in Singapore! 🇸🇬
Simplifying Deep Temporal Difference Learning
A modern implementation of Deep Q-Network without target networks and replay buffers.
mttga.github.io
March 20, 2025 at 11:51 AM
PQN blog 3/3 👉 Take a look at Matteo's 5-minute blog covering PQN's key features, plus a Colab demo with JAX & PyTorch implementations: mttga.github.io/posts/pqn/

🔎 For a deeper dive into the theory:
blog.foersterlab.com/fixing-td-pa...
blog.foersterlab.com/fixing-td-pa...

See you in Singapore! 🇸🇬
Simplifying Deep Temporal Difference Learning
A modern implementation of Deep Q-Network without target networks and replay buffers.
mttga.github.io
March 20, 2025 at 10:29 AM
PQN Blog 2/3: In this blog we show how to overcome the `deadly triad' and stabilise TD using regularisation techniques such as LayerNorm and/or l_2 regularisation, deriving a provably stable deep Q-learning update WITHOUT ANY REPLAY BUFFER OR TARGET NETWORKS. @jfoerst.bsky.social @flair-ox.bsky.social
Fixing TD Pt II: Overcoming the Deadly Triad
blog.foersterlab.com
March 20, 2025 at 9:01 AM
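To make the idea in the post above concrete, here is a minimal JAX sketch in that spirit: a small Q-network with LayerNorm that bootstraps from itself (no target network, no replay buffer) and adds an l_2 penalty on the weights. It is an illustrative toy under my own assumptions (network sizes, hyperparameters and helper names are mine), not the actual PQN implementation; see the blog post for the real derivation and code.

```python
import jax
import jax.numpy as jnp

def init_params(key, obs_dim, hidden, n_actions):
    """Toy 2-layer Q-network parameters (illustrative sizes, not PQN's)."""
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (obs_dim, hidden)) / jnp.sqrt(obs_dim),
        "b1": jnp.zeros(hidden),
        "w2": jax.random.normal(k2, (hidden, n_actions)) / jnp.sqrt(hidden),
        "b2": jnp.zeros(n_actions),
    }

def layer_norm(x, eps=1e-5):
    # Normalise activations: the stabilising ingredient discussed in the post.
    return (x - x.mean(-1, keepdims=True)) / jnp.sqrt(x.var(-1, keepdims=True) + eps)

def q_values(params, obs):
    h = layer_norm(obs @ params["w1"] + params["b1"])
    h = jax.nn.relu(h)
    return h @ params["w2"] + params["b2"]

def td_loss(params, obs, action, reward, next_obs, done, gamma=0.99, l2=1e-4):
    q = q_values(params, obs)[jnp.arange(obs.shape[0]), action]
    # Bootstrap from the SAME network; stop_gradient gives the semi-gradient
    # TD target, with no separate target network.
    next_q = jax.lax.stop_gradient(q_values(params, next_obs).max(-1))
    target = reward + gamma * (1.0 - done) * next_q
    # l_2 penalty on all weights, the second regulariser mentioned in the post.
    l2_penalty = sum(jnp.sum(p ** 2) for p in jax.tree_util.tree_leaves(params))
    return jnp.mean((q - target) ** 2) + l2 * l2_penalty

@jax.jit
def td_step(params, batch, lr=3e-4):
    """One gradient step on a freshly collected batch (no replay buffer)."""
    grads = jax.grad(td_loss)(params, *batch)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
```

In the actual method, batches come from parallel on-policy data collection (the PPO-style collection mentioned elsewhere in this feed) rather than a replay buffer; the sketch only shows the regularised, target-network-free TD step.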
Reposted by Mattie Fellows
Are academic conferences in the US a thing of the past?
March 19, 2025 at 6:48 PM
PQN Blog 1/3: TD methods are the bread and butter of RL, yet they can have convergence issues when used in practice. This has always annoyed me. Find out below why TD is so unstable and how we can understand this instability better using the TD Jacobian. @flair-ox.bsky.social @jfoerst.bsky.social
Fixing TD Pt I: Why is Temporal Difference Learning so Unstable?
blog.foersterlab.com
March 19, 2025 at 8:36 AM
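As a toy numerical illustration of the kind of analysis the post above points to (my own sketch, using the classic two-state counterexample rather than the blog's exact setup): for linear TD(0) the expected update is theta <- theta + alpha (b - A theta) with A = Phi^T D (I - gamma P) Phi, so the Jacobian of the update is I - alpha A, and TD diverges whenever A has an eigenvalue with negative real part. Changing only the state weighting D from on-policy to off-policy flips the sign:

```python
import jax.numpy as jnp

# Classic two-state counterexample: one linear feature per state, phi(s1)=1,
# phi(s2)=2, and both states transition deterministically to s2. With a single
# feature, A is a scalar, so its sign alone decides whether the expected TD(0)
# update theta <- theta + alpha * (b - A * theta) is stable.
gamma = 0.99
Phi = jnp.array([[1.0], [2.0]])          # linear features
P = jnp.array([[0.0, 1.0], [0.0, 1.0]])  # both states move to s2

def td_matrix(d):
    """A = Phi^T D (I - gamma P) Phi for state weighting d."""
    D = jnp.diag(d)
    return Phi.T @ D @ (jnp.eye(2) - gamma * P) @ Phi

A_on = td_matrix(jnp.array([0.0, 1.0]))   # on-policy weighting (stationary distribution)
A_off = td_matrix(jnp.array([0.5, 0.5]))  # uniform (off-policy) weighting

print("on-policy  A =", float(A_on[0, 0]))   # > 0: stable for small enough step size
print("off-policy A =", float(A_off[0, 0]))  # < 0: expected TD update diverges for any step size
```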
Super excited to share that our paper, Simplifying Deep Temporal Difference Learning, has been accepted as a spotlight at ICLR! My fab collaborator Matteo Gallici and I have written a three-part blog on the work, so stay tuned for that! :)
@flair-ox.bsky.social
arxiv.org/pdf/2407.04811
arxiv.org
March 18, 2025 at 11:48 AM
Reposted by Mattie Fellows
If you're an RL researcher or RL adjacent, pipe up to make sure I've added you here!
go.bsky.app/3WPHcHg
November 9, 2024 at 4:42 PM