Lightnews — Scholar-powered news

Alex Shtoff

@alexshtf.bsky.social

51 followers 98 following 130 posts

Principal scientist @ TII
Visit my research blog at https://alexshtf.github.io

Pinned

Alex Shtoff @alexshtf.bsky.social · Nov 24

After a short time here, it's time for a short intro thread.
I like things at the intersection of optimization, numerical analysis, ML, and software engineering. I have a blog, where I write about stuff I like:
alexshtf.github.io

You're welcome to add me to starter packs as you see fit.

Alex Shtoff

Blog on optimization, machine learning, and software development.

alexshtf.github.io

Reposted by Alex Shtoff

Clément Canonne

@ccanonne.github.io

Nicely written blog post by David Eppstein on the Boyer–Moore (deterministic) streaming algorithm to find a majority element in a stream, and its extensions, first to the turnstile model, and then to frequency estimation (Misra–Gries).
11011110.github.io/blog/2025/05... via @theory.report

Turnstile majority

A famous algorithm of Boyer and Moore for the majority problem finds a majority element in a stream of elements while storing only two values, a single tenta...

11011110.github.io

May 6, 2025 at 1:30 PM
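
The two algorithms named in the post above can be sketched in a few lines. This is a minimal illustration of the textbook Boyer–Moore majority vote and the Misra–Gries summary, not code taken from the linked blog post; the function names `majority_candidate` and `misra_gries` are made up for this example.

```python
def majority_candidate(stream):
    """Boyer-Moore majority vote: one pass, storing one candidate and one counter.

    If some element occupies more than half of the stream, it is returned.
    If no majority exists, the result is arbitrary and needs a second pass to verify.
    """
    candidate, count = None, 0
    for item in stream:
        if count == 0:
            candidate, count = item, 1
        elif item == candidate:
            count += 1
        else:
            count -= 1
    return candidate


def misra_gries(stream, k):
    """Misra-Gries frequency summary that keeps at most k - 1 counters.

    Every stored count underestimates the true frequency by at most n / k,
    where n is the stream length.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all counters and drop the ones that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


stream = ["a", "b", "a", "c", "a", "a", "b"]
print(majority_candidate(stream))
print(misra_gries(stream, 3))
```

With `k = 2` the summary keeps a single counter, which is the sense in which Misra–Gries generalizes the majority vote.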

Reposted by Alex Shtoff

Samuel Vaiter

@samuelvaiter.com

The Matrix Mortality Problem asks if a given set of square matrices can multiply to the zero matrix after a finite sequence of multiplications of elements. It is undecidable for matrices of size 3x3 or larger. buff.ly/lLmvvlo

May 1, 2025 at 5:01 AM
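
To make the problem statement in the post above concrete, here is a small brute-force sketch that searches for a zero product among all words up to a given length. The helper names and the length bound `max_length` are assumptions made for this example. Because the general problem is undecidable, a result of `None` only says that no zero product exists within the bound.

```python
from itertools import product


def matmul(a, b):
    """Multiply two square matrices given as tuples of tuples (exact for integers)."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def find_mortal_product(matrices, max_length):
    """Return a tuple of indices whose product is the zero matrix, or None.

    Only words of length 1..max_length are tried, so None is inconclusive.
    """
    n = len(matrices[0])
    zero = tuple(tuple(0 for _ in range(n)) for _ in range(n))
    for length in range(1, max_length + 1):
        for word in product(range(len(matrices)), repeat=length):
            acc = matrices[word[0]]
            for idx in word[1:]:
                acc = matmul(acc, matrices[idx])
            if acc == zero:
                return word
    return None


A = ((1, 0), (0, 0))
B = ((0, 0), (0, 1))
print(find_mortal_product([A, B], 3))
```

The search grows exponentially with the word length, so it is only useful for tiny instances.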

Alex Shtoff

@alexshtf.bsky.social

Attending #ICLR2025?
Visit our poster!
A stochastic approach to the subset selection problem via mirror descent.
Today, 3pm, poster #336.

April 26, 2025 at 1:59 AM

Alex Shtoff

@alexshtf.bsky.social

A question to the #math people here. For differential equations there are spectral methods that find approximate solutions in the span of orthogonal bases. Is there a variant for difference equations, and bases of sequences? A good tutorial maybe?

April 12, 2025 at 6:42 AM

Reposted by Alex Shtoff

Samuel Vaiter

@samuelvaiter.com

The Tarski-Seidenberg theorem in logical form states that the set of first-order formulas over the real numbers is closed under quantifier elimination. This means any formula with quantifiers can be converted into an equivalent quantifier-free formula. perso.univ-rennes1.fr/michel.coste...

April 1, 2025 at 5:00 AM
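
A standard small instance of the quantifier elimination described in the post above, added here as an illustration and not taken from the linked notes: a real quadratic has a real root exactly when its discriminant is nonnegative, so the existential quantifier over x can be replaced by a polynomial inequality in the coefficients b and c.

```latex
\[
\exists x \in \mathbb{R} :\; x^2 + b x + c = 0
\quad\Longleftrightarrow\quad
b^2 - 4c \ge 0
\]
```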

Alex Shtoff

@alexshtf.bsky.social

🚨New post🚨

@beenwrekt.bsky.social recently started a bit of noise with his post about nonexistence of overfitting, but he has a point. In this post we explore it using simple polynomial curve fitting, *without regularization*, using another interesting basis.

alexshtf.github.io/2025/03/27/F...

March 31, 2025 at 1:22 PM

Reposted by Alex Shtoff

TMLR Published Papers

@tmlr-pub.bsky.social

On the Detection of Reviewer-Author Collusion Rings From Paper Bidding

Steven Jecmen, Nihar B Shah, Fei Fang, Leman Akoglu

Action editor: Laurent Charlin

https://openreview.net/forum?id=o58uy91V2V

#collusion #colluders #fraud

January 14, 2025 at 5:07 AM

Reposted by Alex Shtoff

TMLR Published Papers

@tmlr-pub.bsky.social

Function Basis Encoding of Numerical Features in Factorization Machines

Alex Shtoff, Elie Abboud, Rotem Stram, Oren Somekh

Action editor: Andriy Mnih

https://openreview.net/forum?id=M4222IBHsh

#factorization #feature #features

January 11, 2025 at 3:07 PM

Reposted by Alex Shtoff

Carola Doerr

@caroladoerr.bsky.social

Videos of the CNRS optimization conference now online (in French):
- Claire Mathieu : www.youtube.com/watch?v=_ZXZ...
- Gabriel Peyré : www.youtube.com/watch?v=vQOF...
- Jérôme Bolte : www.youtube.com/watch?v=tjkg...
- Axel Parmentier : www.youtube.com/watch?v=DohO...

Enjoy 🙂

www.youtube.com

January 7, 2025 at 6:29 AM

Alex Shtoff

@alexshtf.bsky.social

Fellow AI researchers. Please watch this video. Freya raises a valid concern about the sheer abuse of GenAI on the web, and the damage it does.

Freya Holmér @freya.bsky.social · Jan 2

live in about 10 hours from now - subscribe/mark your calendars/tell your friends/share etc. c:

Generative AI is a Parasitic Cancer
www.youtube.com/watch?v=-opB...

Generative AI is a Parasitic Cancer

YouTube video by Freya Holmér

www.youtube.com

January 4, 2025 at 6:31 PM

Alex Shtoff

@alexshtf.bsky.social

[1/4] When working on ads at Yahoo, we had several 'ad hoc' solutions for various problems, and one of them was exponential moving average (EMA) of observations y₁, y₂, …:
xᵢ₊₁ = (1 - α)xᵢ + αyᵢ
One of the most overlooked facts is that it is actually online gradient descent!

January 4, 2025 at 1:47 PM
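
A minimal sketch of the claim in the post above. The post does not state the loss in the part shown here, so this assumes the usual choice: the per-step squared loss fᵢ(x) = ½(x - yᵢ)² with a constant step size α. Its gradient is x - yᵢ, so one gradient step gives x - α(x - yᵢ) = (1 - α)x + αyᵢ, which is the EMA update. The function names are made up for this example.

```python
def ema_update(x, y, alpha):
    """Exponential moving average update from the post."""
    return (1 - alpha) * x + alpha * y


def ogd_update(x, y, alpha):
    """One online gradient descent step on the assumed loss 0.5 * (x - y)**2."""
    grad = x - y  # derivative of 0.5 * (x - y)**2 with respect to x
    return x - alpha * grad


def run(update, observations, alpha, x0=0.0):
    """Apply an update rule along a stream and record every iterate."""
    x = x0
    trajectory = []
    for y in observations:
        x = update(x, y, alpha)
        trajectory.append(x)
    return trajectory


observations = [1.0, 3.0, 2.0, 5.0]
ema_path = run(ema_update, observations, alpha=0.3)
ogd_path = run(ogd_update, observations, alpha=0.3)
# The two trajectories agree up to floating-point rounding.
print(max(abs(a - b) for a, b in zip(ema_path, ogd_path)))
```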

Alex Shtoff

@alexshtf.bsky.social

🚀 New Paper 🚀

This post is about our recent TMLR paper, "Function Basis Encoding of Numerical Features in Factorization Machines", by Alex Shtoff, Elie Abboud, Rotem Stram, and Oren Somekh.

Paper: openreview.net/forum?id=M42...
Code: github.com/alexshtf/con...

January 1, 2025 at 9:19 AM

Alex Shtoff

@alexshtf.bsky.social

ICLR area chairs reviewing papers

December 30, 2024 at 5:58 PM

Reposted by Alex Shtoff

TMLR Published Papers

@tmlr-pub.bsky.social

Your Classifier Can Be Secretly a Likelihood-Based OOD Detector

Jirayu Burapacheep, Yixuan Li

Action editor: Changjian Shui

https://openreview.net/forum?id=FmA1JPWBM8

#classifiers #classifier #classification

December 26, 2024 at 3:06 PM

Alex Shtoff

@alexshtf.bsky.social

One of the overlooked properties of the proximal operator is the formula for composing with a semi-orthogonal matrix:
h(x) = p(A x + b), with A Aᵀ = α I

It is [1]:
proxₜₕ(x) = x + α⁻¹ Aᵀ(prox_{αtp}(A x + b) - A x - b)

[1] Combettes, Wajs. Signal Recovery by Proximal Forward-Backward Splitting.

December 15, 2024 at 1:04 PM
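
A numerical sanity check of the composition identity from the post above, written with the scaling by α and the offset b made explicit: prox of t·h at x equals x + (1/α) Aᵀ(prox of αt·p at (A x + b), minus (A x + b)). The sketch assumes p is the ℓ₁ norm, whose prox is soft-thresholding, and uses a small matrix with A Aᵀ = 2 I. All names and test values are made up for this example.

```python
import random


def soft_threshold(v, tau):
    """Prox of tau * ||.||_1, applied elementwise."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def prox_composed(x, A, b, alpha, t):
    """Prox of t * h at x, where h(u) = ||A u + b||_1 and A A^T = alpha * I."""
    z = [ai + bi for ai, bi in zip(matvec(A, x), b)]
    p = soft_threshold(z, alpha * t)
    correction = matvec(transpose(A), [pi - zi for pi, zi in zip(p, z)])
    return [xi + ci / alpha for xi, ci in zip(x, correction)]


def objective(u, x, A, b, t):
    """The function minimized by the prox: t * h(u) + 0.5 * ||u - x||^2."""
    z = [ai + bi for ai, bi in zip(matvec(A, u), b)]
    return t * sum(abs(zi) for zi in z) + 0.5 * sum((ui - xi) ** 2 for ui, xi in zip(u, x))


A = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]  # A A^T = 2 I, so alpha = 2
b = [0.5, -1.0]
x = [3.0, 1.0, -0.2, 0.4]
t = 0.7
u = prox_composed(x, A, b, alpha=2.0, t=t)

# The prox is the unique minimizer, so no random perturbation should do better.
rng = random.Random(0)
best = objective(u, x, A, b, t)
for _ in range(1000):
    v = [ui + rng.uniform(-0.5, 0.5) for ui in u]
    assert objective(v, x, A, b, t) >= best - 1e-12
print(u)
```

The check is only evidence for one choice of p, A, and b, not a proof of the identity.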

Alex Shtoff

@alexshtf.bsky.social

Help me here #ML/#LLM X please. I'm pretty sure someone already thought of the extremely simple idea of augmenting each attention layer with additional learnable auxiliary memory in the form of embeddings. Could you point me to papers?

December 13, 2024 at 1:32 PM

Alex Shtoff

@alexshtf.bsky.social

M-Estimation is all you need
Change my mind :)

www.jstor.org/stable/3087324

The Calculus of M-Estimation on JSTOR

Leonard A. Stefanski, Dennis D. Boos, The Calculus of M-Estimation, The American Statistician, Vol. 56, No. 1 (Feb., 2002), pp. 29-38

www.jstor.org

December 12, 2024 at 10:34 AM

Reposted by Alex Shtoff

Gergely Neu

@neu-rips.bsky.social

exciting new work by my truly brilliant postdoc Eugenio Clerico on the optimality of coin-betting strategies for mean estimation!

for fans of: mean estimation, online learning with log loss, optimal portfolios, hypothesis testing with E-values, etc.

dig in:
arxiv.org/abs/2412.02640

On the optimality of coin-betting for mean estimation

Confidence sequences are sequences of confidence sets that adapt to incoming data while maintaining validity. Recent advances have introduced an algorithmic formulation for constructing some of the ti...

arxiv.org

December 4, 2024 at 8:13 AM

Alex Shtoff

@alexshtf.bsky.social

I just found out that many people in the industry say that logistic regression (sigmoid + BCE loss) is not a regression algorithm, but a classification algorithm. And that the name "Logistic **Regression**" is wrong...

What do you call a model for estimating the conditional mean?

December 1, 2024 at 9:04 AM
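
A small sketch of the point behind the post above: minimizing the BCE loss through a sigmoid fits the conditional mean of the target. The example assumes a toy model with one weight per value of a categorical feature (logistic regression on one-hot features) trained by plain gradient descent. The data, names, and hyperparameters are made up for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, num_groups, lr=2.0, steps=2000):
    """Minimize the mean BCE loss of sigmoid(w[group]) by gradient descent.

    `data` is a list of (group, y) pairs with group in range(num_groups)
    and y in {0, 1}.
    """
    w = [0.0] * num_groups
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * num_groups
        for group, y in data:
            # Derivative of the BCE loss with respect to the logit is sigmoid(logit) - y.
            grad[group] += (sigmoid(w[group]) - y) / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


data = [(0, 0), (0, 0), (0, 1), (0, 0), (1, 1), (1, 1), (1, 0), (1, 1), (1, 1)]
w = fit_logistic(data, num_groups=2)
predictions = [sigmoid(wi) for wi in w]

# Empirical conditional means of y given the group.
means = [
    sum(y for g, y in data if g == k) / sum(1 for g, _ in data if g == k)
    for k in range(2)
]
print(predictions, means)
```

In this toy setup the fitted probabilities match the per-group averages of y up to optimization error, which is the sense in which the model estimates a conditional mean.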

Alex Shtoff

@alexshtf.bsky.social

@bsky.app What's the excuse for blocking alpindale? Is ML research considered "trolling the community"?

November 28, 2024 at 5:41 PM

Alex Shtoff

@alexshtf.bsky.social

Beautiful!

Michael Dinitz @mdinitz.bsky.social · Nov 26

Just had a new paper hit the arxiv (will appear at NeurIPS '24), joint with Sungjin Im, Thomas Lavastida, Ben Moseley, Aidin Niaparast, and @vsergei.bsky.social: arxiv.org/abs/2411.16030 . I think it's super cool, so a quick thread!

Binary Search with Distributional Predictions

Algorithms with (machine-learned) predictions is a powerful framework for combining traditional worst-case algorithms with modern machine learning. However, the vast majority of work in this space ass...

arxiv.org

November 26, 2024 at 8:30 PM

Reposted by Alex Shtoff

Clément Canonne

@ccanonne.github.io

The PCP theorem, a jewel of theoretical computer science, establishes that any NP statement can be assessed by a randomized verifier who only checks a vanishing fraction of the proof (indeed, a constant # of characters!)

This has had incredible impact, most notably on how ML reviews are conducted

November 26, 2024 at 5:33 AM

Reposted by Alex Shtoff

Motonobu Kanagawa

@motonobu-kanagawa.bsky.social

We are organising the First International Conference on Probabilistic Numerics (ProbNum 2025) at EURECOM in southern France in Sep 2025. Topics: AI, ML, Stat, Sim, and Numerics. Reposts very much appreciated!

probnum25.github.io

November 17, 2024 at 7:06 AM

Alex Shtoff

@alexshtf.bsky.social

Everybody likes complaining about ICLR... but after a rebuttal phase, the authors of one of the papers I reviewed addressed my concerns well, revised the paper according to the reviews of the other reviewers, and I increased their score. At least here - the process worked :)

November 26, 2024 at 8:16 AM
