Alex Shtoff
@alexshtf.bsky.social
Principal scientist @ TII
Visit my research blog at https://alexshtf.github.io
🚨New post🚨

@beenwrekt.bsky.social recently stirred up a bit of noise with his post about the nonexistence of overfitting, but he has a point. In this post we explore it through simple polynomial curve fitting, *without regularization*, with another interesting basis.

alexshtf.github.io/2025/03/27/F...
March 31, 2025 at 1:22 PM
Or maybe there's a cultural difference among Black people, who may be more afraid of failing to repay a loan and may take extreme measures, such as using the last of their savings, to repay it.

This paper seems to focus too much on estimation, and ignores the complexities of modeling.
March 12, 2025 at 11:42 AM
Reminds me of this slide from a phenomenal tutorial by Prof. Aaditya Ramdas.
January 23, 2025 at 6:48 PM
From a theoretical perspective, this generalizes binning, since a basis of interval indicators is binning. As a function of any one feature, the FM is a function spanned by the given basis, and as a function of any two features, it is spanned by the basis tensor product.
January 1, 2025 at 9:19 AM
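A minimal sketch of the "binning is a special case" remark above, assuming NumPy and a made-up set of bin edges and coefficient vectors (the names `indicator_basis`, `edges`, and `coef` are illustrative, not taken from the paper's code). With an interval-indicator basis, blending the coefficient vectors reduces to looking up the vector of the bin that contains the value:

```python
import numpy as np

def indicator_basis(x, edges):
    """B_j(x) = 1 if edges[j] <= x < edges[j + 1], else 0 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)[..., None]
    return ((x >= edges[:-1]) & (x < edges[1:])).astype(float)

# Assumed setup: 4 bins on [0, 1), one 2-dimensional vector per bin.
edges = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
coef = np.arange(8.0).reshape(4, 2)
x = np.array([0.1, 0.3, 0.9])

# Basis blending: sum_j B_j(x) * coef[j]
blended = indicator_basis(x, edges) @ coef

# Classic binning: find the bin index, then look up its embedding vector.
looked_up = coef[np.digitize(x, edges[1:-1])]
```

Both `blended` and `looked_up` hold the same rows of `coef`, which is the sense in which an indicator basis reproduces binning.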
In this work we propose learning a parametric curve 𝒗ᵢ(𝑥ᵢ) in the embedding space corresponding to some numerical feature 𝑥ᵢ, by using a given basis to blend a set of coefficient vectors.
January 1, 2025 at 9:19 AM
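A rough sketch of the curve 𝒗ᵢ(𝑥ᵢ) described above, assuming NumPy and, purely for illustration, a basis of piecewise-linear "hat" functions on a uniform grid over [0, 1]. The basis choice and the names `hat_basis`, `embed`, and `coef` are assumptions of this sketch and not necessarily what the paper uses:

```python
import numpy as np

def hat_basis(x, num_basis):
    """Piecewise-linear hat functions centered on a uniform grid over [0, 1].

    Illustrative basis only; returns an array of shape (..., num_basis).
    """
    x = np.asarray(x, dtype=float)[..., None]
    knots = np.linspace(0.0, 1.0, num_basis)
    width = knots[1] - knots[0]
    return np.clip(1.0 - np.abs(x - knots) / width, 0.0, None)

def embed(x, coef):
    """v(x) = sum_j B_j(x) * coef[j], with coef of shape (num_basis, embed_dim)."""
    return hat_basis(x, coef.shape[0]) @ coef

rng = np.random.default_rng(0)
coef = rng.normal(size=(5, 3))   # stands in for learnable coefficient vectors
v = embed(np.array([0.0, 0.1, 0.5]), coef)
```

In a trained model `coef` would be learned together with the other parameters; here it is random, just to show that the embedding moves continuously between coefficient vectors as the feature value changes.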
🚀 New Paper 🚀

This post is about our recent TMLR paper, "Function Basis Encoding of Numerical Features in Factorization Machines", by Alex Shtoff, Elie Abboud, Rotem Stram, and Oren Somekh.

Paper: openreview.net/forum?id=M42...
Code: github.com/alexshtf/con...
January 1, 2025 at 9:19 AM
ICLR area chairs reviewing papers
December 30, 2024 at 5:58 PM
Help me here #ML/#LLM X please. I'm pretty sure someone already thought of the extremely simple idea of augmenting each attention layer with additional learnable auxiliary memory in the form of embeddings. Could you point me to papers?
December 13, 2024 at 1:32 PM
December 3, 2024 at 5:40 AM
I just found out that many people in the industry say that logistic regression (sigmoid + BCE loss) is not a regression algorithm, but a classification algorithm. And that the name "Logistic **Regression**" is wrong...

What do you call a model for estimating the conditional mean?
December 1, 2024 at 9:04 AM
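A small illustration of the conditional-mean view in the post above, assuming NumPy, synthetic data, and plain gradient descent (all of the setup, including the probabilities 0.3 and 0.8, is made up for this sketch). With a single binary feature and an intercept, a sigmoid + BCE fit ends up predicting the per-group average of the labels, i.e. an estimate of E[y | x]:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.integers(0, 2, size=n)                        # binary feature
y = (rng.random(n) < np.where(x == 1, 0.8, 0.3)).astype(float)

# Design matrix with intercept; minimize the mean BCE loss by gradient descent.
X = np.column_stack([np.ones(n), x.astype(float)])
w = np.zeros(2)
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.5 * X.T @ (p - y) / n                      # gradient of the mean BCE

# Fitted probabilities versus empirical conditional means of y given x.
pred = {g: 1.0 / (1.0 + np.exp(-(w[0] + w[1] * g))) for g in (0, 1)}
mean = {g: y[x == g].mean() for g in (0, 1)}
```

After fitting, `pred[g]` and `mean[g]` agree for both groups, so the model is regressing the conditional mean; classification only appears once a threshold is applied on top.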