Lightnews — Scholar-powered news

Alex Shtoff

@alexshtf.bsky.social

51 followers 98 following 130 posts

Principal scientist @ TII
Visit my research blog at https://alexshtf.github.io


Alex Shtoff

@alexshtf.bsky.social

🚨New post🚨

@beenwrekt.bsky.social recently made some noise with his post about the nonexistence of overfitting, and he has a point. In this post we explore it using simple polynomial curve fitting, *without regularization*, using another interesting basis.

alexshtf.github.io/2025/03/27/F...

March 31, 2025 at 1:22 PM

Alex Shtoff

@alexshtf.bsky.social

Or maybe there's a cultural difference among Black people, who may be more afraid of not repaying a loan and may do extreme things, such as using the last of their savings, to repay it.

This paper seems to focus too much on estimation, and ignores the complexities of modeling.

March 12, 2025 at 11:42 AM

Alex Shtoff

@alexshtf.bsky.social

Reminds me of this slide from a phenomenal tutorial by Prof. Aaditya Ramdas.

January 23, 2025 at 6:48 PM

Alex Shtoff

@alexshtf.bsky.social

From a theoretical perspective, this generalizes binning, since a basis of interval indicators is binning. As a function of any one feature, the FM is a function spanned by the given basis, and as a function of any two features, it is spanned by the basis tensor product.
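
A minimal sketch of the binning claim above, in plain Python. The bin edges, the per-bin vectors, and all function names are hypothetical and are not taken from the paper's code. With interval indicators as the basis, blending simply selects the vector of the bin that contains the value, and the pairwise interaction equals a sum over products of basis functions, i.e. the tensor-product basis.

```python
def indicator_basis(x, edges):
    """One-hot weights: 1.0 for the half-open bin [lo, hi) that contains x."""
    return [1.0 if lo <= x < hi else 0.0 for lo, hi in zip(edges, edges[1:])]

def blend(weights, coeff_vectors):
    """Weighted sum of coefficient vectors: sum_k weights[k] * coeff_vectors[k]."""
    dim = len(coeff_vectors[0])
    return [sum(b * w[d] for b, w in zip(weights, coeff_vectors)) for d in range(dim)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

edges = [0.0, 0.5, 1.0]            # two bins (assumed for illustration)
w_i = [[1.0, 0.0], [0.0, 2.0]]     # hypothetical per-bin vectors for feature i
w_j = [[3.0, 1.0], [1.0, 1.0]]     # hypothetical per-bin vectors for feature j

b_i = indicator_basis(0.7, edges)  # x_i falls in the second bin
b_j = indicator_basis(0.2, edges)  # x_j falls in the first bin
v_i = blend(b_i, w_i)              # identical to a binning lookup: w_i[1]
v_j = blend(b_j, w_j)              # identical to a binning lookup: w_j[0]

# Pairwise interaction, computed directly and via the tensor-product expansion.
interaction = dot(v_i, v_j)
tensor_form = sum(
    b_i[k] * b_j[l] * dot(w_i[k], w_j[l])
    for k in range(len(w_i))
    for l in range(len(w_j))
)
```

Replacing `indicator_basis` with a smoother basis keeps the same structure but lets neighbouring bins share information.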

January 1, 2025 at 9:19 AM

Alex Shtoff

@alexshtf.bsky.social

In this work we propose learning a parametric curve 𝒗ᵢ(𝑥ᵢ) in the embedding space corresponding to some numerical feature 𝑥ᵢ, by using a given basis to blend a set of coefficient vectors.
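
As a rough illustration of the blending idea, here is a sketch in plain Python. It assumes a piecewise-linear "hat" basis on a uniform grid over [0, 1], chosen only for simplicity; the function names and coefficient values are hypothetical and do not come from the paper's repository. The curve is 𝒗(𝑥) = Σₖ 𝐵ₖ(𝑥) 𝒘ₖ, where the 𝒘ₖ are the learnable coefficient vectors.

```python
def hat_basis(x, num_knots):
    """Piecewise-linear 'hat' functions centred on a uniform grid over [0, 1]."""
    step = 1.0 / (num_knots - 1)
    return [max(0.0, 1.0 - abs(x - k * step) / step) for k in range(num_knots)]

def embed(x, coeff_vectors):
    """Blend coefficient vectors with basis weights: v(x) = sum_k B_k(x) * w_k."""
    weights = hat_basis(x, len(coeff_vectors))
    dim = len(coeff_vectors[0])
    return [sum(b * w[d] for b, w in zip(weights, coeff_vectors)) for d in range(dim)]

# Three hypothetical 2-dimensional coefficient vectors (these would be learned).
coeffs = [[0.0, 1.0], [2.0, 0.0], [4.0, -1.0]]

print(embed(0.25, coeffs))  # halfway between the first two knots
```

With this basis the embedding moves continuously along line segments between the coefficient vectors as 𝑥 changes, rather than jumping between bins.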

January 1, 2025 at 9:19 AM

Alex Shtoff

@alexshtf.bsky.social

🚀 New Paper 🚀

This post is about our recent TMLR paper, "Function Basis Encoding of Numerical Features in Factorization Machines", by Alex Shtoff, Elie Abboud, Rotem Stram, and Oren Somekh.

Paper: openreview.net/forum?id=M42...
Code: github.com/alexshtf/con...

January 1, 2025 at 9:19 AM

Alex Shtoff

@alexshtf.bsky.social

ICLR area chairs reviewing papers

December 30, 2024 at 5:58 PM

Alex Shtoff

@alexshtf.bsky.social

Help me here #ML/#LLM X please. I'm pretty sure someone already thought of the extremely simple idea of augmenting each attention layer with additional learnable auxiliary memory in the form of embeddings. Could you point me to papers?
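
To make the question concrete, here is one possible reading of the idea as a plain-Python sketch: single-query scaled dot-product attention where extra key/value vectors, which would be learnable parameters, are appended to the ones computed from the input. The function names and the toy numbers are assumptions for illustration and do not refer to any specific paper.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def attend_with_memory(query, keys, values, mem_keys, mem_values):
    """Same attention, but the query can also attend to auxiliary memory slots."""
    return attend(query, keys + mem_keys, values + mem_values)

# Toy example: one input token plus one memory slot (hypothetical values).
out = attend_with_memory(
    query=[1.0, 1.0],
    keys=[[0.0, 0.0]], values=[[1.0, 0.0]],
    mem_keys=[[0.0, 0.0]], mem_values=[[0.0, 1.0]],
)
```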

December 13, 2024 at 1:32 PM

Alex Shtoff

@alexshtf.bsky.social

December 3, 2024 at 5:40 AM

Alex Shtoff

@alexshtf.bsky.social

I just found out that many people in the industry say that logistic regression (sigmoid + BCE loss) is not a regression algorithm, but a classification algorithm. And that the name "Logistic **Regression**" is wrong...

What do you call a model for estimating the conditional mean?
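
A small sketch of the conditional-mean view, using only the standard library. It assumes the simplest possible case, an intercept-only model with hypothetical toy labels, where minimizing BCE drives sigmoid(b) to the sample mean of the labels, i.e. the estimate of E[y].

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_intercept(labels, lr=1.0, steps=500):
    """Gradient descent on mean BCE loss for the model p = sigmoid(b)."""
    mean_y = sum(labels) / len(labels)
    b = 0.0
    for _ in range(steps):
        # d/db of the mean BCE loss is sigmoid(b) - mean(labels)
        b -= lr * (sigmoid(b) - mean_y)
    return b

labels = [1, 0, 0, 1, 1]  # hypothetical binary outcomes
p_hat = sigmoid(fit_intercept(labels))
print(p_hat, sum(labels) / len(labels))
```

With features, the same loss fits sigmoid(w·x + b) as an estimate of E[y | x]; classification only enters when a threshold is applied to that estimate.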

December 1, 2024 at 9:04 AM
