Yasser Roudi
@yasserroudi.bsky.social
Scientist, Immigrant
3/3 Surprisingly, adding even a single datapoint from outside (e.g. human-generated data) can sometimes prevent model collapse.
July 10, 2025 at 9:49 AM
2/3 This is what we studied in our recent paper for the exponential family. We analytically recover some of the important features observed in more complex models, most notably model collapse, where the model's representational power decreases severely.
July 10, 2025 at 9:49 AM
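A toy illustration of the two posts above, not the analysis from the paper: the sketch assumes a one-dimensional Gaussian (a member of the exponential family), a maximum-likelihood refit in every generation, and, in the second run, one fresh datapoint per generation drawn from the original N(0, 1). The function names, sample size, and number of generations are all hypothetical choices made for illustration.

```python
import random
import statistics


def fit_gaussian(samples):
    """Maximum-likelihood Gaussian fit: sample mean and biased variance."""
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu=mean)
    return mean, var


def run_generations(n_generations, n_samples, outside_points=0, seed=0):
    """Repeatedly refit a Gaussian to samples drawn from the previous fit.

    Assumption for illustration: `outside_points` fresh draws from the
    original N(0, 1) are mixed into the training set in every generation.
    """
    rng = random.Random(seed)
    mean, var = 0.0, 1.0  # generation 0 is the original N(0, 1)
    for _ in range(n_generations):
        # Synthetic data generated by the current model.
        data = [rng.gauss(mean, var ** 0.5) for _ in range(n_samples)]
        # Optional outside data (e.g. standing in for human-generated data).
        data += [rng.gauss(0.0, 1.0) for _ in range(outside_points)]
        mean, var = fit_gaussian(data)
    return mean, var


# Closed loop: the model is trained only on its own samples.
closed_mean, closed_var = run_generations(500, 20, outside_points=0)
# Same loop, plus a single outside datapoint per generation.
mixed_mean, mixed_var = run_generations(500, 20, outside_points=1)

print(f"closed loop variance: {closed_var:.3e}")
print(f"with one outside point per generation: {mixed_var:.3e}")
```

In the closed loop the fitted variance shrinks towards zero over the generations, which is the toy analogue of collapsing representational power. In this sketch, mixing in one outside datapoint per generation keeps the variance at a finite scale.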
5/5 I also think avoiding binarization and categorization with hard boundaries would be good for biology.
June 12, 2025 at 6:00 PM
4/5 I think these, combined with the recent discoveries about the null hypothesis in TDA, provide better quantitative insight into topological structure in high-dimensional data and the processes that give rise to it.
nature.com/articles/s41...
arxiv.org/abs/2406.05553
Our paper is one example.
A universal null-distribution for topological data analysis - Scientific Reports
June 12, 2025 at 6:00 PM
3/5 Giovanni has now also released easy-to-use code that measures the degree of toroidality, sphericality, and circularity, and that can be extended to other topological structures.

t.co/MWWXLcOjhP
https://github.com/gdisarra/topology-degree
June 12, 2025 at 6:00 PM
2/5 Introducing a "degree of toroidality", we avoid binary categorisation (toroidal topology is either present or absent) and propose continuous measures that quantify the degree to which the topology of the data resembles that of a reference structure, e.g. a torus.
June 12, 2025 at 6:00 PM