Louis Ohl
@louisohl.bsky.social
Postdoc @ Linköping University, STIMA division

oshillou.github.io
It is intended for a broad audience: it starts from the basics and ends with an overview of some current deep clustering models. It also features multiple code snippets to get started, and even a package!
If you want a historical perspective on discriminative clustering, I hope you'll enjoy reading it.
September 11, 2025 at 11:44 AM
This paper explores multiple aspects of discriminative clustering: its general framework, the evolution of the genre from the 90s to today, and how it is deeply intertwined with mutual information.
September 11, 2025 at 11:44 AM
In addition to this historical journey, we provide examples of such milestones and code snippets to reproduce them on the fly.
May 12, 2025 at 8:19 AM
So how do we deal with that? Our tutorial covers the history of the genre from the early 90s to modern deep clustering. We show how mutual information played a crucial role in its development and present the historical milestones we deem relevant.
May 12, 2025 at 8:19 AM
However, learning such a model is tricky, because common statistical tools do not apply when we assume nothing about the data distribution.
May 12, 2025 at 8:19 AM
When doing unsupervised learning, we have two different ways to build our model. One is generative: we model the data distribution explicitly. The other is discriminative: we assume nothing about the data distribution and try to infer clusters straight from it. Implicit hypotheses are built into the model.
May 12, 2025 at 8:19 AM
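To make the discriminative view concrete, here is an illustrative sketch (not the tutorial's own code) of the mutual-information objective that the thread says shaped the genre: for soft cluster assignments, the empirical mutual information between inputs and clusters decomposes as I(X; Y) = H(E[p]) − E[H(p)], which discriminative methods maximise.

```python
import numpy as np

def mutual_information_objective(probs):
    """Empirical mutual information between samples and clusters.

    probs: (n_samples, n_clusters) array of soft assignments p(y | x_i).
    Returns I(X; Y) = H(mean of p) - mean of H(p): high when clusters
    are balanced (high marginal entropy) and assignments are confident
    (low conditional entropy). Illustrative sketch only.
    """
    eps = 1e-12  # guard against log(0)
    marginal = probs.mean(axis=0)
    h_marginal = -(marginal * np.log(marginal + eps)).sum()
    h_conditional = -(probs * np.log(probs + eps)).sum(axis=1).mean()
    return h_marginal - h_conditional
```

For example, perfectly confident, balanced assignments over K clusters score log K, while totally uncertain (uniform) assignments score 0.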
This tutorial is intended for both curious readers who are new to the genre and a more informed audience.

We hope this tutorial will provide a comprehensive overview, and help develop future research directions for clustering.

So what is it about?
May 12, 2025 at 8:19 AM
In summary:

DISCOTEC is an easy-to-implement method that shows good ranking performance and is compatible with essentially all clustering models. It does not require any hyperparameters. (5/5)
May 9, 2025 at 6:40 AM
Since DISCOTEC relies on ensembles, its performance is tied to the number of models used to compute the consensus. This effect is even stronger for the binarised variant. (4/5)
May 9, 2025 at 6:40 AM
An interesting advantage is that binarising the consensus matrix drastically improves the ranking of the clustering algorithms. (3/5)
May 9, 2025 at 6:40 AM
We introduce the DISCOTEC score.

It consists of two simple steps: (i) compute the consensus matrix for a set of clustering algorithms; (ii) compute the average distance between the connectivity matrices and the consensus matrix.

Bonus: must-link and cannot-link constraints are gracefully supported. (2/5)
May 9, 2025 at 6:40 AM
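The two-step DISCOTEC recipe described in the thread could be sketched roughly as follows. This is an assumption-laden illustration, not the paper's implementation: the function names, the mean absolute difference as the distance, and the 0.5 binarisation threshold are all placeholders for exposition.

```python
import numpy as np

def connectivity_matrix(labels):
    # Binary matrix: entry (i, j) is 1 if samples i and j share a cluster.
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def discotec_score(partitions, binarise=False):
    """Sketch of the two-step score (hypothetical implementation).

    partitions: list of label vectors, one per clustering algorithm.
    Returns one score per partition; lower means closer to the consensus.
    """
    # (i) consensus matrix: average connectivity over all partitions
    conns = [connectivity_matrix(p) for p in partitions]
    consensus = np.mean(conns, axis=0)
    if binarise:
        # Binarised variant; the 0.5 threshold is an assumption.
        consensus = (consensus >= 0.5).astype(float)
    # (ii) average distance between each connectivity and the consensus
    return [np.abs(c - consensus).mean() for c in conns]
```

On three toy partitions where two agree and one deviates, the deviant partition gets a larger (worse) score than the two agreeing ones, which is the ranking behaviour the thread describes.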