Lightnews — Scholar-powered news

Thomas

@bthomas.bsky.social

890 followers 220 following 39 posts

Data, AutoML, AI and NLP @graphext. Also CSVs, lots of CSVs. In former life: artificial, sensorimotor life: https://www.amazon.com/dp/B071S22KSC

Reposted by Thomas

Civio

@civio.es

It is called Quantus Skin and is part of a 1.6 million euro investment in automation by Osakidetza.

Specialists criticize it for its "poor" and "dangerous" results. It was also trained only on white patients.

‼️ It misses 1 in 3 melanomas.

Mole or cancer? The algorithm that gets one in three melanomas wrong and overlooks patients with dark skin

The Basque Country is working to roll out Quantus Skin in its health centers after an investment of 1.6 million euros. Specialists criticize the artificial intelligence system of a…

civio.es

June 26, 2025 at 5:30 AM

Thomas

@bthomas.bsky.social

Makes sense. Thanks! Though perhaps having two accounts posting the same content isn't ideal?

March 25, 2025 at 11:01 AM

Thomas

@bthomas.bsky.social

Somebody forgot to subdivide the original low poly design sketch

January 25, 2025 at 10:04 AM

Thomas

@bthomas.bsky.social

Not sure if the result would be what you're looking for, but many clustering algorithms, such as hdbscan, accept precalculated (sparse) distance/similarity matrices as input. Maybe worth trying? hdbscan.readthedocs.io/en/0.8.6/api...

API Reference — hdbscan 0.8.1 documentation

hdbscan.readthedocs.io

January 21, 2025 at 11:15 PM
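
To illustrate the precomputed-matrix idea from the post above, here is a minimal sketch of a clustering routine that never sees raw features, only pairwise distances. It is not HDBSCAN: it is a much simpler stand-in (connected components under a distance threshold), and the function name `cluster_from_sparse_distances` and the `max_distance` parameter are made up for this example. In the hdbscan library the equivalent entry point is constructing the clusterer with `metric='precomputed'` and fitting it on the distance matrix.

```python
def cluster_from_sparse_distances(n_items, sparse_distances, max_distance):
    """Group items into connected components using only precomputed distances.

    sparse_distances maps (i, j) index pairs to a distance. Pairs that are
    missing are treated as "too far apart", which is what keeps it sparse.
    """
    parent = list(range(n_items))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair whose stored distance is within the threshold.
    for (i, j), dist in sparse_distances.items():
        if dist <= max_distance:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[max(ri, rj)] = min(ri, rj)

    # Relabel roots as consecutive cluster ids in order of first appearance.
    ids = {}
    labels = []
    for i in range(n_items):
        root = find(i)
        if root not in ids:
            ids[root] = len(ids)
        labels.append(ids[root])
    return labels


# Hypothetical toy data: five items, four known pairwise distances.
distances = {(0, 1): 0.1, (1, 2): 0.2, (3, 4): 0.15, (2, 3): 0.9}
labels = cluster_from_sparse_distances(5, distances, max_distance=0.3)
```

The point of the sketch is only that the algorithm's input is the distance structure itself, so any domain-specific similarity can be plugged in without producing feature vectors first.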

Reposted by Thomas

David Roberts

@volts.wtf

OK, I thought this was pretty clear, but to spell it out: we've gone from gatekeepers that, for all their blind spots, operate in public & are subject to some accountability... to algorithmic gatekeepers that are opaque, unaccountable, & designed by a tiny, blinkered class of dudebro dipshits.

January 14, 2025 at 7:45 PM

Thomas

@bthomas.bsky.social

Here it is clustered in Graphext: dev-embeds.graphext.com/883a5045e329.... But honestly, nothing extraordinary shows up either

January 10, 2025 at 7:26 PM

Thomas

@bthomas.bsky.social

Yes, I always train on the whole dataset after evaluating on the test set (or cross-validating). In my case there is also a lot of shift. Of course you then don't really know if your estimated performance will be representative of future data. I guess it depends on whether the shifts continue or not?

January 10, 2025 at 6:03 PM
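
The evaluate-then-refit workflow described in the post above can be sketched as follows. This is a toy illustration under stated assumptions, not the author's code: the one-parameter model, the helper names `fit_slope` and `mean_abs_error`, and the data are all invented for the example.

```python
def fit_slope(xs, ys):
    """Least-squares slope for y ~= w * x (a stand-in for any model's fit step)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mean_abs_error(w, xs, ys):
    """Average absolute prediction error of the slope model."""
    return sum(abs(w * x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data, ordered in time.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

# Step 1: estimate performance on rows that were held out from fitting.
train_x, test_x = xs[:4], xs[4:]
train_y, test_y = ys[:4], ys[4:]
eval_model = fit_slope(train_x, train_y)
estimated_error = mean_abs_error(eval_model, test_x, test_y)

# Step 2: refit on all rows for deployment. The final model is never scored
# itself, so estimated_error is only a proxy for how it will behave.
final_model = fit_slope(xs, ys)
```

Under distribution shift the proxy can be off in either direction, which is the caveat the post raises.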

Thomas

@bthomas.bsky.social

Btw, you may be interested in this churn paper citation network: bsky.app/profile/btho... #AppliedDS. I'm planning to create more and better ones in different areas. Interestingly, the time-slice paper doesn't seem to be in it. Perhaps it's not indexed by OpenAlex.

Thomas @bthomas.bsky.social · Jan 9

I've made this citation network of ~700 papers mentioning "churn" and "customers". I've also included the top 100 papers citing or cited by those papers: dev-embeds.graphext.com/5a5ca9660dab.... You can search and filter in any of the 54 variables related to each paper #AppliedDS

Screenshot of Open Alex citation network of papers mentioning "churn" and "customer".

January 9, 2025 at 10:17 AM

Thomas

@bthomas.bsky.social

I did try adding some temporal indicators, in case seasonality mattered, but it didn't. My intuition is that the negative impact of not being able to use the most recent data for training trumped all other potentially interesting ways to use more information.

January 9, 2025 at 10:14 AM

Thomas

@bthomas.bsky.social

With some additional details, such as avoiding overlap between train and test set. The main problem I found was that if the churn window is e.g. 3 months, you cannot use the last 3 months for training (because they need to be the test set). If the data isn't very stable, this can have a great impact

January 9, 2025 at 10:11 AM
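
One way to read the constraint in the post above is sketched below. It assumes a snapshot at month `m` uses features up to `m` and a label drawn from months `m+1` through `m+churn_window`, and that training label windows must end before the test label window begins. The function name `temporal_split` and this exact split rule are assumptions for illustration, not something the post spells out.

```python
def temporal_split(n_months, churn_window):
    """Return (train_snapshots, test_snapshot) as month indices.

    The test snapshot is the latest month whose label window is fully
    observed. Training snapshots are restricted so that their label windows
    end no later than the test snapshot month.
    """
    test_snapshot = n_months - 1 - churn_window
    last_train = test_snapshot - churn_window
    if last_train < 0:
        raise ValueError("not enough history for a non-overlapping split")
    return list(range(last_train + 1)), test_snapshot


# Two years of monthly data with a 3-month churn window.
train_snapshots, test_snapshot = temporal_split(n_months=24, churn_window=3)
```

With 24 months and a 3-month window, the test labels come from months 21 to 23, and the most recent training snapshot is month 17, so the freshest data never reaches the training set.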

Thomas

@bthomas.bsky.social

All hahaha. This data was monthly over a couple of years. So for each month I created features from previous data for those customers still active that month, and targets (churn or not) from the following months. So each person will have one sample for each month that they were active

January 9, 2025 at 10:08 AM
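
The one-sample-per-customer-per-active-month construction described in the post above can be sketched as follows. The churn definition used here (no activity in the following `churn_window` months), the single toy feature, and the name `build_samples` are assumptions made for the example. The post does not specify them.

```python
def build_samples(active_months, n_months, churn_window):
    """Build one row per customer per active month.

    active_months maps a customer id to the set of month indices in which
    that customer was active. Features use only months up to the snapshot
    and the label uses only the months after it.
    """
    # Later snapshots have no complete label window yet.
    last_labelable = n_months - 1 - churn_window
    samples = []
    for customer, months in sorted(active_months.items()):
        for m in sorted(months):
            if m > last_labelable:
                continue
            history = [h for h in months if h <= m]
            future = range(m + 1, m + churn_window + 1)
            samples.append({
                "customer": customer,
                "month": m,
                "n_active_months_so_far": len(history),  # toy feature
                "churned": not any(f in months for f in future),
            })
    return samples


# Hypothetical activity log: "a" stays active, "b" stops after month 2.
activity = {"a": {0, 1, 2, 3, 4, 5}, "b": {0, 1, 2}}
samples = build_samples(activity, n_months=6, churn_window=2)
```

Because the same customer appears in many rows, rows from one customer are correlated, which is one reason to split train and test by time rather than at random.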
