Thomas
@bthomas.bsky.social
Data, AutoML, AI and NLP @graphext. Also CSVs, lots of CSVs. In former life: artificial, sensorimotor life: https://www.amazon.com/dp/B071S22KSC
Reposted by Thomas
It's called Quantus Skin and is part of a €1.6 million investment in automation at Osakidetza.

Specialists have criticized it for its "poor" and "dangerous" results. It was also trained only on white patients.

‼️ It fails to detect 1 in 3 melanomas.
Mole or cancer? The algorithm that gets one in three melanomas wrong and overlooks patients with dark skin
The Basque Country is working on rolling out Quantus Skin in its health centers after a €1.6 million investment. Specialists criticize the artificial intelligence system of a…
civio.es
June 26, 2025 at 5:30 AM
Makes sense. Thanks! Though perhaps having two accounts posting the same content isn't ideal?
March 25, 2025 at 11:01 AM
Somebody forgot to subdivide the original low poly design sketch
January 25, 2025 at 10:04 AM
Not sure if the result would be what you're looking for, but many clustering algorithms accept precomputed (sparse) distance/similarity matrices as input, such as hdbscan (rough sketch below). Maybe worth trying? hdbscan.readthedocs.io/en/0.8.6/api...
API Reference — hdbscan 0.8.1 documentation
hdbscan.readthedocs.io
January 21, 2025 at 11:15 PM
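A minimal sketch of what that looks like with the hdbscan package, assuming you already have (or can compute) a pairwise distance matrix; the data and metric here are only placeholders:

import numpy as np
import hdbscan
from sklearn.metrics import pairwise_distances

# Placeholder data standing in for whatever objects you can compare pairwise
X = np.random.RandomState(0).rand(100, 5)

# Precompute the pairwise distances (this could come from any custom
# similarity measure, as long as you convert it into a distance)
dist = pairwise_distances(X, metric="euclidean")

# Tell HDBSCAN the input is already a distance matrix, not raw features
clusterer = hdbscan.HDBSCAN(metric="precomputed", min_cluster_size=5)
labels = clusterer.fit_predict(dist)

print(labels[:10])  # -1 marks points treated as noise

If memory is a concern, hdbscan should also accept a scipy sparse distance matrix in the same precomputed mode, with missing entries treated as "very far apart", if I recall correctly.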
Reposted by Thomas
OK, I thought this was pretty clear, but to spell it out: we've gone from gatekeepers that, for all their blind spots, operate in public & are subject to some accountability... to algorithmic gatekeepers that are opaque, unaccountable, & designed by a tiny, blinkered class of dudebro dipshits.
January 14, 2025 at 7:45 PM
Here you have it clustered in Graphext: dev-embeds.graphext.com/883a5045e329.... But to be honest, nothing that remarkable comes out of it either.
January 10, 2025 at 7:26 PM
Yes, I always train on the whole dataset after evaluating on the test set (or cross-validating). In my case there is also a lot of shift. Of course you then don't really know if your estimated performance will be representative of future data. Guess it depends on whether the shifts continue or not?
January 10, 2025 at 6:03 PM
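A rough sketch of that workflow (evaluate first, then refit on everything); the model, metric and synthetic data are placeholders only:

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Placeholder data; in practice X, y would be your features and churn labels
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

model = GradientBoostingClassifier(random_state=0)

# 1) Estimate generalisation performance (cross-validation or a held-out test set)
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"CV ROC AUC: {scores.mean():.3f} +/- {scores.std():.3f}")

# 2) Refit on the whole dataset before deploying, accepting that under shift
#    the estimate above may not fully hold for future data
model.fit(X, y)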
Btw, you may be interested in this churn paper citation network: bsky.app/profile/btho... #AppliedDS. I'm planning to create more of those, and better ones, in different areas. Interestingly, the time-slice paper doesn't seem to be in it. Perhaps it's not indexed by OpenAlex.
I've made this citation network of ~700 papers mentioning "churn" and "customers". I've also included the top 100 papers citing or cited by those papers: dev-embeds.graphext.com/5a5ca9660dab.... You can search and filter on any of the 54 variables related to each paper #AppliedDS
January 9, 2025 at 10:17 AM
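For anyone wanting to build something similar, a rough sketch of fetching works from the OpenAlex API and turning their references into a citation graph; the endpoint, parameters and field names are to the best of my knowledge, and paging, rate limits and the extra "top 100 citing/cited papers" step are left out:

import requests
import networkx as nx

# Ask OpenAlex for works matching the search terms (first page only here)
resp = requests.get(
    "https://api.openalex.org/works",
    params={"search": "churn customers", "per-page": 200},
)
works = resp.json()["results"]

# Build a directed citation graph: an edge A -> B means paper A cites paper B
G = nx.DiGraph()
for w in works:
    G.add_node(w["id"], title=w.get("display_name"), year=w.get("publication_year"))
    for ref in w.get("referenced_works", []):
        G.add_edge(w["id"], ref)

print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")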
I did try adding some temporal indicators, in case seasonality mattered, but it didn't. My intuition is that the negative impact of not being able to use the most recent data for training trumped all other potentially interesting ways to use more information.
January 9, 2025 at 10:14 AM
With some additional details like avoiding overlap between train and test sets, etc. The main problem I found was that if the churn window is e.g. 3 months, you cannot use the last 3 months for training (because they need to be the test set). If the data isn't very stable, this can have a big impact.
January 9, 2025 at 10:11 AM
All of them, hahaha. The data was monthly over a couple of years. So for each month I created features from previous data for the customers still active that month, and targets (churn or not) from the following months. So each person ends up with one sample for each month they were active (rough sketch below).
January 9, 2025 at 10:08 AM
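Putting the last few posts together, a rough sketch of how such a monthly churn dataset and the time-based split could look in pandas; the column names, features, churn definition and toy data are made up, and finer points like leaving a gap to fully avoid label/feature overlap are glossed over:

import pandas as pd

CHURN_WINDOW = 3  # months the churn label looks ahead

# Toy stand-in for the raw data: one row per (customer, month) while active
rng = pd.date_range("2023-01-01", periods=10, freq="MS")
events = pd.DataFrame({
    "customer_id": [1] * 10 + [2] * 4,
    "month": list(rng) + list(rng[:4]),           # customer 2 disappears after April
    "usage": [10, 12, 9, 11, 8, 7, 9, 10, 11, 12, 5, 3, 2, 1],
})

months = sorted(events["month"].drop_duplicates())
last_month = months[-1]

samples = []
for m in months:
    # Only months whose full churn window is observed can be labelled
    if m + pd.DateOffset(months=CHURN_WINDOW) > last_month:
        break
    history = events[events["month"] <= m]        # features come from the past only
    window = events[(events["month"] > m)
                    & (events["month"] <= m + pd.DateOffset(months=CHURN_WINDOW))]
    for cust in events.loc[events["month"] == m, "customer_id"].unique():
        hist = history[history["customer_id"] == cust]
        samples.append({
            "customer_id": cust,
            "month": m,
            "mean_usage": hist["usage"].mean(),   # example feature built from history
            "months_active": len(hist),
            # target: no activity at all in the next CHURN_WINDOW months
            "churned": cust not in window["customer_id"].values,
        })

panel = pd.DataFrame(samples)                     # one sample per active customer-month

# Hold out the last CHURN_WINDOW labelled months as the test set; using them for
# training would overlap with the period the test labels look into.
test_start = panel["month"].max() - pd.DateOffset(months=CHURN_WINDOW - 1)
train = panel[panel["month"] < test_start]
test = panel[panel["month"] >= test_start]
print(len(train), "train rows,", len(test), "test rows")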