Björn Holzhauer
@bjoernstats.bsky.social
Biostatistician that keeps geese, chess CM, Kaggle master, one of the authors of Applied Modelling in Drug Development https://opensource.nibr.com/bamdd/
If you perceive the problem to be that a company might not want to publish, then how would the proposal help? Posting results to clinicaltrials.gov is mandatory for recent trials (with some nuance incl. on timing). Unsurprisingly adherence by industry is extremely high - unlike for academia.
December 11, 2024 at 7:56 AM
If you see journals not publishing negative trials as the problem, you get properly conducted RCTs (eventually) published even if results are "negative". They just tend to get into lower tier journals (unless they are a large 10,000 patient outcome study, which will still get into a top journal).
December 11, 2024 at 7:56 AM
And I'm unsure how it makes me more sure "that the results are what they appear to be". In journals like NEJM (all journals should do this), you get the full protocol etc. (& FDA oversees version control on these) incl. change history, which makes it clear what the prespecified plan was.
December 11, 2024 at 7:56 AM
Given the limited time to patent expiration and a typical discount rate for moving out the expected sales (assuming they even stay the same with a delayed market entry), the price tag on doing this could easily be 3-digit millions or $1B+.
December 11, 2024 at 7:56 AM
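A rough sketch of the arithmetic behind that estimate. Every number here is a made-up assumption for illustration (peak sales, ramp-up time, discount rate, time to patent expiry), not a figure from the thread. It assumes the patent expiry date is fixed, so a later launch shifts the same sales curve to the right and cuts it off at the same calendar date.

```python
def npv_of_sales(delay_months, months_to_expiry=120, peak_annual_sales=2000.0,
                 ramp_months=48, annual_discount_rate=0.10):
    """Net present value (in $ millions) of sales until a fixed patent expiry.

    All defaults are hypothetical. Sales ramp up linearly to peak over
    `ramp_months` after launch and then stay flat until expiry.
    """
    npv = 0.0
    for month in range(delay_months, months_to_expiry):
        months_on_market = month - delay_months
        ramp = min(1.0, (months_on_market + 1) / ramp_months)
        monthly_sales = peak_annual_sales / 12 * ramp
        # Discount back to today (month 0).
        npv += monthly_sales / (1 + annual_discount_rate) ** (month / 12)
    return npv


def cost_of_delay(delay_months, **kwargs):
    """NPV lost by launching `delay_months` later, with expiry unchanged."""
    return npv_of_sales(0, **kwargs) - npv_of_sales(delay_months, **kwargs)


for delay in (3, 6, 12):
    print(f"{delay:>2} month delay: about ${cost_of_delay(delay):,.0f}M lost")
```

The loss comes from two places: the last months before expiry (typically at peak sales) drop off the end, and everything else is earned later and so discounted more.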
The biggest sticking point is surely the timeline impact. E.g. a 3-month delay from a typical single peer review cycle would already be huge in terms of the timelines of a typical drug development plan. If there are two review rounds for both your Phase 3 and your Phase 2b, you've added 1 year.
December 11, 2024 at 7:56 AM
I find these tabular competitions a useful learning tool (tabular data is what I'm dealing with most of the time, just usually much smaller). E.g. I learnt a lot about the internals of CatBoost. I also should write up my thoughts on tuning GBDTs (e.g. don't tune the learning rate, lower is better).
December 1, 2024 at 9:05 PM
My selected solution was a simple average of multiple CatBoost models, LightGBM models using target encoding for categories, logistic regressions & seed-averaged fastai NNs with embeddings of dim 1-4 for all features (numeric ones had low cardinality) and 2 small hidden layers (10 & 5) trained with focal loss.
December 1, 2024 at 9:05 PM
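For readers unfamiliar with focal loss, here is a minimal plain-Python sketch of the binary version. It is not the fastai code used in the solution, and the focusing parameter `gamma=2.0` is an assumed default, since the post does not say which value was used.

```python
import math


def binary_focal_loss(y_true, p_pred, gamma=2.0, eps=1e-12):
    """Mean binary focal loss: -(1 - p_t)^gamma * log(p_t).

    p_t is the predicted probability of the true class. With gamma=0
    this is the ordinary log loss; larger gamma down-weights examples
    that are already predicted well.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p_t = p if y == 1 else 1.0 - p
        p_t = min(max(p_t, eps), 1.0)  # guard against log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(y_true)


labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.7]
print("log loss  :", binary_focal_loss(labels, probs, gamma=0.0))
print("focal loss:", binary_focal_loss(labels, probs, gamma=2.0))
```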
I think it's generally a good idea to not take the performance of early stopping per CV fold, but to rather take the best number of iterations (or epochs) averaging across folds. It's particularly so with such a noisy low information metric, so that was an important part of my solution.
December 1, 2024 at 9:05 PM
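The idea can be sketched in a few lines of plain Python. The validation curves below are invented numbers, and the sketch assumes a loss-type metric where lower is better (flip `min` to `max` for something like AUC).

```python
def best_iteration_per_fold(curves):
    """Index of the best (lowest) validation score in each fold separately."""
    return [min(range(len(c)), key=c.__getitem__) for c in curves]


def best_shared_iteration(curves):
    """One iteration count for all folds: the minimum of the fold-averaged curve."""
    n_iter = min(len(c) for c in curves)
    mean_curve = [sum(c[i] for c in curves) / len(curves) for i in range(n_iter)]
    best = min(range(n_iter), key=mean_curve.__getitem__)
    return best, mean_curve[best]


# Hypothetical validation loss per iteration (0-indexed) for three CV folds.
curves = [
    [0.70, 0.60, 0.55, 0.56, 0.58],
    [0.72, 0.61, 0.57, 0.54, 0.59],
    [0.69, 0.58, 0.59, 0.57, 0.60],
]

per_fold = best_iteration_per_fold(curves)        # [2, 3, 3]
per_fold_score = sum(c[i] for c, i in zip(curves, per_fold)) / len(curves)
shared_iter, shared_score = best_shared_iteration(curves)  # iteration 3

print("per-fold early stopping:", per_fold, round(per_fold_score, 4))
print("shared iteration       :", shared_iter, round(shared_score, 4))
```

Picking the best iteration separately in each fold can never look worse than the shared choice, which is exactly why it gives an optimistic CV estimate when the metric is noisy.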
The other things much bigger than "placebo effects" are regression to the mean and simple time trends in disease state. The former occurs even in stable chronic conditions once you apply inclusion criteria (this really seems to surprise many non-statisticians).
November 29, 2024 at 2:24 PM
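A small simulation makes the regression-to-the-mean point concrete. It is a sketch under assumed settings: each patient has a perfectly stable true disease level, measurements add independent noise, and the inclusion threshold and standard deviations are arbitrary illustrative values.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate_regression_to_mean(n_patients=20000, true_sd=1.0, noise_sd=1.0,
                                threshold=1.0, seed=1):
    """Enroll only patients whose noisy baseline exceeds `threshold`.

    The true level never changes and nobody is treated, so any drop at
    follow-up among enrolled patients is pure regression to the mean.
    """
    rng = random.Random(seed)
    baseline_enrolled, followup_enrolled = [], []
    for _ in range(n_patients):
        true_level = rng.gauss(0.0, true_sd)           # stable over time
        baseline = true_level + rng.gauss(0.0, noise_sd)
        followup = true_level + rng.gauss(0.0, noise_sd)
        if baseline > threshold:                       # inclusion criterion
            baseline_enrolled.append(baseline)
            followup_enrolled.append(followup)
    return mean(baseline_enrolled), mean(followup_enrolled), len(baseline_enrolled)


base, follow, n = simulate_regression_to_mean()
print(f"enrolled {n} patients: baseline mean {base:.2f}, follow-up mean {follow:.2f}")
```

The enrolled group "improves" at follow-up without any intervention, simply because the inclusion criterion selected patients partly on the basis of measurement noise.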
I mean, sure, this clinical trial was conducted long enough ago that the company is not legally required to report the results. Still, it feels disappointing that it's so hard to find the outcomes for a large(ish) trial on a widely used drug.
November 26, 2024 at 9:31 PM
Maybe the results are just not available, yet? Thanks to Drugs@FDA (www.accessdata.fda.gov/scripts/cder...), I finally found the results in the clinical pharmacology review for the original drug approval (10+ years ago)...
Drugs@FDA: FDA-Approved Drugs
November 26, 2024 at 9:31 PM
Meanwhile on clinicaltrials.gov the trial still doesn't have results 19 years after being completed. As far as I can tell no results from it have been published in a medical journal, either.
November 26, 2024 at 9:31 PM
... minimal clutter/white background, and large enough font sizes already help so much (see also the article by some of my colleagues: doi.org/10.1002/pst....).
How can we make better graphs? An initiative to increase the graphical expertise and productivity of quantitative scientists
Graphics are at the core of exploring and understanding data, communicating results and conclusions, and supporting decision-making. Increasing our graphical expertise can significantly strengthen ou...
November 26, 2024 at 8:54 PM
Annotating on the plot rather than making people go looking back and forth to a legend is such a nice way of making your graphics easier to read. In combination with some of the other graphics principles on graphicsprinciples.github.io like well chosen colors (vs. different dashed lines), ...
graphics principles - Welcome
This is the home page for effective visual communication and good graphical principles for quantitative scientists.
November 26, 2024 at 8:54 PM
At least, I hope you'd agree that the plot with colors, annotations on the plot, large font etc. is better than these slightly exaggerated disasters you might get by taking more of a default approach.
November 26, 2024 at 8:54 PM
Surely, by the most common definition logistic regression is artificial intelligence? I can write it as a single layer neural network in PyTorch, if that helps?
November 26, 2024 at 11:25 AM
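To illustrate the equivalence, here is a minimal sketch of logistic regression written as a single "neuron" (one linear layer plus a sigmoid) trained by gradient descent on the log loss. It uses plain Python rather than PyTorch to stay self-contained; in PyTorch the same model would be a single `torch.nn.Linear` layer trained with `torch.nn.BCEWithLogitsLoss`. The simulated data, learning rate and epoch count are assumptions for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(x, w, b):
    """Forward pass: linear layer followed by a sigmoid activation."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def log_loss(xs, ys, w, b, eps=1e-12):
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(predict(x, w, b), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def fit_single_layer(xs, ys, lr=0.5, epochs=300):
    """Full-batch gradient descent on the mean log loss."""
    w, b, n = [0.0] * len(xs[0]), 0.0, len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(xs, ys):
            err = predict(x, w, b) - y   # gradient of log loss w.r.t. the logit
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Simulated data from a known logistic model (hypothetical coefficients).
rng = random.Random(0)
true_w, true_b = [2.0, -1.0], 0.5
xs = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(1000)]
ys = [1 if rng.random() < predict(x, true_w, true_b) else 0 for x in xs]

w, b = fit_single_layer(xs, ys)
print("estimated weights:", [round(v, 2) for v in w], "bias:", round(b, 2))
```

The fitted weights are the usual maximum likelihood estimates of the logistic regression coefficients (up to optimizer convergence), since minimizing the log loss is the same as maximizing the Bernoulli likelihood.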