Alexis Stamatakis
@stamatak.bsky.social
ERA Chair at Institute of Computer Science FORTH
Research Group Leader Heidelberg Institute for Theoretical Studies
Full Professor at Karlsruhe Institute of Technology
Crete lab: https://www.biocomp.gr/
Heidelberg Lab: http://www.exelixis-lab.org/
Join us in Crete in May 2026 for the Mathematical and Computational Evolutionary Biology Meeting mceb2026.sciencesconf.org on the island's northern coastline near Heraklion.
Mathematical and Computational Evolutionary Biology, Crete, 2026 - Sciencesconf.org
Welcome to MCEB 2026
mceb2026.sciencesconf.org
November 3, 2025 at 9:29 AM
October 17 is your last chance to register for the 2nd conference on Machine Learning for Evolutionary Genomics Data (Dec 8-12), in the French Alps at legend2025.sciencesconf.org
The conference talks are online at legend2025.sciencesconf.org/data/book_le...
legend2025 : Machine Learning for Evolutionary Genomics Data - Sciencesconf.org
Evolutionary genomics and population genetics investigate patterns of genetic diversity between species or between populations within a species and play a fundamental role in many aspects, from theoretical facets of evolution to practical ones, such as conservation genetics and biomedical sciences.
legend2025.sciencesconf.org
October 13, 2025 at 11:24 AM
New preprint: ML tree inference tools such as IQ-Tree and RAxML-NG typically do not overfit the data during tree searches: www.biorxiv.org/content/10.1...
A Systematic Investigation of Overfitting in Maximum Likelihood Phylogenetic Inference
Maximum Likelihood (ML) tree inference reconstructs phylogenies from Multiple Sequence Alignments (MSAs). Since MSAs are inherently noisy, ML tools may experience overfitting, whereby the inferred top...
www.biorxiv.org
October 8, 2025 at 4:03 PM
Want to spend time in the French Alps and talk about Machine Learning for Evolutionary Genomics Data? Join us for the 2nd Legend conference - abstract submission deadline is on September 22 legend2025.sciencesconf.org
September 1, 2025 at 2:52 PM
Do you fancy spending some days in the French Alps in December and talking about Machine Learning for Evolutionary Genomics Data? Join us for the 2nd Legend conference: legend2025.sciencesconf.org
June 18, 2025 at 3:10 PM
Check out our new preprint on reproducible parallel phylogenetic inference under varying core counts - it also includes a generic method for reproducible parallel associative reduction operations www.biorxiv.org/content/10.1...
Bit-Reproducible Phylogenetic Tree Inference under Varying Core-Counts via Reproducible Parallel Reduction Operators
Motivation: Phylogenetic trees describe the evolutionary history among biological species based on their genomic data. Maximum Likelihood (ML) based phylogenetic inference tools search for the tree and evolutionary model that best explain the observed genomic data. Given the independence of likelihood score calculations between different genomic sites, parallel computation is commonly deployed. This is followed by a parallel summation over the per-site scores to obtain the overall likelihood score of the tree. However, basic arithmetic operations on IEEE 754 floating-point numbers, such as addition and multiplication, inherently introduce rounding errors. Consequently, the order by which floating-point operations are executed affects the exact resulting likelihood value since these operations are not associative. Moreover, parallel reduction algorithms in numerical codes re-associate operations as a function of the core count and cluster network topology, inducing different round-off errors. These low-level deviations can cause heuristic searches to diverge and induce high-level result discrepancies (e.g., yield topologically distinct phylogenies). This effect has also been observed in multiple scientific fields, beyond phylogenetics.
Results: We observe that varying the degree of parallelism results in diverging phylogenetic tree searches (high level results) for over 31 % out of 10 130 empirical datasets. More importantly, 8 % of these diverging datasets yield trees that are statistically significantly worse than the best known ML tree for the dataset (AU-test, p < 0.05). To alleviate this, we develop a variant of the widely used phylogenetic inference tool RAxML-NG, which does yield bit-reproducible results under varying core-counts, with a slowdown of only 0 to 12.7 % (median 0.8 %) on up to 768 cores.
We further introduce the ReproRed reduction algorithm, which yields bit-identical results under varying core-counts, by maintaining a fixed operation order that is independent of the communication pattern. ReproRed is thus applicable to all associative reduction operations – in contrast to competitors, which are confined to summation. Our ReproRed reduction algorithm only exchanges the theoretical minimum number of messages, overlaps communication with computation, and utilizes fast base-cases for local reductions. ReproRed is able to all-reduce (via a subsequent broadcast) 4.1 · 10^6 operands across 48 to 768 cores in 19.7 to 48.61 µs, thereby exhibiting a slowdown of 13 to 93 % over a non-reproducible all-reduce algorithm. ReproRed outperforms the state-of-the-art reproducible all-reduction algorithm ReproBLAS (offers summation only) beyond 10 000 elements per core.
In summary, we re-assess non-reproducibility in parallel phylogenetic inference, present the first bit-reproducible parallel phylogenetic inference tool, as well as introduce a general algorithm and open-source code for conducting reproducible associative parallel reduction operations.
Competing Interest Statement: The authors have declared no competing interest.
European Research Council, https://ror.org/0472cxd90, 882500; European Union, https://ror.org/019w4f821, 101087081
www.biorxiv.org
June 5, 2025 at 3:14 PM
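The preprint abstract above attributes diverging results to floating-point addition not being associative, and describes a fix that keeps the operation order fixed regardless of the core count. The following is a minimal Python sketch of that idea under stated assumptions. It is not the ReproRed implementation: the cores are only simulated as contiguous index ranges, and the names `chunk_bounds`, `naive_sum`, and `fixed_order_sum` are hypothetical helpers introduced for illustration.

```python
from bisect import bisect_right


def chunk_bounds(n, cores):
    """Split indices 0..n-1 into `cores` contiguous chunks, one per simulated core."""
    return [n * r // cores for r in range(cores + 1)]


def naive_sum(values, cores):
    """Each simulated core sums its chunk left to right; partial sums are then
    added in rank order. The association of additions depends on `cores`."""
    bounds = chunk_bounds(len(values), cores)
    total = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        partial = 0.0
        for v in values[lo:hi]:
            partial += v
        total += partial
    return total


def fixed_order_sum(values, cores):
    """Sum along one binary tree defined over element indices only.

    Returns (total, cross_core_additions). The core count changes how many
    additions combine operands held by different simulated cores, but never
    the order in which the additions are associated.
    """
    bounds = chunk_bounds(len(values), cores)
    crossings = 0

    def reduce(lo, hi):
        nonlocal crossings
        if hi - lo == 0:
            return 0.0
        if hi - lo == 1:
            return values[lo]
        mid = (lo + hi) // 2  # split point depends on indices, not on cores
        left, right = reduce(lo, mid), reduce(mid, hi)
        # Count additions whose operands live on different simulated cores.
        if bisect_right(bounds, lo) != bisect_right(bounds, hi - 1):
            crossings += 1
        return left + right

    return reduce(0, len(values)), crossings


# Toy input chosen so that rounding at magnitude 1e16 swallows the small terms.
values = [1e16, 1.0, -1e16, 1.0]
for cores in (1, 2, 4):
    print(cores, naive_sum(values, cores), fixed_order_sum(values, cores))
```

On this toy input the naive sum differs between one and two simulated cores, while the fixed-order sum returns the same value for every core count. That value is reproducible but not exact (the mathematically correct sum is 2.0), which mirrors the distinction between bit-reproducibility and accuracy.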
Are you looking for a good excuse to visit Crete? Join us for the EMBO satellite workshop on Biodiversity Genomics. Register for free via forms.gle/GRvPxCp2TnYd... Limited spots are available, first come, first served.
March 26, 2025 at 12:47 PM
Check out raxtax, our new open-source tool for taxonomic classification of barcoding sequences. It is 2.7-1000 times faster than competing tools and also implements fancy uncertainty scores: www.biorxiv.org/content/10.1...
raxtax: A k-mer-based non-Bayesian Taxonomic Classifier
Taxonomic classification in biodiversity studies is the process of assigning the anonymous sequences of a marker gene (barcode) to a specific lineage using a reference database that contains named seq...
www.biorxiv.org
March 19, 2025 at 8:25 AM
Are you analyzing genotype data via PCA or MDS, in particular including ancient DNA samples? Here is a novel easy-to-use tool to assess the stability of these analyses by bootstrapping the SNPs: academic.oup.com/bioinformati...
Pandora: A Tool to Estimate Dimensionality Reduction Stability of Genotype Data
Abstract. Motivation: Genotype datasets typically contain a large number of single nucleotide polymorphisms for a comparatively small number of individuals.
academic.oup.com
March 6, 2025 at 4:04 PM
The next edition of our LEGEND conference on Machine Learning for Evolutionary Genomics Data will take place in Aussois (French Alps) from Dec 8-12, 2025.

All practical information and a list of our keynote speakers are available at: legend2025.sciencesconf.org

We hope to meet you there again!
February 24, 2025 at 10:54 AM
By dynamically scaling the CPU clock frequency, our EcoFreq tool reduces your energy consumption and CO2 footprint by 15-18% at the cost of only a 10% throughput decrease. The tool is free of charge for academic use.

A short video explaining EcoFreq: youtu.be/cpw--Tsbib4
The EcoFreq Tool - compute with cleaner & cheaper energy
YouTube video by Alexandros Stamatakis
youtu.be
December 16, 2024 at 4:19 PM
News article about the phenomenon of specimen-drain from the global South to the global North by award-winning science journalist and good friend Vasiliki Michopoulou. Greek scientists were first and/or last authors on 3 out of 35 ancient DNA papers with Greek samples.
www.dnews.gr/eidhseis/sci...
After the brain drain, Greece is also experiencing a drain of specimens for scientific research - Dnews
Although several Greek scientific teams could lead research projects thanks to the country's wealth of archaeological and paleontological specimens, they are often limited to the role of the ...
www.dnews.gr
November 27, 2024 at 8:28 AM
Another new term we invented together with a colleague is specimen-drain: the export of valuable biodiversity, ancient DNA, or other samples to countries of the global North that have money to process them coupled with losing the lead on the respective research papers
November 14, 2024 at 6:48 AM
"To yield Greece more competitive, substantial increases of R&D expenditure and a long term strategic development plan are required such that the country becomes more than a tourist destination in the European periphery." - see our policy paper for more: www.frontiersin.org/journals/pol...
Frontiers | Necessary reforms in the Greek academic system
www.frontiersin.org
November 6, 2024 at 3:45 PM
Last reminder - applications for our 2025 school on computational molecular evolution in Crete close in one week - there will be no application deadline extensions.
meetings.embo.org/event/25-com...
Computational Molecular Evolution
The need for effective and informed analysis of biological data is increasing with the explosive growth of genomic data. A phylogenetic framework is central to many molecular evolutionary approaches …
meetings.embo.org
October 25, 2024 at 7:32 AM
Applications are still open for our 2025 school on computational molecular evolution in Crete. The application deadline is November 1st, only about 10 days from now.
meetings.embo.org/event/25-com...
October 21, 2024 at 6:56 AM
Do you want to rapidly predict bootstrap values via machine learning? You can now use our Educated Bootstrap Guesser: academic.oup.com/mbe/advance-...
Predicting Phylogenetic Bootstrap Values via Machine Learning
Abstract. Estimating the statistical robustness of the inferred tree(s) constitutes an integral part of most phylogenetic analyses. Commonly, one computes
academic.oup.com
October 20, 2024 at 5:51 AM
Before somebody else comes up with it, here is a new term I invented: brain-redrain

Definition: brains that return to their underdeveloped home country in the hope of helping to improve things, but then leave again out of frustration because nothing will ever change there.
October 19, 2024 at 6:47 AM
Applications are still open for our 2025 school on computational molecular evolution in Crete. The application deadline is November 1st, about 3 weeks from now. meetings.embo.org/event/25-com...
October 8, 2024 at 6:09 AM
Registrations are open for the 2025 school on computational molecular evolution in Crete. The application deadline is November 1st.
meetings.embo.org/event/25-com...
September 17, 2024 at 9:38 AM
Registrations are open for the 2025 school on computational molecular evolution in Crete: meetings.embo.org/event/25-com...
August 7, 2024 at 4:59 AM
Looking for an excuse to visit Greece and learn more about Evolutionary and Comparative Genomics? Join us in Nafplio in November: meetings.embo.org/event/24-gen...
Evolutionary and Comparative Genomics
The increasing availability of genomes and other “omes” for organisms across the tree of life has revolutionized the study of evolutionary processes. The comparative approach focused on different ele…
meetings.embo.org
July 9, 2024 at 8:53 AM
The majority of talks from our "legend2024 : Machine Learning for Evolutionary Genomics Data" conference in Crete are now available on YouTube: www.youtube.com/playlist?lis...
Legend 2024 : Machine Learning for Evolutionary Genomics Data - 3-15 May 2024 Heraklion (Greece)
Share your videos with friends, family, and the world
www.youtube.com
July 2, 2024 at 5:04 AM
Here is the prediction for the UEFA EURO 2024 elimination round, conducted at ICS-FORTH. Our PhD student Lucia computed 1 million predictions, considering team performance during the group phase only and taking qualifiers into account, based on this paper: link.springer.com/article/10.1...
June 27, 2024 at 12:42 PM