Igor Martayan
@imartayan.bsky.social
PhD student in algorithmic bioinformatics at @bonsaiseqbioinfo.bsky.social.
Interested in randomized algorithms and space-efficient data structures
https://igor.martayan.org
Pinned
I'm glad to announce that the simd-minimizers library is out! 🧬🖥️
@curiouscoding.nl and I have been optimizing the computation of minimizers down to the smallest detail.
The result is an order of magnitude faster than existing methods; processing an entire human genome takes only 4s on my laptop! 🧵
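
For readers who have not met minimizers before, here is a minimal sketch of the textbook definition the post refers to: in every window of `w` consecutive k-mers, keep the position of the smallest one. This is a naive illustration only. It is not the simd-minimizers algorithm or its API, the function name `minimizer_positions` is hypothetical, and it assumes plain lexicographic order with leftmost tie-breaking, where real tools typically order k-mers by a hash.

```python
def minimizer_positions(seq, k, w):
    """Return the start positions of the minimizers of `seq`.

    Naive O(n * w) version: for each window of `w` consecutive k-mers,
    pick the lexicographically smallest k-mer (leftmost on ties).
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    positions = []
    for start in range(len(kmers) - w + 1):
        # min() returns the first minimal item, so ties resolve to the left.
        best = min(range(start, start + w), key=lambda i: kmers[i])
        # Consecutive windows often share a minimizer; record it only once.
        if not positions or positions[-1] != best:
            positions.append(best)
    return positions


if __name__ == "__main__":
    print(minimizer_positions("ACGTACGT", k=3, w=2))
```

Because neighbouring windows tend to select the same k-mer, the output is a sparse sample of the sequence, which is what makes minimizers useful for indexing and sketching.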
Reposted by Igor Martayan
The simplex algorithm is super efficient. 80 years of experience says it runs in linear time. Nobody can explain _why_ it is so fast.

We invented a new algorithm analysis framework to find out.
Beyond Smoothed Analysis: Analyzing the Simplex Method by the Book
Narrowing the gap between theory and practice is a longstanding goal of the algorithm analysis community. To further progress our understanding of how algorithms work in practice, we propose a new alg...
arxiv.org
October 27, 2025 at 1:43 AM
Reposted by Igor Martayan
Really exciting that the preprint on Barbell, a new demultiplexer, is finally out!
It's the first tool that builds on Sassy, the approximate-DNA-searching tool that @rickbitloo.bsky.social and I developed earlier this year, specifically with this application in mind.
Around 10% of your Nanopore reads (SQK-RBK114) are incorrectly trimmed. Here is why, and how our new tool Barbell solves it:

www.biorxiv.org/content/10.1...

Want to get started? github.com/rickbeeloo/b...
October 23, 2025 at 9:28 PM
Reposted by Igor Martayan
1/6 Movi 2 is here: faster and more space-efficient for pangenome queries. Its fastest mode uses half the memory of Movi 1 while running ~30% faster. github.com/mohsenzakeri...
GitHub - mohsenzakeri/Movi: Fast, Cache-Efficient, and Scalable Queries on Pangenomes
October 21, 2025 at 8:00 PM
Reposted by Igor Martayan
Movi 2: Fast and Space-Efficient Queries on Pangenomes. #Pangenomes #SequenceQueries #Genomics #Bioinformatics @biorxiv-genomic.bsky.social 🧬 🖥️
www.biorxiv.org/content/10.1...
October 21, 2025 at 1:49 PM
Reposted by Igor Martayan
So what's the equivalent of `perf record && perf report` on a MacBook?

I want to see the generated assembly and which lines are hot.
October 11, 2025 at 1:48 PM
Reposted by Igor Martayan
It doesn't happen that often: an article published in Nature puts my community (sequence bioinformatics) in the spotlight. Shall I tell you about it?
www.nature.com/articles/d41...
‘Google for DNA’ brings order to biology’s big data
MetaGraph compresses vast data archives into a search engine for scientists, opening up new frontiers of biological discovery.
October 9, 2025 at 3:00 PM
Reposted by Igor Martayan
"OpenZL is our answer to the tension between the performance of format-specific compressors and the maintenance simplicity of a single executable binary."
engineering.fb.com/2025/10/06/d...
October 6, 2025 at 8:58 PM
Reposted by Igor Martayan
We are alarmed by reports that Germany is on the verge of a catastrophic about-face, reversing its longstanding and principled opposition to the EU’s Chat Control proposal which, if passed, could spell the end of the right to privacy in Europe. signal.org/blog/pdfs/ge...
October 3, 2025 at 4:14 PM
Reposted by Igor Martayan
#RECOMB2026 is now accepting submissions and we'd love to see your best work!

📌 Abstract registration: Nov 7, 2025
📌 Full paper submission: Nov 14, 2025

📜 More info: recomb.org/recomb2026/call_for_papers.html
RECOMB 2026 | CALL FOR PAPERS
October 2, 2025 at 12:00 PM
Reposted by Igor Martayan
🦒Long read giraffe is out!🦒
Mapping long reads to pangenome graphs is ~10x faster than with GraphAligner, with veeery slightly better mapping accuracy, short variant calling, and SV genotyping than GraphAligner or Minimap2
Rapid, accurate long- and short-read mapping to large pangenome graphs with vg Giraffe https://www.biorxiv.org/content/10.1101/2025.09.29.678807v1
October 2, 2025 at 6:28 AM
Reposted by Igor Martayan
Alice: fast and haplotype-aware assembly of high-fidelity reads based on MSR sketching https://www.biorxiv.org/content/10.1101/2025.09.29.679204v1
October 1, 2025 at 1:47 AM
Reposted by Igor Martayan
Looking for people to test the latest version of simd-sketch.

It's now 2x as fast at sketching, and supports skipping over kmers containing N and other ambiguous bases (which is only ~35% slower).

'cargo install simd-sketch' is right there under your fingertips ;)

github.com/RagnarGrootK...
GitHub - RagnarGrootKoerkamp/simd-sketch: Compute bottom-s sketches and s-buckets sketches, using simd-minimizers crate.
October 1, 2025 at 2:38 PM
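
As a rough illustration of what a bottom-s sketch is, the snippet below keeps the `s` smallest distinct k-mer hashes of a sequence and skips any k-mer containing `N` or another ambiguous base, as the post describes. It is a sketch under stated assumptions, not the simd-sketch implementation or its API: the function names are hypothetical, the hash (8-byte BLAKE2b) is an arbitrary choice, and canonical (reverse-complement) k-mers are not handled.

```python
import hashlib


def kmer_hash(kmer):
    """Deterministic 64-bit hash of a k-mer (arbitrary choice for this sketch)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_s_sketch(seq, k, s):
    """Return the s smallest distinct k-mer hashes, in increasing order."""
    hashes = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip k-mers containing N or any other non-ACGT character.
        if any(base not in "ACGT" for base in kmer):
            continue
        hashes.add(kmer_hash(kmer))
    return sorted(hashes)[:s]


def jaccard_estimate(sketch_a, sketch_b, s):
    """Estimate Jaccard similarity from two bottom-s sketches."""
    # Bottom-s sketch of the union, then count how much of it is shared.
    union = sorted(set(sketch_a) | set(sketch_b))[:s]
    if not union:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in union if h in shared) / len(union)
```

Two sequences can then be compared through their sketches alone, for example `jaccard_estimate(bottom_s_sketch(a, 21, 1000), bottom_s_sketch(b, 21, 1000), 1000)`, without revisiting the full k-mer sets.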
Reposted by Igor Martayan
There are millions of openly available microbial genomes, but searching them can be slow.

Until now 🥁

Introducing LexicMap, a new alignment tool that lets scientists search these data in minutes, helping track antibiotic resistance, trace outbreaks, and more.

www.ebi.ac.uk/about/news/r...
🦠
How to rapidly search the world’s microbial DNA
By making the world’s microbial DNA easier to explore, LexicMap helps researchers track outbreaks, study antibiotic resistance, and understand microbial diversity.
September 30, 2025 at 9:47 AM
Reposted by Igor Martayan
You can support my proposal asking the Cour des Comptes to examine public procurement contracts with travel agencies, particularly in higher education and research (ESR):
participationcitoyenne.ccomptes.fr/processes/co...
Public procurement contracts with travel agencies - Contributions - 2025 - Help us enrich our work programme - Cour des Comptes participation platform
Body of the contribution: Public administrations and operators, particularly those in higher education and research, use travel agencies (FCM, TravelPlanet…) for hotel...
September 25, 2025 at 10:28 AM
Reposted by Igor Martayan
#RECOMB2026 will be in Thessaloniki, Greece on May 26-29, 2026. Satellites on May 24-25. Save the date!

The #RECOMB2026 conference will take place in Thessaloniki on May 26-29, 2026. The satellite events will be held on May 24-25, 2026. Save the date!
September 26, 2025 at 3:03 PM
Reposted by Igor Martayan
Delighted to finally announce a preprint describing the Q100 project! “A complete diploid human genome benchmark for personalized genomics” For which we finished HG002 to near-perfect accuracy: www.biorxiv.org/content/10.1... 🧵[1/14]
A complete diploid human genome benchmark for personalized genomics
Human genome resequencing typically involves mapping reads to a reference genome to call variants; however, this approach suffers from both technical and reference biases, leaving many duplicated and ...
September 22, 2025 at 5:01 PM
Reposted by Igor Martayan
Critical part of the President's new $100,000 charge for H-1B visas: The Administration can also offer a $100,000 discount to any person, company, or industry that it wants. Replacing rules with arbitrary discretion.

Want visas? You know who to call and who to flatter.
September 20, 2025 at 1:40 PM
Reposted by Igor Martayan
Minimap2 is very much the hammer in
"When all you have is a hammer, everything looks like a nail."
September 16, 2025 at 8:18 PM
Reposted by Igor Martayan
Blogged about how zstd --long fills the gap between fast and slow-but-high-ratio genome compression methods log.bede.im/2025/09/12/z...
September 12, 2025 at 3:07 PM
Reposted by Igor Martayan
Sometimes you meet absolutely incredible bioinfo-magicians.
It was a huge privilege when @shenwei356.bsky.social joined our group for a year on an @embl.org sabbatical.
While here, he developed a new way of aligning to millions of bacteria, called LexicMap 1/n
www.nature.com/articles/s41...
Efficient sequence alignment against millions of prokaryotic genomes with LexicMap - Nature Biotechnology
LexicMap uses a fixed set of probes to efficiently query gene sequences for fast and low-memory alignment.
September 10, 2025 at 9:12 AM
Reposted by Igor Martayan
Preprint out for myloasm, our new nanopore / HiFi metagenome assembler!

Nanopore's getting accurate, but

1. Can this lead to better metagenome assemblies?
2. How, algorithmically, to leverage them?

with co-author Max Marin @mgmarin.bsky.social, supervised by Heng Li @lh3lh3.bsky.social

1 / N
High-resolution metagenome assembly for modern long reads with myloasm https://www.biorxiv.org/content/10.1101/2025.09.05.674543v1
September 7, 2025 at 11:35 PM
Reposted by Igor Martayan
A wonderful day for @bonsaiseqbioinfo.bsky.social, with @camillemrcht.bsky.social's and @npmalfoy.bsky.social's HDR defenses.
Congrats on the amazing work!

The team is very lucky to have you both!
September 4, 2025 at 4:40 PM
Reposted by Igor Martayan
We are glad to announce that the next workshop “Data Structures in Bioinformatics” (DSB 2026) will take place in Venice, Italy, on *February 18-19*, 2026. dsb-meeting.github.io/DSB2026/ Book the dates! #DSB26
DSB 2026 Venice - February 18-19
Workshop Data Structures in Bioinformatics
September 1, 2025 at 6:10 PM
Reposted by Igor Martayan
📣 Deacon 0.8.0 available on Bioconda
- Much faster search and depletion through improved work distribution on multicore systems. My fastq.gz benchmark now runs at 400Mbp/s on Apple M1.
- Dual default match thresholds for greater accuracy

Details: github.com/bede/deacon/...

1k downloads! 🐥
Release 0.8.0 · bede/deacon
Faster filtering on multicore systems through improved work allocation using the Paraseq library (@noamteyssier). Filtering at >1Gbp/s is possible with uncompressed long sequences, and >500Mbp/s is...
August 11, 2025 at 6:55 PM
Reposted by Igor Martayan
So, the heap I invented over the weekend was introduced as the quickheap by Navarro and Paredes around 2006!
Basically: on each pop, do just enough quicksort to find the smallest element.

My implementation (the 1st??) is 2x to 4x faster than d-ary and binary heaps.

curiouscoding.nl/posts/quickh...
August 11, 2025 at 11:32 PM
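
The one-line description above ("on each pop, do just enough quicksort to find the smallest element") can be sketched as follows. This is a minimal illustration of the quickheap idea, not the implementation from the linked post and not the exact layout of Navarro and Paredes: the class name `QuickHeap` is hypothetical, and the array is stored in decreasing order (an assumption made here) so that the minimum sits at the end and can be removed cheaply.

```python
import random


class QuickHeap:
    """Min-heap backed by a lazily, partially quicksorted array.

    Layout (decreasing): seg0 >= p0 >= seg1 >= p1 >= ... >= pk >= tail,
    where p0..pk are pivots fixed by earlier partitioning steps.
    """

    def __init__(self):
        self.a = []        # elements
        self.pivots = []   # indices of pivots, increasing (values decreasing)

    def __len__(self):
        return len(self.a)

    def push(self, x):
        a, pivots = self.a, self.pivots
        a.append(x)
        hole = len(a) - 1
        # Walk from the smallest pivot upward; each pivot smaller than x is
        # shifted one slot to the right to open a hole in the segment before it.
        i = len(pivots) - 1
        while i >= 0 and a[pivots[i]] < x:
            p = pivots[i]
            a[hole] = a[p + 1]   # first element after the pivot moves to the hole
            a[p + 1] = a[p]      # pivot moves right by one
            pivots[i] = p + 1
            hole = p
            i -= 1
        a[hole] = x

    def pop(self):
        a, pivots = self.a, self.pivots
        if not a:
            raise IndexError("pop from empty QuickHeap")
        while True:
            lo = pivots[-1] + 1 if pivots else 0
            hi = len(a) - 1
            if lo > hi:          # the smallest pivot is the last element
                break
            # One quicksort partition step on the unsorted tail a[lo..hi].
            r = random.randrange(lo, hi + 1)
            a[r], a[hi] = a[hi], a[r]
            pivot = a[hi]
            store = lo
            for j in range(lo, hi):
                if a[j] > pivot:         # larger elements go in front
                    a[j], a[store] = a[store], a[j]
                    store += 1
            a[store], a[hi] = a[hi], a[store]
            pivots.append(store)
        pivots.pop()
        return a.pop()
```

Each `pop` only partitions the tail that may still contain the minimum, and the pivots it leaves behind are reused by later pops. Note that this simple partition degrades to quadratic time on inputs with many equal keys; a three-way partition would be the usual remedy.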