Tomàs Montserrat Ayuso
@tmontsay.bsky.social
💻 PhD Student @ Functional Genomics Team, CNAG
🎓 Associate Lecturer in Bioinformatics @ UVic-UCC
📚 Author of biologydatascience.com
🔬 Exploring genomics, data science, and the frontiers of biology
🧵 10/
We hope this helps others working on retroviral domains, paleovirology, or TE functional genomics.

🙏 Thanks to @cnag-eu.bsky.social and @annaesteveco.bsky.social for support.

We welcome feedback, questions, or collaborations as we submit to a peer-reviewed journal.
July 31, 2025 at 6:21 AM
🧵 9/
🧰 All annotations are open-access:

- BED + FASTA files
- InterProScan + Phobius output
- Domain sequences & conservation scores
- Scripts on GitHub

🧬 Data: doi.org/10.5281/zeno...

💻 Code: github.com/funcgen/herv...
Domain-Level Annotations and Conservation Scores for Human Endogenous Retroviruses
💡Introduction:  This dataset provides a comprehensive, genome-wide annotation of conserved retroviral protein domains within human endogenous retroviruses (HERVs). Using a reproducible pipeline based ...
July 31, 2025 at 6:21 AM
🧵 8.2
🧠 Intriguingly, HERV activity has been linked to neurodegenerative diseases (ALS, MS, AD) and immune defense.

- HERV-encoded proteins may modulate immunity
- Some may even restrict exogenous viruses

Our dataset enables deeper exploration of these hypotheses.
July 31, 2025 at 6:21 AM
🧵 8.1
🧬 One famous case of HERV co-option is Syncytin, a retroviral Env protein now essential for placenta formation.

Could other HERV proteins—preserved across millions of years—also serve beneficial roles in the human host?
July 31, 2025 at 6:21 AM
🧵 8/
Why does this matter?

✔️ Residual protein function?
✔️ Host co-option?
✔️ Antiviral defense?
✔️ Role in development, immunity, neurodegeneration?

Our resource supports new lines of research into the functional potential of HERV proteins.
July 31, 2025 at 6:21 AM
🧵 7/
🔍 We also found 13 HERVK loci encoding Gag, Pol, and Env with strong domain conservation and intact 5′/3′ LTRs.

- Some domains even share the same ORF—suggesting fused or intact polyproteins.

- A few insert into human gene introns—potential regulatory effects?
July 31, 2025 at 6:21 AM
🧵 6/
💡 Subfamily patterns:

- HERVK: Full polyproteins across several loci

- HERVH: Conserves enzymatic domains but not structural ones

- HERVE: Unexpectedly retains protease & RT domains

Young and ancient families both preserve functional fragments!
July 31, 2025 at 6:21 AM
🧵 5/
🧠 3 examples:

- HERVK Env (99.5% coverage, fusion domains) chr5:156658763–156665917

- HERVH RNase H (DEDD motif) chr14:53129175–53135122

- HERVE protease (full-length) chr1:20154322–20160102

More details in the preprint!
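
For anyone who wants to look these loci up alongside the BED files, here is a minimal Python sketch that turns a `chrom:start–end` string like the ones above into a BED line. It assumes the posted coordinates are 1-based and inclusive (the thread does not say), and `region_to_bed` is a hypothetical helper, not part of the released scripts.

```python
import re

# Accept an ASCII hyphen or an en dash (U+2013) between start and end.
REGION = re.compile(r"^(chr[\w.]+):(\d+)[-\u2013](\d+)$")


def region_to_bed(region: str, name: str = ".") -> str:
    """Convert 'chrom:start-end' (assumed 1-based, inclusive) to a BED line.

    BED uses 0-based, half-open intervals, so only the start shifts by one.
    """
    match = REGION.match(region.replace(",", "").strip())
    if match is None:
        raise ValueError(f"unrecognised region: {region!r}")
    chrom = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in region: {region!r}")
    return f"{chrom}\t{start - 1}\t{end}\t{name}"


print(region_to_bed("chr5:156658763–156665917", "HERVK_Env"))
```

If the coordinates turn out to be 0-based already, drop the `start - 1` shift.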
July 31, 2025 at 6:21 AM
🧵 4/
💡 To our surprise, thousands of domains are highly conserved, with >1,000 showing nearly full alignment (>95% coverage).

We even recovered key catalytic motifs (e.g. DEDD in RNase H) and transmembrane regions in Env.

These are not just fossils—they may retain function.
July 31, 2025 at 6:21 AM
🧵 3/
💻 Using a reproducible pipeline (HMMER + InterProScan), we identified 17,540 retroviral domains—incl. Gag, RT, RNase H, protease, integrase, and Env.

We then quantified alignment coverage to assess structural conservation.
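
One common way to express alignment coverage is the fraction of a domain model's positions spanned by a hit. The Python sketch below illustrates that idea only: it assumes this definition, and the `DomainHit` fields and the example values are made up for illustration, not taken from the dataset or the pipeline code.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    name: str
    model_length: int  # number of positions in the domain model
    model_from: int    # first aligned model position (1-based, inclusive)
    model_to: int      # last aligned model position (1-based, inclusive)


def alignment_coverage(hit: DomainHit) -> float:
    """Fraction of the domain model spanned by the aligned region."""
    aligned = hit.model_to - hit.model_from + 1
    return aligned / hit.model_length


# Illustrative hits only; these are not real results.
hits = [
    DomainHit("domain_A", model_length=120, model_from=1, model_to=120),
    DomainHit("domain_B", model_length=200, model_from=51, model_to=150),
]

# Keep hits whose coverage exceeds a chosen cutoff.
near_full = [h for h in hits if alignment_coverage(h) > 0.95]

for h in hits:
    print(f"{h.name}: {alignment_coverage(h):.1%}")
```

With a 0.95 cutoff, only `domain_A` (coverage 1.0) passes, while `domain_B` (coverage 0.5) is filtered out.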
July 31, 2025 at 6:21 AM
🧵 2/
🧬 HERVs make up ~8% of the human genome.

Yet no systematic annotation of the protein domains within their internal sequences, with structural conservation quantified, has been released.

We analyzed >120,000 ORFs to address this gap.
July 31, 2025 at 6:21 AM