Götz-Henrik Wiegand
@ghwiegand.bsky.social
PhD Student in Natural Language Processing @ University of St. Gallen (HSG) and University of Konstanz
Got the #BestPaperAward yesterday at the #KDIR #IC3K conference in #Marbella for the paper:

"A Convexity-Dependent Two-Phase Training Algorithm for Deep Neural Networks"

Huge thanks to the team!

There will be an #arXiv version soon! Stay tuned!

#paper #HSG #LLM #Transformers #ML #StGallen
October 25, 2025 at 3:15 PM
Today I presented our paper "A Convexity-dependent Two-Phase Training Algorithm for Deep Neural Networks" at the #KDIR IC3K conference.

Thank you to the organizers for the great event so far!

Stay tuned for the blog post on our website 👀🤫
#AI #Optimization #ML
October 23, 2025 at 3:54 PM
Day one of the #KDIR #conference has started. Looking forward to interesting talks and papers around knowledge work.

#ai #llm #ontology #semanticWeb
October 22, 2025 at 7:59 AM
I am on my way to the #KDIR conference in Spain to present our latest #paper about our convexity-dependent two-phase training algorithm for deep neural networks.

We are nominated for the best student and best paper awards!

I am proud to present our DS-NLP Lab there!
October 21, 2025 at 10:49 AM
Reposted by Götz-Henrik Wiegand
🌍 Announcing SamudrACE, our AI climate emulator built so scientists & planners can run “what-if” climate experiments quickly. Traditional models are slow and costly; SamudrACE makes high-quality simulations fast & more accessible. 🧵
October 16, 2025 at 3:05 PM
Reposted by Götz-Henrik Wiegand
📣 Re-launch of a joint ELLIS Reading Group “Mathematics & Efficiency of Deep Learning”, affiliated with the ELLIS Unit Graz and co-organized by ELLIS Members Linara Adylova (🇩🇪 @ruhr-uni-bochum.de) and Olga Saukh (🇦🇹 @tugraz.bsky.social).

Learn more here: sites.google.com/view/efficie...
DLMath&Efficiency
This reading group examines the interplay between the theoretical foundations of deep learning and the practical challenge of making machine learning efficient. On the theory side, we study mathematic...
sites.google.com
September 11, 2025 at 1:32 PM
Reposted by Götz-Henrik Wiegand
you can run the new Apertus LLMs fully locally on your (mac) laptop with just 2 lines of code:

pip install mlx-lm
mlx_lm.generate --model swiss-ai/Apertus-8B-Instruct-2509 --prompt "wer bisch du?"

(make sure you have done huggingface-cli login before)
Apertus LLM - a swiss-ai Collection
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
September 5, 2025 at 9:31 PM
Finally, the results are all in and the plots are done.
Read the blog post about our benchmarking results for the #Apertus 8B Instruct model:

blog.nlp-lab.ai/2025/09/05/A...

Made with #lm_eval.

Would love to hear your thoughts here!
Great work @ethz.ch @icepfl.bsky.social @cscsch.bsky.social!
September 5, 2025 at 2:15 PM
Here is a blog post about the theory and idea of our paper „Integrating Attention into State Space Models“. The foundation for my #PhD and a step toward rethinking how we build #LLMs. It’s a less technical take on the ideas and motivation:

blog.nlp-lab.ai/2025/08/19/S...
Bridging Attention and State Space Models - A Systems Theory Perspective
Chair of Siegfried Handschuh | Data Science in Natural Language Processing. Chair of Siegfried Handschuh for Data Science in Natural Language Processing at the University of St. Gallen (HSG).
blog.nlp-lab.ai
September 4, 2025 at 7:17 PM
Reposted by Götz-Henrik Wiegand
Is this the first significant, ethical, truly open-source AI model... or just good marketing?

- 70B parameter open weights.
- 15T training tokens.
- Technical report containing exactly how they trained it and what data they used - truly open source and build-able (?).
- Multi-lingual.
- […]
Original post on fosstodon.org
fosstodon.org
September 3, 2025 at 8:17 AM
Reposted by Götz-Henrik Wiegand
EPFL, ETH Zurich, and CSCS released Apertus, Switzerland's first large-scale, multilingual language model (LLM). As a fully open LLM, it serves as a building block for developers and organizations to create their own applications: www.cscs.ch/science/comp... @ethz.ch #AI #Apertus #AIforGood
September 2, 2025 at 8:14 AM
Our research group had the opportunity to present our work-in-progress paper at the #SDS2025 conference:
Integrating the Attention Mechanism into State Space Models
This research forms one of the foundational pillars of my own PhD, and I’m proud to see it take shape in the wider research community.
June 29, 2025 at 4:01 PM