Lightnews — Scholar-powered news

Götz-Henrik Wiegand

@ghwiegand.bsky.social

9 followers 23 following 15 posts

PhD Student in Natural Language Processing @ University St. Gallen (HSG) and Uni-Konstanz

Pinned

Götz-Henrik Wiegand @ghwiegand.bsky.social · Sep 4

Here is a blog post about the theory and idea of our paper „Integrating Attention into State Space Models“. The foundation for my #PhD and a step toward rethinking how we build #LLMs. It’s a less technical take on the ideas and motivation:

blog.nlp-lab.ai/2025/08/19/S...

Bridging Attention and State Space Models - A Systems Theory Perspective

Chair of Siegfried Handschuh | Data Science in Natural Language Processing. Chair of Siegfried Handschuh for Data Science in Natural Language Processing at the University of St. Gallen (HSG).

blog.nlp-lab.ai

Götz-Henrik Wiegand

@ghwiegand.bsky.social

After the #BestPaperAward at the #KDIR #IC3K conference in Marbella last week, the #paper is now available on #arXiv!

Check it out!

arxiv.org/abs/2510.25366

A Convexity-dependent Two-Phase Training Algorithm for Deep Neural Networks

The key task of machine learning is to minimize the loss function that measures the model fit to the training data. The numerical methods to do this efficiently depend on the properties of the loss fu...

arxiv.org

October 31, 2025 at 8:28 AM

Götz-Henrik Wiegand

@ghwiegand.bsky.social

Got the #BestPaperAward yesterday at #KDIR #IC3K conference in #Marbella for the paper:

"A Convexity-Dependent Two-Phase Training Algorithm for Deep Neural Networks"

Huge thanks to the team!

There will be an #arXiv version soon! Stay tuned!

#paper #HSG #LLM #Transformers #ML #HSG #StGallen

Handshake and handover of the Award by Prof. Lars Nolle

Selfie in front of the INSTICC IC3K Roll-Up Banner.

October 25, 2025 at 3:15 PM

Götz-Henrik Wiegand

@ghwiegand.bsky.social

Today I presented our paper "A Convexity-dependent Two-Phase Training Algorithm for Deep Neural Networks" at the #KDIR IC3K conference.

Thank you to the organizers for the great event so far!

Stay tuned for our blog-post on our website 👀🤫
#AI #Optimization #ML

Me in front of the last slide of the paper presentation.

Take home message: Train Smarter - Not Harder!

October 23, 2025 at 3:54 PM

Götz-Henrik Wiegand

@ghwiegand.bsky.social

Day one of the #KDIR #conference has started. Looking forward to interesting talks and papers around knowledge work.

#ai #llm #ontology #semanticWeb #llm

October 22, 2025 at 7:59 AM

Götz-Henrik Wiegand

@ghwiegand.bsky.social

I am on my way to the #KDIR conference in Spain to present our latest #paper about our convexity-dependent two-phase training algorithm for deep neural networks.

We are nominated for best student and best paper award!

I am proud to present our DS-NLP Lab there!

October 21, 2025 at 10:49 AM

Reposted by Götz-Henrik Wiegand

Ai2

@ai2.bsky.social

🌍 Announcing SamudrACE, our AI climate emulator built so scientists & planners can run “what-if” climate experiments quickly. Traditional models are slow and costly; SamudrACE makes high-quality simulations fast & more accessible. 🧵

October 16, 2025 at 3:05 PM

Reposted by Götz-Henrik Wiegand

ELLIS

@ellis.eu

📣 Re-launch of a joint ELLIS Reading Group “Mathematics & Efficiency of Deep Learning”, affiliated with the ELLIS Unit Graz and co-organized by ELLIS Members Linara Adylova (🇩🇪 @ruhr-uni-bochum.de) and Olga Saukh (🇦🇹 @tugraz.bsky.social).

Learn more here: sites.google.com/view/efficie...

DLMath&Efficiency

This reading group examines the interplay between the theoretical foundations of deep learning and the practical challenge of making machine learning efficient. On the theory side, we study mathematic...

sites.google.com

September 11, 2025 at 1:32 PM

Reposted by Götz-Henrik Wiegand

Martin Jaggi

@mjaggi.bsky.social

you can run the new apertus LLMs fully locally on your (mac) laptop with just 2 lines of code:

pip install mlx-lm
mlx_lm.generate --model swiss-ai/Apertus-8B-Instruct-2509 --prompt "wer bisch du?"

(make sure you have done huggingface-cli login before)

Apertus LLM - a swiss-ai Collection

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

September 5, 2025 at 9:31 PM

Götz-Henrik Wiegand

@ghwiegand.bsky.social

Finally the results are all in and the plots are done.
Read the blog post about our benchmarking results of the #Apertus 8B Instruct model on

blog.nlp-lab.ai/2025/09/05/A...

Made with #lm_eval.

Would love to hear your thoughts here!
Great Work @ethz.ch @icepfl.bsky.social @cscsch.bsky.social !

September 5, 2025 at 2:15 PM

Götz-Henrik Wiegand

@ghwiegand.bsky.social

Bridging Attention and State Space Models - A Systems Theory Perspective

Chair of Siegfried Handschuh | Data Science in Natural Language Processing. Chair of Siegfried Handschuh for Data Science in Natural Language Processing at the University of St. Gallen (HSG).

blog.nlp-lab.ai

September 4, 2025 at 7:17 PM

Reposted by Götz-Henrik Wiegand

Vijay Prema

@vjprema.fosstodon.org.ap.brid.gy

Is this the first significant ethical truly open-source AI model.... or just good marketing?

- 70B parameter open weights.
- 15T training tokens.
- Technical report containing exactly how they trained it and what data they used - truly open source and build-able (?).
- Multi-lingual.
- […]

Original post on fosstodon.org

fosstodon.org

September 3, 2025 at 8:17 AM

Reposted by Götz-Henrik Wiegand

CSCS - Swiss National Supercomputing Centre

@cscsch.bsky.social

EPFL, ETH Zurich, and CSCS released Apertus, Switzerland's first large-scale, multilingual language model (LLM). As a fully open LLM, it serves as a building block for developers and organizations to create their own applications: www.cscs.ch/science/comp... @ethz.ch #AI #Apertus #AIforGood

September 2, 2025 at 8:14 AM

Götz-Henrik Wiegand

@ghwiegand.bsky.social

Our research group had the opportunity to present our work-in-progress paper at the #SDS2025 conference:
Integrating the Attention Mechanism into State Space Models
This research forms one of the foundational pillars of my own PhD, and I’m proud to see it take shape in the wider research community.

June 29, 2025 at 4:01 PM
