Twilio was fun, huh?
You may know me from that Slack we’re in together.
My job is to build an open source community around secure management of on-premises HPC.
OpenCHAMI.org
app.swapcard.com/event/isc-hi...
I’ll be at #ISC25 this week talking about OpenCHAMI.
Free and Open Source with a growing community.
openchami.org
We’re Honeycomb, the observability platform for teams who manage software that matters. Send any data to our one-of-a-kind data store, solve problems with all the relevant context, and fix issues before your customers find them.
Time to job launch?
Time to completion?
Mean time to job failure?
Time to snapshot recovery?
Cloud makes node loss a non-event. HPC typically doesn’t work that way.
I’ll start
Your five nines of uptime can go suck it. HPC workloads need 100% completion 100% of the time no matter how many nodes or network connections fail.
HPC is the OG king of resiliency.
Part of a collaboration between the universities of Bath, Bristol, Cardiff and Exeter, alongside partners HPE, NVIDIA and Arm, Isambard 3 will push the boundaries of science.
🔗 https://buff.ly/4g7HtMK
Last Name: Fuchs
First Initial: E
Inexplicable Extra Letter: X
That’s right: fuchsex was his official email address.
I often wonder why the X.
Why?
“The padlock doesn’t fit the hammer.”
https://buff.ly/41fBhho
#HPC
NASA has the original feed and does a good job of curation.
apod.nasa.gov/apod/
🥧 Maple Pumpkin
🥧 Bourbon Apple
I really like this post on the topic: rachelbythebay.com/w/2019/07/15...
I haven’t seen an SLO framework broadly adopted in HPC, but some sites track metrics like:
- % nodes up
- Scheduler RPC latency
- FS latency and BW
- Performance on standard benchmarks, either after maintenance or weekly
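The first of those metrics is easy to sketch. Here's a minimal, hypothetical illustration of computing "% nodes up" from Slurm-style `sinfo` summary lines (partition, state, node count); the state names and sample output are illustrative assumptions, not from any specific site.

```python
# Hypothetical sketch: "% nodes up" from Slurm-style sinfo summary output.
# The healthy-state set and sample lines below are assumptions for
# illustration, not a definitive site policy.

def percent_nodes_up(sinfo_lines):
    """Sum nodes in healthy states and divide by the total node count."""
    healthy = {"idle", "alloc", "mix"}
    up = total = 0
    for line in sinfo_lines:
        parts = line.split()
        if len(parts) != 3 or not parts[2].isdigit():
            continue  # skip the header row or malformed lines
        state, nodes = parts[1].rstrip("*"), int(parts[2])
        total += nodes
        if state in healthy:
            up += nodes
    return 100.0 * up / total if total else 0.0

sample = [
    "PARTITION STATE NODES",
    "compute idle 120",
    "compute alloc 70",
    "compute down 10",
]
print(percent_nodes_up(sample))  # 190 of 200 healthy -> 95.0
```

In practice you'd feed this the output of `sinfo -h -o "%P %t %D"` (or query the scheduler's API) and export the number to whatever dashboard the site already runs.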
Working on how to make it easier for others to use.
github.com/OpenCHAMI/ip...
Does anyone else have prior art for this that I couldn't find? Papers? Alternatives?
github.com/OpenCHAMI/ar...