Suvash Thapaliya
suva.sh
programming & etcéteras
I wonder if the folks at @anthropic.com were being somewhat jokingly intentional with the term MCP, given its "evil" Master Control Program origins (Tron, 1982). 😅
November 4, 2025 at 2:46 PM
How I wish that whole Bsky “how dare you train LLMs on our publicly available Bsky firehose dataset” pitchfork campaign didn’t happen months ago and @hf.co members and authors were still happily sharing their posts here. :/
October 30, 2025 at 4:51 PM
A bit embarrassing to admit, but another thing I've been rather late at using/understanding is WebAuthn (FIDO2) compared to U2F (FIDO). Not having used a FIDO2-compatible hardware key, I had sort of mentally bucketed them together.
But FIDO2 is a pretty solid improvement/extension over FIDO. TIL.
October 29, 2025 at 12:10 AM
Super late to the "Gigabit at home" party, but recently upgraded to it, and also moved from Synology to a Ubiquiti Router+AP setup, with Ethernet where possible.
Finally I can max out the line downloading these bulky models from HF & the Ollama store. 😅
October 28, 2025 at 3:32 PM
I've been using the same #GPG keys (master, sign., enc. & auth.) on a @yubico.com Yubikey 4 since 2017, (following the drduh guide) extending the expiry every X years.
I'm now considering creating new keys (esp. for RSA/4096 & ed25519) for a new #Yubikey 5.
How is everybody else going about this?
October 24, 2025 at 10:55 AM
Just finished reading “House of Huawei”. What a solid read; I'd absolutely recommend it to anyone trying to understand, and connect the dots forward to, where we are with the technology wars right now.
October 15, 2025 at 7:34 PM
Solid 10/10 post by Thorsten on letting the results of your LLM (HTTP) calls decide what function to execute next, in a loop, in a loop, ....
Finally did it. I wrote down how to build a code-editing agent.

In 315 lines of code. And yes, it works. Very well.

There is no moat.

Read it here: ampcode.com/how-to-build...
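The loop described above really is that small: send the conversation to the model, run whatever tool call comes back, append the result, repeat until the model answers. A minimal sketch with a stubbed-out model and invented tool names (none of this is from the actual 315-line post, just the control flow):

```python
# Minimal tool-calling agent loop with a stubbed model.
# A real agent would call an LLM HTTP API here; the canned stub keeps
# the focus on the loop itself: the model's reply decides what runs next.

def read_file(path: str) -> str:
    return f"<contents of {path}>"

TOOLS = {"read_file": read_file}

def stub_model(messages):
    # A real model returns either a tool call or a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "read_file", "args": {"path": "main.py"}}
    return {"answer": "Done: summarized main.py"}

def run_agent(user_prompt: str) -> str:
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        reply = stub_model(messages)
        if "answer" in reply:            # model decided it is finished
            return reply["answer"]
        tool = TOOLS[reply["tool"]]      # model decided what to execute next
        result = tool(**reply["args"])
        messages.append({"role": "tool", "content": result})

print(run_agent("Summarize main.py"))
```

Swap `stub_model` for a real chat-completion call and `TOOLS` for real file/shell helpers and you have the whole idea.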
April 21, 2025 at 3:32 PM
Rewatched the Game Theory video by @veritasium.bsky.social again today considering the turn of recent geopolitical events.

A good reminder that being nice, forgiving, clear while still retaliatory/provocable is a pretty good strategy in the majority of cases.
www.youtube.com/watch?v=mScp...
What Game Theory Reveals About Life, The Universe, and Everything
April 13, 2025 at 6:29 PM
Been meaning to use the latest Gemini (2.x series) models for a while now. Perfect timing by @strickvl.bsky.social with this set of practical examples on a standalone site.

The only thing that could have made this even better for me would be direct curl examples, but maybe I'm asking too much. 😅
Little weekend project: 'Gemini By Example' – practical examples for building with Google's AI models 🚀

I built this simple site to complement Google DeepMind's Gemini SDK with focused, practical code examples - inspired by the "Go By Example" approach.

geminibyexample.com
Gemini by Example
Learn the Gemini API through annotated examples
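For what it's worth, a direct curl call against the Gemini REST API looks roughly like this. The model name and prompt are placeholders, and the `generateContent` endpoint shape is the publicly documented one; check the current docs before relying on it:

```shell
# Hypothetical direct curl call to the Gemini API.
# Set GEMINI_API_KEY to actually send the request.
MODEL="gemini-2.0-flash"
URL="https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent"
BODY='{"contents":[{"parts":[{"text":"Say hello in Nepali."}]}]}'

if [ -n "${GEMINI_API_KEY:-}" ]; then
  curl -s "${URL}?key=${GEMINI_API_KEY}" \
    -H 'Content-Type: application/json' \
    -d "${BODY}"
else
  # No key set: print the request we would have sent.
  echo "POST ${URL}"
  echo "${BODY}"
fi
```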
April 7, 2025 at 4:13 PM
Last week I read a bunch of articles, all of which were a lot of fun to read slowly through. Not so heavy on social/sharing these days, but figured I'd share them here regardless.
March 24, 2025 at 1:28 PM
The Gemma3 family of models seems quite good for straight-up text extraction from images (those horrible PDFs).

Along those lines, I remember olmOCR from @ai2.bsky.social released just some weeks ago, based on Qwen VL models.

Curious if somebody is working on olmOCR pipeline with Gemma3 models.
March 13, 2025 at 4:27 PM
On this Saturday, I'm happily watching Veritasium episode on AlphaFold. I still think this is probably the highest impact model (in terms of opening up new future pathways for humanity) that has been made public in the last few years. Nothing else comes even close!
www.youtube.com/watch?v=P_fH...
The Most Useful Thing AI Has Done
February 15, 2025 at 12:32 PM
This is one of the first few posts I've seen that use a DeepSeek model to generate high-quality datasets, which can then be used to train ModernBERT models.

Really neat stuff! One can easily replace a slower, expensive 3rd-party LLM router with a fast, cheap & local model.
Why choose between strong #LLM reasoning and efficient models?

Use DeepSeek to generate high-quality training data, then distil that knowledge into ModernBERT for fast, efficient classification.

New blog post: danielvanstrien.xyz/posts/2025/d...
Distilling DeepSeek reasoning to ModernBERT classifiers
How can we use the reasoning ability of DeepSeek to generate synthetic labels for fine tuning a ModernBERT model?
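The recipe, reduced to its skeleton: have the big reasoning model label raw text, then fit a small fast classifier on those synthetic labels. A toy sketch, with the DeepSeek call stubbed out and a trivial bag-of-words model standing in for actual ModernBERT fine-tuning (everything here is illustrative):

```python
# Toy distillation sketch: a stubbed "teacher" produces synthetic labels,
# and a tiny word-count classifier plays the role of the fast student.
from collections import Counter

def teacher_label(text: str) -> str:
    # Stand-in for an expensive reasoning-model call (e.g. DeepSeek).
    return "code" if any(w in text for w in ("def", "import", "class")) else "prose"

def train_student(texts):
    # "Training": count which words co-occur with each synthetic label.
    word_counts = {"code": Counter(), "prose": Counter()}
    for t in texts:
        word_counts[teacher_label(t)].update(t.split())
    return word_counts

def student_predict(model, text: str) -> str:
    scores = {label: sum(c[w] for w in text.split()) for label, c in model.items()}
    return max(scores, key=scores.get)

corpus = [
    "import os",
    "def add(a, b): return a + b",
    "the weather is nice",
    "a lovely walk in the park",
]
student = train_student(corpus)
print(student_predict(student, "import sys"))  # classified without the teacher
```

The real pipeline swaps the stub for API calls and the Counter for a fine-tuned ModernBERT head, but the cost structure is the same: pay for the teacher once, run the student forever.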
January 29, 2025 at 11:45 PM
Reposted by Suvash Thapaliya
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

Follow along: github.com/huggingface/...
GitHub - huggingface/open-r1: Fully open reproduction of DeepSeek-R1
January 25, 2025 at 1:29 PM
One of the most fun things about using Open WebUI is how you can intersperse various models in the same conversation, depending on the next task you intend to work on.

Start off with `deepseek-r1`, and then follow up the conversation with something like `phi4` or `qwen2.5-coder`.
January 23, 2025 at 8:31 PM
Maybe a bit late to the club, but it’s pretty interesting to see the inner monologue when interacting with deepseek-r1.
January 23, 2025 at 4:45 PM
How are folks building the "memory" layer in their #llm augmented #workflows, esp. in the context of multi-step workflows?
Most of the simple examples I've seen share the memory across all steps, but this doesn't seem "clean". Reminds me of workflow steps all reading/writing from the same blob storage.
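One pattern for avoiding the shared-blob feel: have each step declare what it reads and writes, and let the orchestrator pass only the declared keys along. A rough sketch of the idea (step names and keys are invented):

```python
# Scoped step memory: each step declares its inputs and outputs, and the
# orchestrator enforces the contract instead of exposing one mutable blob.

def extract(view):
    # Would normally call an LLM over view["document"].
    return {"entities": ["ACME Corp"]}

def summarize(view):
    return {"summary": f"Mentions: {', '.join(view['entities'])}"}

STEPS = [
    # (function, keys it may read, keys it may write)
    (extract, {"document"}, {"entities"}),
    (summarize, {"entities"}, {"summary"}),
]

def run(pipeline, initial):
    memory = dict(initial)
    for fn, reads, writes in pipeline:
        view = {k: memory[k] for k in reads}   # step sees only its inputs
        out = fn(view)
        assert set(out) <= writes, f"{fn.__name__} wrote undeclared keys"
        memory.update(out)
    return memory

result = run(STEPS, {"document": "ACME Corp announced earnings."})
print(result["summary"])
```

The declared read/write sets also give you a dependency graph for free, which helps when steps need to be reordered or run in parallel.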
January 10, 2025 at 2:35 PM
From my own experience building LLM-augmented systems this year, I've been telling a lot of my colleagues that most use cases don't really need an "agent". Instead, they need a well-defined product workflow with models (LLMs etc.) in the mix.

It's reassuring to read the recent @anthropic.com blog...
December 21, 2024 at 8:09 PM
Huh! Maybe finally time to switch to my own domain, so I don't have to create another account just to "squat" my original username here.
Bluesky @bsky.app · Dec 19
📢 App Version 1.96 is rolling out now (1/6)

In this release: a notifications Mentions tab, reserving your default username when you verify your account with a domain, and other improvements!
December 21, 2024 at 5:34 PM
Over this year, I've done a lot of Phoenix LiveView (@elixir-lang.org) development using @zed.dev for the majority of it, with a good dose of the @anthropic.com Sonnet-3.5 assistant.

That being said, I still haven't "Signed In" to Zed! Not sure I'll do that anytime soon, but what am I missing out on?
December 20, 2024 at 11:21 AM
Reposted by Suvash Thapaliya
I'll get straight to the point.

We trained 2 new models. Like BERT, but modern. ModernBERT.

Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.

It's much faster, more accurate, longer context, and more useful. 🧵
December 19, 2024 at 4:45 PM
Given the new announcement of the Willow #quantum chip, I went ahead and rewatched this introductory lecture after a couple of years. Love the energy and the "no-nonsense, let's build up from the basics" approach by the presenter.

Any follow-up recommendations?

www.microsoft.com/en-us/resear...
Quantum Computing for Computer Scientists - Microsoft Research
This talk discards hand-wavy pop-science metaphors and answers a simple question: from a computer science perspective, how can a quantum computer outperform a classical computer? Attendees will learn ...
December 10, 2024 at 8:38 PM
@shrite.sh @samrat.me 👋 Nice seeing you here!
November 26, 2024 at 12:10 PM
Looks like a lot more activity here these days. Not sure where/how everybody's finding people here, but 👋 to everybody who sees this.

And yes, I've heard about starter packs and automated feeds, haven't quite used them yet, what else have I missed?
November 16, 2024 at 1:15 PM