Hersh Gupta
hershgupta.com
@hershgupta.com
Lead Applied Scientist, Responsible AI @BCGX | @bostonu.bsky.social alum | Data, AI, and strategy enthusiast | Open-source contributor

Opinions are my own

#bikeboston #coys

📍DC -> BOS
Claude catches on quick
February 28, 2025 at 7:37 PM
An X engineer posted this output from Grok to demonstrate how "good" their LLM is (CW: racism)
February 24, 2025 at 5:40 PM
Basically, if you ask most LLMs for confidence scores, they'll just tell you they're super confident every time.
February 8, 2025 at 2:25 PM
This is the ridiculously long prompt the researchers had to use for 4o to get a *minimum* 7% deviation from empirical accuracy.
February 8, 2025 at 2:25 PM
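For anyone wondering what "deviation from empirical accuracy" can look like in practice, here is a minimal sketch. The posts above don't say which metric the researchers used, so both functions below (`calibration_gap` and `expected_calibration_error`) are illustrative assumptions, not the paper's method. The idea: compare the confidence a model states with how often it is actually right.

```python
def calibration_gap(confidences, correct):
    """Mean stated confidence minus empirical accuracy (positive = overconfident)."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("need two non-empty sequences of equal length")
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_conf - accuracy


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |confidence - accuracy| over equal-width confidence bins."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("need two non-empty sequences of equal length")
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Confidence 1.0 would index past the end, so clamp it into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        acc = sum(ok for _, ok in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - acc)
    return ece


# Hypothetical data: the model says 95% every time but is right 6 times out of 10.
stated = [0.95] * 10
was_right = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
gap = calibration_gap(stated, was_right)
ece = expected_calibration_error(stated, was_right)
```

A model that says "super confident" on every answer ends up with a large positive gap whenever its real accuracy is lower, which is the failure mode described above.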
Anyway, having a simple grammar of data manipulation is something that both SQL and dplyr get right
January 29, 2025 at 5:23 PM
middle managers who've never written a single line of code or built an ml model before
January 25, 2025 at 4:11 PM
I gave deepseek-r1 (q8_0) a math problem and it got there after 10 minutes of non-stop trial and error
January 22, 2025 at 4:09 AM
Researchers: this is _not_ how you evaluate LLMs

www.nature.com/articles/s41...
January 17, 2025 at 12:55 AM
How was this allowed to be published in Nature?
January 17, 2025 at 12:52 AM
I'm not sure if this is the case for newer doctors anymore! My partner was studying for the US medical licensing exam last year and I was surprised to see how many research and social science questions were asked in practice exams
January 14, 2025 at 3:39 AM
@pahlkadot.bsky.social's observations about hiring in government match my own experience and this Odd Lots episode is a great listen, but I'm not sure who at Bloomberg was responsible for the overly editorialized title found on their website
January 13, 2025 at 9:30 PM
Maybe it's too early to tell, but AMD seems to have missed the opportunity to bifurcate AI prosumers from gamers with something similar to Nvidia's Digits: the Ryzen AI Max Pro+ seems undercooked in comparison to the GB10
January 8, 2025 at 1:55 AM
Massachusetts should also implement automated enforcement on buses - when DC did it, the immediate retributive effect and efficiency gains encouraged me to take the bus more frequently
December 23, 2024 at 2:41 PM
I only just found out that DSPy has an image adapter implementation for vision models??

This kind of functionality is exactly what I needed, but not a mention of it on the dspy.ai website?
December 13, 2024 at 11:47 PM
Can't forget the Polybahn! The funicular that saves you the climb from the main street to the picturesque university hilltop. Zürich's transit options are speedier and more convenient than those of any US city imo
December 10, 2024 at 5:01 PM
December 7, 2024 at 6:00 PM
"Why are all these legacy software companies in the top right?"
December 7, 2024 at 12:48 PM
@cloudflare.social It'd be great to have an llms.txt (llmstxt.org) on your docs page:
December 6, 2024 at 8:45 PM
managerial class, take note - this is a signal that an AI company is legit:
November 29, 2024 at 3:26 AM
you know the devs are cracked when the company landing page looks like this
November 29, 2024 at 3:17 AM
Did Claude hint at an Anthropic VS Code extension or was it just hallucinating? (probably the latter, but 🤞🏾)
November 25, 2024 at 9:14 PM
The algorithm it uses is worth a read: smoores.gitlab.io/storyteller/...
November 25, 2024 at 9:02 PM
llava 1.6 mistral 7b does pretty decently at guess the fruit:
November 24, 2024 at 5:27 AM