watson Ξ
watsonix.bsky.social
Founder @ Arboreal AI 🌳✨
Rooting well-being through small, gentle changes in physiology, emotion, and behavior.
Previously ran AI/ML for Headspace & Ginger.
Cacao enthusiast, lover of nature and dance.
http://watsonxi.org
Arboreal.ai
🏡 Oakland, Bay Area
Cloud cover map... digital.weather.gov?zoom=7&lat=3... (Note you need to scroll to the right time)
National Weather Service - Graphical Forecast
digital.weather.gov
March 13, 2025 at 7:13 AM
Bunch of eclipse timing info... www.timeanddate.com/eclipse/in/u... (put in your location if out of PDT)
Eclipses visible in San Francisco, California, USA
Which upcoming lunar and solar eclipses are visible in San Francisco, California, USA, and what do they look like?
www.timeanddate.com
March 13, 2025 at 7:12 AM
Seems pretty reasonable to me for there to be extensive trees of selection and mixture. Seems hard to have optimality on the key metrics that matter (speed, cost, quality) otherwise

Also in some sense, all neural architectures by their very nature are mixtures of experts! So... Yes :)
February 13, 2025 at 1:50 AM
I believe this would have to assume that such a product would be perfect and that workers that use it would be perfect. Otherwise, companies want support, customization, additions etc. The same reasons that complex open source software can still make money.
February 12, 2025 at 5:36 PM
The immediately adjacent example of open source software would seem to call this statement into question. Where does the "extremely unlikely" part come from with that taken into account?
February 12, 2025 at 5:26 PM
Parea is a solution that is workable but still young. We've been using it with medium success.

Langfuse is something else I have been considering: langfuse.com/docs/dataset...
Datasets & Experiments - Langfuse
Use Langfuse Datasets to create structured experiments to test and benchmark LLM applications.
langfuse.com
February 5, 2025 at 7:07 AM
Yeah it seems to be a weird blind spot for the industry. Weird because everyone building anything with LLMs that needs defined, repeatable behavior should need this more than Python developers need pytest. Otherwise you wind up with regression whack-a-mole.
February 5, 2025 at 6:58 AM
@simonwillison.net do you have a favorite way to do pytest-like LLM evals? Deterministic functions and/or LLM-as-judge
February 5, 2025 at 5:31 AM
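For anyone wondering what the "deterministic functions" half of that question could look like, here is a minimal sketch in pytest style. Everything in it is an assumption for illustration: `fake_llm` is a hypothetical stand-in for a real model call, and `contains_all` and `is_json_with_key` are made-up helper names, not part of pytest, Parea, or Langfuse. An LLM-as-judge check could be slotted in as one more function that returns a bool, but it is not shown here.

```python
import json


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; returns canned text."""
    canned = {
        "Summarize: The cat sat on the mat.": "A cat sat on a mat.",
        "Reply with a JSON object containing the key 'mood'.": '{"mood": "calm"}',
    }
    return canned.get(prompt, "")


def contains_all(output: str, required: list[str]) -> bool:
    """Deterministic check: every required term appears (case-insensitive)."""
    lowered = output.lower()
    return all(term.lower() in lowered for term in required)


def is_json_with_key(output: str, key: str) -> bool:
    """Deterministic check: output parses as a JSON object containing `key`."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and key in data


# pytest collects functions named test_*; each one pins down a behavior
# so a prompt or model change that breaks it shows up as a failing test.
def test_summary_mentions_subject():
    out = fake_llm("Summarize: The cat sat on the mat.")
    assert contains_all(out, ["cat", "mat"])


def test_json_reply_has_mood_key():
    out = fake_llm("Reply with a JSON object containing the key 'mood'.")
    assert is_json_with_key(out, "mood")
```

Saved as a file such as `test_evals.py` (name assumed), this would run with plain `pytest`. Swapping `fake_llm` for a real client keeps the checks unchanged, which is the point of keeping them deterministic.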
@5calls.org is making it easy for us to get in touch with reps and push them to stand up
February 2, 2025 at 8:47 PM
Is there a reason there's an email wall before I can get any useful info? I'm sure that turns a lot of ppl away. 5calls.org mentioned above seems lower friction
February 2, 2025 at 6:41 PM
My understanding is you can't. But apparently you can use a pin 📌 emoji in a comment, combined with a plugin or add-on of some kind. Too much for me

Of course you can always copy out the link
January 28, 2025 at 8:21 AM
💯

Seems like a huge opportunity for a group to come in and build FB lite on top of the AT protocol, even if they want to keep BSky an X clone
January 23, 2025 at 7:21 AM
v cool Carson. Congrats. I'll be following this
January 22, 2025 at 7:18 AM
I'm compiling my own list here, with yours truly included by pure nepotism

bsky.app/profile/did:...
January 21, 2025 at 7:16 AM
Watch Duty is well built and wonderful for alerting you to fire risk nearby and then providing you with the most detailed, up-to-date information. Kudos to you @lukas.blue
January 12, 2025 at 3:17 AM