anmorgan.bsky.social
@anmorgan.bsky.social
Reposted
Our video with the Google Partners team is live 🎉

Proud to be part of the ISV Startup Springboard program and officially available as a marketplace offering on GCP.

Big thanks to Google Cloud + our team at Comet for making this happen 🙌

🎥 Watch the full video www.youtube.com/watch?v=fdhj...
Run LLMs Better, Faster, Safer: Meet Opik on Google Cloud
YouTube video by Google Cloud
August 19, 2025 at 3:06 PM
Reposted
🧵When building mental health apps powered by GenAI, evaluation is essential.

We’re proud to see Opik supporting Mirror, a free, research-backed app from the Child Mind Institute that helps teens manage stress & anxiety.
April 4, 2025 at 6:29 PM
🚀 How can we detect LLM hallucinations—without external tools or model intrinsics?

SelfCheckGPT is a zero-resource, reference-free evaluation approach that analyzes the consistency of multiple responses sampled from the same model. Let’s break it down. 🧵👇 (1/11)
March 27, 2025 at 4:15 PM
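The consistency idea in the post above can be sketched in a few lines. This is a rough illustration, not SelfCheckGPT's actual scoring: it assumes a simple token-overlap proxy in place of the stronger scorers the method uses, and the function names and example texts are hypothetical. Each sentence of the main answer is compared against extra samples from the same model, and sentences the samples do not back up get a higher score.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(sentence: str, sample: str) -> float:
    """Fraction of the sentence's tokens that also appear in one sample."""
    sent_tokens = tokenize(sentence)
    if not sent_tokens:
        return 1.0
    return len(sent_tokens & tokenize(sample)) / len(sent_tokens)


def inconsistency_scores(answer_sentences: list[str], samples: list[str]) -> list[float]:
    """Per-sentence score in [0, 1]; higher means less supported by the samples."""
    if not samples:
        raise ValueError("need at least one sampled response")
    scores = []
    for sentence in answer_sentences:
        mean_support = sum(support(sentence, s) for s in samples) / len(samples)
        scores.append(1.0 - mean_support)
    return scores


# Hypothetical data: one main answer split into sentences, plus three
# additional responses sampled from the same model for the same prompt.
sentences = [
    "Paris is the capital of France.",
    "It was founded in 1802 by Napoleon.",
]
samples = [
    "Paris is the capital of France and its largest city.",
    "The capital of France is Paris.",
    "France's capital, Paris, sits on the Seine.",
]
scores = inconsistency_scores(sentences, samples)
```

The second sentence shares no tokens with any sample, so it scores higher than the first and would be the one to flag for review. Token overlap misses paraphrases, so a real setup would likely swap in a semantic scorer.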
Single-model evaluations can be biased and inconsistent. LLM Juries, which use multiple models for assessment, offer a more reliable alternative—reducing bias and improving robustness. 🧵 (1/6)
February 24, 2025 at 6:31 PM
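A minimal sketch of the jury idea from the post above, assuming each judge is a function that returns a "pass" or "fail" label. The judges below are hypothetical stand-ins for calls to separate models, and majority voting is only one simple way to combine their verdicts.

```python
from collections import Counter
from typing import Callable

# A judge takes (question, answer) and returns a label such as "pass" or "fail".
Judge = Callable[[str, str], str]


def jury_verdict(question: str, answer: str, judges: list[Judge]) -> tuple[str, dict[str, int]]:
    """Collect one vote per judge and return the majority label plus the tally."""
    if not judges:
        raise ValueError("need at least one judge")
    counts = Counter(judge(question, answer) for judge in judges)
    top_count = max(counts.values())
    leaders = [label for label, n in counts.items() if n == top_count]
    # Report a tie explicitly instead of picking a winner arbitrarily.
    verdict = leaders[0] if len(leaders) == 1 else "tie"
    return verdict, dict(counts)


# Hypothetical stub judges; in practice each would wrap a different model.
def strict_judge(question: str, answer: str) -> str:
    return "pass" if "Paris" in answer and len(answer) < 40 else "fail"


def lenient_judge(question: str, answer: str) -> str:
    return "pass" if "Paris" in answer else "fail"


def keyword_judge(question: str, answer: str) -> str:
    return "pass" if "capital" in answer.lower() else "fail"


verdict, tally = jury_verdict(
    "What is the capital of France?",
    "Paris is the capital of France, a city on the Seine.",
    [strict_judge, lenient_judge, keyword_judge],
)
```

Here the strict judge votes "fail" because of the answer's length while the other two vote "pass", so the jury still returns "pass". An odd number of judges avoids most ties on binary labels.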
LLM-as-a-judge evaluators may seem simple on the surface, but implementing them in real-world applications is challenging.

Evaluating multiple metrics often means separate pipelines that must be combined.

G-Eval simplifies this process 🧵 (1/4)
January 31, 2025 at 4:31 PM