Deepak Mallampalli
@deepkmal.bsky.social
Experimenting with AI in law
Reposted by Deepak Mallampalli
Evaluating LLM output is hard. For many teams, it's the bottleneck to scaling AI-powered products.

A key mistake is defining eval criteria w/o actually LOOKING AT THE DATA. This leads to irrelevant / unrealistic criteria + lots of wasted effort.

Thus I built AlignEval.com
AlignEval: Upload, Label, Evaluate, Optimize
A prototype tool to help you label data, evaluate output, and optimize prompts.
October 31, 2024 at 2:11 AM