At wandb, we've spent years thinking about experiment comparison. We've added new tools for LLM app dev: code, prompts, models, configs, outputs, eval metrics, eval predictions, eval scores...
wandb.me/weave
And they nonchalantly said "I'll write it in Redstone", to which I almost let loose a chuckle until...