Thinks about airplanes, data, markets, math, my next meal
sethkim.me/l/
You might expect both an LLM and a human annotator to get a handful of data labeling tasks wrong, but route each label through a verifier/adversarial LLM and you'll likely approach ~100% accuracy.
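A back-of-the-envelope way to see why: if the labeler and the verifier err independently, an error only survives when both fail on the same item, so disagreements get escalated and the combined error rate shrinks roughly multiplicatively. A minimal Monte Carlo sketch, with a coin-flip stand-in for the LLM labeler (the error rate and the one-pass escalation are assumptions, not anyone's measured pipeline):

```python
import random

random.seed(0)

def noisy_label(truth: str, error_rate: float) -> str:
    # Hypothetical stand-in for an LLM labeler: flips a binary
    # label with probability `error_rate`.
    if random.random() < error_rate:
        return "neg" if truth == "pos" else "pos"
    return truth

def label_with_verifier(truth: str, error_rate: float) -> str:
    # Labeler proposes; an independent adversarial verifier re-labels.
    # On disagreement, escalate (modeled here as one tie-breaking pass).
    proposed = noisy_label(truth, error_rate)
    check = noisy_label(truth, error_rate)
    if proposed == check:
        return proposed
    return noisy_label(truth, error_rate)  # escalation / re-adjudication

N, ERR = 100_000, 0.05  # 5% solo error rate, assumed for illustration
truths = [random.choice(["pos", "neg"]) for _ in range(N)]

solo = sum(noisy_label(t, ERR) == t for t in truths) / N
verified = sum(label_with_verifier(t, ERR) == t for t in truths) / N
print(f"solo labeler accuracy:       {solo:.4f}")
print(f"labeler + verifier accuracy: {verified:.4f}")
```

With a 5% independent error rate, a surviving error needs both passes wrong (about 0.25% of items), so the verified accuracy lands near 99%+ while the solo labeler stays near 95%. Real LLM errors are correlated, so the gain in practice is smaller than this independence model suggests.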