gabrielchua.me
this work is part of a broader suite of work on responsible AI here at GovTech - would love to chat if you're in this space.
These classifiers are:
- fast ⚡
- accurate & give well-calibrated probabilities ⚖️ (so that we can have differentiated responses)
- zero-shot 🔎 (i.e., teams can use this out of the box)
huggingface.co/collections/...
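A minimal sketch of what "differentiated responses" could look like once a classifier returns a well-calibrated probability. The function name `route_response` and both thresholds are hypothetical and only illustrate the idea; they are not taken from the released classifiers.

```python
def route_response(p_off_topic: float,
                   block_at: float = 0.9,
                   warn_at: float = 0.5) -> str:
    """Map a calibrated off-topic probability to an action.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if not 0.0 <= p_off_topic <= 1.0:
        raise ValueError("p_off_topic must be a probability in [0, 1]")
    if p_off_topic >= block_at:
        return "block"   # confident the prompt is off-topic: refuse
    if p_off_topic >= warn_at:
        return "warn"    # uncertain: answer cautiously or ask to rephrase
    return "allow"       # likely on-topic: pass through to the model


# Example usage with made-up scores
for score in (0.97, 0.62, 0.08):
    print(score, route_response(score))
```

Because the probabilities are calibrated, a team could tune `block_at` and `warn_at` to its own tolerance for false positives without retraining anything.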
The goal is to classify whether a user prompt is irrelevant with respect to the system prompt. 🎯
⚠️ High false-positive rates
⚠️ Poor adaptability to new misuse types
⚠️ Require real-world data, which is often unavailable during pre-production