Harit Vishwakarma
harit7.bsky.social
Ph.D. Candidate at UW-Madison

https://harit7.github.io/
Our method learns confidence functions tailored for efficient and reliable auto-labeling. Using these in TBAL boosts the number of auto-labeled points by up to 60% (while keeping auto-labeling error below 5%) compared to baselines like softmax and several training-time and post-hoc calibration techniques.
December 11, 2024 at 5:53 PM
Introducing Colander, our framework for learning optimal confidence functions for TBAL! We formulate the auto-labeling objective as an optimization problem over the space of confidence functions and thresholds.
December 11, 2024 at 5:53 PM
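One way to write such an objective down, as a rough sketch rather than the exact formulation used in Colander: fix a trained classifier h, and let g be a confidence function, t a threshold, and ε the error tolerance. The symbols and the single shared threshold are assumptions made here for illustration.

```latex
\max_{g,\,t}\; \Pr\big(g(x) \ge t\big)
\quad \text{subject to} \quad
\Pr\big(h(x) \ne y \,\big|\, g(x) \ge t\big) \le \epsilon
```

The first term is the coverage (the fraction of points that get auto-labeled) and the constraint bounds the error among those auto-labeled points.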
We systematically study the limitations of popular confidence functions like softmax outputs and off-the-shelf calibration techniques. The result? Too few auto-labeled points or large auto-labeling errors.
December 11, 2024 at 5:53 PM
Threshold-based auto-labeling (TBAL) is a promising auto-labeling technique. It iteratively acquires human labels for small data chunks, trains a model, and auto-labels points where the model's confidence is above a threshold. The goal? Maximize coverage (proportion of auto-labeled points) with bounded auto-labeling error.
December 11, 2024 at 5:53 PM
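The thresholding step of a single TBAL round can be sketched roughly as below. This is a minimal illustration, not the implementation from the paper: the function names (`select_threshold`, `auto_label_round`) are hypothetical, the threshold is chosen from plain empirical error on human-labeled validation points (with no confidence-bound correction), and model training and human-label acquisition are left out.

```python
import numpy as np

def select_threshold(conf_val, correct_val, max_error):
    """Return the smallest threshold t such that validation points with
    confidence >= t have empirical error <= max_error (inf if none does).
    Assumption: plain empirical error, no statistical safety margin."""
    for t in np.sort(np.unique(conf_val)):
        mask = conf_val >= t
        if 1.0 - correct_val[mask].mean() <= max_error:
            return float(t)
    return float("inf")

def auto_label_round(conf_val, correct_val, conf_pool, pred_pool, max_error):
    """One thresholding step: pick t on validation data, then auto-label
    the unlabeled pool points whose confidence clears t."""
    t = select_threshold(conf_val, correct_val, max_error)
    mask = conf_pool >= t
    coverage = float(mask.mean())  # fraction of the pool that was auto-labeled
    return t, mask, pred_pool[mask], coverage

# Toy validation set: confidence scores and whether the model was correct.
conf_val = np.array([0.95, 0.90, 0.80, 0.60, 0.55])
correct_val = np.array([1.0, 1.0, 1.0, 0.0, 1.0])

# Toy unlabeled pool: confidence scores and predicted labels.
conf_pool = np.array([0.99, 0.85, 0.70, 0.30])
pred_pool = np.array([1, 0, 1, 0])

t, mask, labels, coverage = auto_label_round(
    conf_val, correct_val, conf_pool, pred_pool, max_error=0.05
)
print(t, labels, coverage)
```

In a full loop, the points left below the threshold would go back into the pool, a new chunk would be sent for human labels, the model would be retrained, and the step would repeat. The confidence scores fed in here are where a learned confidence function would replace raw softmax outputs.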
Excited to present Colander at #NeurIPS2024, our new framework for optimizing confidence functions to make auto-labeling more efficient and reliable. Check out our poster #1906 at today's evening poster session.

Wed, Dec 11, 4:30–7:30 p.m., Poster #1906

Project: harit7.github.io/colander
December 11, 2024 at 5:53 PM