- Heavily over-weight ratings
- Over-weight cheaper items when ratings are matched
- Are swayed by trivial order effects
- Fall for simple nudges (e.g. “Best seller”)
These are systematic, often large effects.
🧵5/9
Rather, they are strongly biased by these cues. We found agents are often 3-10x+ more susceptible to nudges and superficial attribute differences than our human baseline.
🧵4/9
Current agent evals mostly measure competence but miss behavior: are agents' decisions stable, rational, manipulable, human-like?
We introduce ABxLAB, a framework for studying agent behavior. Using it, we create an agentic consumer behavior benchmark.
🧵1/9
@nikhilsinghmus.bsky.social
Our method learns useful audio representations with randomly synthesized sounds (often better than real data!)
🌐Project: doppelgangers.media.mit.edu
📄Paper: arxiv.org/abs/2406.05923
🧵1/3
w/ Nikhil Singh* (@nikhilsinghmus.bsky.social) and Pattie Maes
🔗 openreview.net/forum?id=chb...
🧵 1/3