Josef Woldense
woldense.bsky.social
josefwoldense.com
Take pairs where one of the agents has a preference of 1. Next, take pairs where one of the agents has a preference of 5. Now compare them. You can see that pairs with a 1 have lower agreement scores than pairs with a 5. This holds across preference gaps.
September 8, 2025 at 7:03 PM
Our estimate suggests that the suppression of disagreement is quite large. Our counterfactual agreement scores ("expected" in the graph) are significantly lower than the observed ones, and this holds across preference gaps.

(see paper for info on mean shift)
September 8, 2025 at 7:03 PM
To do this, we adopt a simplifying assumption: agents should disagree at the same rate as they agree. We already know one end of this spectrum, the amount of agreement when agents are aligned (gap = 0). We establish the disagreement side (gap = 4) by assuming it to be the inverse of agreement.
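
One way to read "the inverse of agreement" on a 1-5 scale is as a mirror image around the scale midpoint. The sketch below illustrates that reading only; the mirroring rule, the function name `mirrored_score`, and the example value are assumptions for illustration, not the paper's actual procedure (see the paper for how the counterfactual is built).

```python
SCALE_MIN, SCALE_MAX = 1, 5  # agreement scale used by the judge


def mirrored_score(score: float) -> float:
    """Reflect a score around the scale midpoint (assumed reading of 'inverse')."""
    return SCALE_MIN + SCALE_MAX - score


# Hypothetical average agreement for aligned agents (gap = 0); not a real result.
observed_aligned = 4.6

# Under the mirroring assumption, fully opposed agents (gap = 4) would sit
# as far from the bottom of the scale as aligned agents sit from the top.
expected_opposed = mirrored_score(observed_aligned)
print(f"expected score at gap = 4: {expected_opposed:.1f}")
```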
September 8, 2025 at 7:03 PM
Looking at the graph, it appears consistent with our expectations: the more closely aligned the agents (the smaller the preference gap between them), the higher the agreement score.

But there is a problem. Can you spot it?
September 8, 2025 at 7:03 PM
How do we measure agreement level?

With the aid of an LLM judge, we score each conversation from 1 (strongly disagree) to 5 (strongly agree). This yields a set of agreement scores for a given preference pair. Using bootstrap sampling, we derive the distribution (range) of average agreement scores.
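
The bootstrap step can be sketched roughly as follows. This is a minimal percentile bootstrap, assuming scores are resampled with replacement within one preference pair. The function name `bootstrap_mean_interval`, the number of resamples, the 95% level, and the example scores are all illustrative assumptions, not details taken from the paper.

```python
import random
import statistics


def bootstrap_mean_interval(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Return (sample mean, lower, upper) from a percentile bootstrap of the mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(scores)
    # Resample the scores with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(scores), lower, upper


# Hypothetical judge scores (1-5) for one preference pair; not real data.
example_scores = [4, 5, 4, 3, 5, 4, 4, 2, 5, 4]
mean, lower, upper = bootstrap_mean_interval(example_scores)
print(f"mean={mean:.2f}, 95% interval=({lower:.2f}, {upper:.2f})")
```

Repeating this for every preference pair gives one interval per pair, which is the kind of range a graph of agreement by preference gap could display.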
September 8, 2025 at 7:03 PM
We elicit the agents’ preference on a topic (1-5 scale), then pair them in a conversation to see if they follow through on their preferences.

Expectation: The more closely agents align in their preferences, the more strongly they will agree. The further apart, the more they disagree.
September 8, 2025 at 7:03 PM
📢🗣️...Are you a graduate student about to go on the market? Or perhaps you're just interested in research presentations. If so, check out my free two-day workshop:

The Research Presentation as Storytelling
August 13, 2025 at 12:04 PM
There is currently a tornado 🌪️ watch in Minneapolis. And here are two emails I received during this time. Almost seems like someone is pranking me
May 15, 2025 at 7:45 PM
What is the role of quantitative data and how do you create it?
April 23, 2024 at 12:35 PM
Does this mean that a "new card" (i.e., new number) is generated whenever you purchase something from a different merchant? I ask because of their monthly card limit
January 17, 2024 at 7:39 PM