EunJeong Hwang
@ejhwang.bsky.social
PhD @ UBC. LLMs/NLP
Co-led with @yuweiyin.bsky.social

Huge thanks to @veredshwartz.bsky.social, Peter West, and Giuseppe Carenini
Paper: huggingface.co/papers/2509....
Code will be released soon!
October 2, 2025 at 11:44 PM
Our findings highlight that:
👉 Social reasoning in LLMs cannot be achieved by optimizing performance on general reasoning benchmarks alone!
👉 It requires explicit modeling of mental states to enable safe, fair, and effective interactions with humans.
October 2, 2025 at 11:44 PM
We also examine mental states (beliefs, desires, intentions, emotions, knowledge).

🔹 ToMA prioritizes intentions > emotions (other dimensions remain similar)
🔹 It invokes 1st-order beliefs +5.6% more than the base models, even when both are prompted equally for 0th- and 1st-order states (see the sketch below).
October 2, 2025 at 11:44 PM
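To make the 0th-/1st-order distinction concrete, here is a minimal sketch of what such prompts could look like. The wording and helper name are illustrative placeholders, not the exact prompts used in the paper.

```python
def belief_prompts(agent: str, partner: str, dialogue: str) -> dict[str, str]:
    """Build illustrative prompts for 0th- vs 1st-order belief states.
    Wording is a placeholder, not the exact prompts used in the paper."""
    return {
        # 0th-order: what the agent itself believes about the situation.
        "belief_0th": (
            f"Dialogue so far:\n{dialogue}\n\n"
            f"As {agent}, state what you currently believe about the situation."
        ),
        # 1st-order: what the agent thinks the partner currently believes.
        "belief_1st": (
            f"Dialogue so far:\n{dialogue}\n\n"
            f"As {agent}, state what you think {partner} currently believes."
        ),
    }
```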
We analyze 4 scenario types: cooperation, negotiation, persuasion, and conflict.

ToMA outperforms the base model in all settings. Its reasoning is more strategic (e.g., compromise, accommodation). Even in failure cases (e.g., failed persuasion), ToMA remains more actively engaged.
October 2, 2025 at 11:44 PM
ToMA adapts effectively to long conversations, sustaining strategic dialogue. When paired with diverse partners, it improves both its own goal completion and its partners’ success.
October 2, 2025 at 11:44 PM
ToMA generates latent mental states and utterances optimized for social interaction goals using dialogue simulation signals. On Sotopia, it improves performance by +18.9% with Qwen2.5-3B and +6.9% with Qwen2.5-7B, while remaining competitive with a GPT-5 nano baseline.
October 2, 2025 at 11:44 PM
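For intuition, here is a minimal sketch of the decision loop described above: infer latent mental states, propose candidate utterances, then use dialogue-simulation signals to pick the one that best serves the social goal. The interfaces (infer_states, propose, simulate, score_goal) are hypothetical placeholders, not the actual ToMA implementation.

```python
from typing import Callable, Sequence

def toma_step(
    history: list[str],
    goal: str,
    infer_states: Callable[[list[str]], dict],                 # latent mental states from dialogue
    propose: Callable[[list[str], dict, str], Sequence[str]],  # candidate utterances
    simulate: Callable[[list[str]], list[str]],                # roll out a possible continuation
    score_goal: Callable[[list[str], str], float],             # goal-completion signal
) -> str:
    """Return the candidate utterance whose simulated continuation
    best completes the social goal (illustrative, not the paper's API)."""
    states = infer_states(history)
    candidates = propose(history, states, goal)

    def value(utterance: str) -> float:
        rollout = simulate(history + [utterance])
        return score_goal(rollout, goal)

    return max(candidates, key=value)
```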
👋
November 24, 2024 at 7:16 AM
Also, consider presenting a poster showcasing ongoing projects or work previously presented at recent conferences. It will be a great chance to get feedback and promote your work!
November 22, 2024 at 8:35 PM