Conversational AI
📄Arxiv: arxiv.org/pdf/2505.07775
💻 GitHub: github.com/beyzabozdag/...
📄Arxiv: arxiv.org/pdf/2505.07775
💻 GitHub: github.com/beyzabozdag/...
It's also important to recognize the incredible works that helped us build SafeArena:
It's also important to recognize the incredible works that helped us build SafeArena:
To find out, we introduce SafeArena (safearena.github.io), a benchmark to assess the capabilities of web agents to complete harmful web tasks. A thread 👇
To find out, we introduce SafeArena (safearena.github.io), a benchmark to assess the capabilities of web agents to complete harmful web tasks. A thread 👇
Introducing Persuade Me If You Can (PMIYC)—a new framework to evaluate (1) how persuasive LLMs are and (2) how easily they can be persuaded! 🚀
📄Arxiv: arxiv.org/abs/2503.01829
🌐Project Page: beyzabozdag.github.io/PMIYC/
Introducing Persuade Me If You Can (PMIYC)—a new framework to evaluate (1) how persuasive LLMs are and (2) how easily they can be persuaded! 🚀
📄Arxiv: arxiv.org/abs/2503.01829
🌐Project Page: beyzabozdag.github.io/PMIYC/
Introducing Persuade Me If You Can (PMIYC)—a new framework to evaluate (1) how persuasive LLMs are and (2) how easily they can be persuaded! 🚀
📄Arxiv: arxiv.org/abs/2503.01829
🌐Project Page: beyzabozdag.github.io/PMIYC/
We examine constituent ordering preferences between humans and LLMs; we present two main findings… 🧵
We examine constituent ordering preferences between humans and LLMs; we present two main findings… 🧵
NN-CIFT slashes data valuation costs by 99% using tiny neural nets (205k params, just 0.0027% of 8B LLMs) while maintaining top-tier performance!
NN-CIFT slashes data valuation costs by 99% using tiny neural nets (205k params, just 0.0027% of 8B LLMs) while maintaining top-tier performance!
Introducing CALM, fully open-source Conversational Agentic Language Models with CALM 8B, CALM 70B, and CALM 405B-excelling in both multi-turn dialogue management & function calling.
🌐Project Page: emrecanacikgoz.github.io/CALM/
🌐 shuhaibm.github.io/refed/
🧵 [1/n]
🌐 shuhaibm.github.io/refed/
🧵 [1/n]
🌐 shuhaibm.github.io/refed/
🧵 [1/n]
@convai-uiuc.bsky.social @gokhantur.bsky.social
In our latest work, "Better Slow than Sorry: Introducing Positive Friction for Reliable Dialogue Systems", we delve into how strategic delays can enhance dialogue systems.
Paper Website: merterm.github.io/positive-fri...
@convai-uiuc.bsky.social @gokhantur.bsky.social
In our latest work, "Better Slow than Sorry: Introducing Positive Friction for Reliable Dialogue Systems", we delve into how strategic delays can enhance dialogue systems.
Paper Website: merterm.github.io/positive-fri...
In our latest work, "Better Slow than Sorry: Introducing Positive Friction for Reliable Dialogue Systems", we delve into how strategic delays can enhance dialogue systems.
Paper Website: merterm.github.io/positive-fri...
cobusgreyling.medium.com/building-con...
cobusgreyling.medium.com/building-con...