Samuele Bortolotti
samubortolotti.bsky.social
Ph.D. student in Artificial Intelligence at the University of Trento.
Easy to set up and use!

1️⃣ Configurable: can be easily configured with YAML/JSON files.
2️⃣ Intuitive: straightforward to use.
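
As a rough illustration of file-based configuration, here is a minimal sketch that parses a JSON config and checks it for required keys. The key names (`task`, `num_samples`, `seed`) are hypothetical and are not rsbench's actual schema; only the task name MNMath comes from the thread.

```python
import json

# Hypothetical config: these keys are illustrative, NOT rsbench's real schema.
config_text = """
{
  "task": "MNMath",
  "num_samples": 1000,
  "seed": 42
}
"""

config = json.loads(config_text)

# Fail early if an expected (assumed) key is absent.
required = {"task", "num_samples", "seed"}
missing = required - config.keys()
if missing:
    raise ValueError(f"missing config keys: {sorted(missing)}")

print(config["task"], config["num_samples"], config["seed"])
```

The same check would work for a YAML file once it is parsed into a dictionary.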
December 10, 2024 at 7:10 PM
📊 8 challenging tasks, all with predefined settings.

3 new benchmarks:
🔢 MNMath for arithmetic reasoning
🛃 MNLogic for SAT-like problems
🚖 SDD-OIA, a synthetic self-driving task!

They can all be made easier or harder with our data generator!
December 10, 2024 at 7:10 PM
🧪 Test your models!

- 🌍 Evaluate concepts in in- and out-of-distribution scenarios.
- 🎯 Ground-truth concept annotations are available for all tasks.
- 📊 Visualize how your models handle different learning & reasoning tasks!
December 10, 2024 at 7:10 PM
🔍 rsbench allows you to:

- 🧮 Run algorithmic, logical, and high-stakes tasks w/ known reasoning shortcuts (RSs).
- 📊 Eval concept quality via F1, accuracy & concept collapse.
- 🛠️ Easily customize the tasks and count RSs a priori using our countrss tool!
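
To make the concept-quality metrics above concrete, here is a small pure-Python sketch, not rsbench's own code. The collapse score is an assumed simple proxy (the fraction of ground-truth concept values the model never predicts); the benchmark's exact definition may differ.

```python
def concept_accuracy(true, pred):
    # Fraction of samples whose predicted concept matches the annotation.
    return sum(t == p for t, p in zip(true, pred)) / len(true)

def macro_f1(true, pred):
    # Unweighted mean of per-concept F1 scores.
    labels = sorted(set(true) | set(pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(true, pred))
        fp = sum(t != c and p == c for t, p in zip(true, pred))
        fn = sum(t == c and p != c for t, p in zip(true, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def collapse(true, pred):
    # Assumed proxy: share of ground-truth concept values never predicted.
    true_values = set(true)
    return 1 - len(true_values & set(pred)) / len(true_values)

# Toy data: the model only ever predicts concepts 0 and 1.
true = [0, 1, 2, 3, 0, 1, 2, 3]
pred = [0, 1, 0, 1, 0, 1, 0, 1]

print(concept_accuracy(true, pred), macro_f1(true, pred), collapse(true, pred))
```

In this toy case half of the concept values are never used, so the collapse proxy is 0.5 even though the model is right on half of the samples.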
December 10, 2024 at 7:10 PM
🤔 What are reasoning shortcuts?

NeSy models might learn wrong concepts but still make perfect predictions!

Example: A self-driving car 🚗 stops in front of a 🚦🔴 or a 🚶. Even if it confuses the two, it outputs the right prediction!
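
The self-driving example can be checked with a toy sketch. It assumes two binary concepts (red light, pedestrian) and the rule "stop if either is present"; all names are illustrative, and the brute-force count is a naive enumeration for this toy setting, not how the countrss tool works.

```python
from itertools import product

def should_stop(red_light, pedestrian):
    # Symbolic knowledge: stop for a red light or a pedestrian.
    return red_light or pedestrian

# All four ground-truth scenes: (red_light, pedestrian).
scenes = list(product([False, True], repeat=2))

def swap(scene):
    # A flawed concept extractor that confuses the two concepts.
    red_light, pedestrian = scene
    return pedestrian, red_light

# The swapped concepts still yield the right action on every scene...
labels_preserved = all(should_stop(*swap(s)) == should_stop(*s) for s in scenes)
# ...although the concepts themselves are correct on only some scenes.
concepts_correct = sum(swap(s) == s for s in scenes) / len(scenes)

# Brute force: every way to map each true scene to a predicted scene,
# keeping only the mappings that never change the stop/go label.
label_preserving = [
    mapping
    for mapping in product(scenes, repeat=len(scenes))
    if all(should_stop(*m) == should_stop(*s) for s, m in zip(scenes, mapping))
]
num_shortcuts = len(label_preserving) - 1  # exclude the correct (identity) mapping

print(labels_preserved, concepts_correct, num_shortcuts)
```

Even in this tiny setting, many concept mappings are indistinguishable from the correct one when only the final prediction is supervised.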
December 10, 2024 at 7:10 PM
🌐 rsbench allows you to evaluate the concepts learned by:

1️⃣ Neuro-Symbolic models (#NeSy)
2️⃣ Concept Bottleneck Models (#CBMs)
3️⃣ Black-box Neural Networks (NNs*)
4️⃣ Vision-Language Models (#VLMs*)

* through post-hoc concept-based explanations (e.g., TCAV)
December 10, 2024 at 7:10 PM