Data management and NLP/LLMs for information quality.
https://www.eurecom.fr/~papotti/
👥 Authors: Enzo Veltri, Donatello Santoro, Jean-Flavien Bussotti, Paolo Papotti
📄 PDF: https://www.vldb.org/pvldb/vol18/p5303-veltri.pdf
Blog: giovannigatti.github.io/trutheval/
Watch: www.youtube.com/watch?v=f0XJ...
Play: github.com/GiovanniGatt...
Ask it for a rich list and the same fact is suddenly missing or hallucinated because the output context got longer 😳
LLMs exceed 80% accuracy on single-value questions but accuracy drops linearly with the # of output facts
New paper, details 👇
new "Community Moderation and the New Epistemology of Fact Checking on Social Media"
with I. Augenstein, M. Bakker, T. Chakraborty, D. Corney, E. Ferrara, I. Gurevych, S. Hale, E. Hovy, H. Ji, I. Larraz, F. Menczer, P. Nakov, D. Sahnan, G. Warren, G. Zagni
Our paper, "Retrieve, Merge, Predict: Augmenting Tables with Data Lakes", has been published in TMLR!
In this work, we created YADL (a semi-synthetic data lake), and we benchmarked methods for augmenting user-provided tables given information found in data lakes.
1/
As systems increasingly use declarative interfaces on LLMs, traditional optimization falls short
Details 👇
⏰ 11:00 Session B
Our work, "An LLM-Based Approach for Insight Generation in Data Analysis," uses LLMs to automatically find insights in databases, outperforming baselines in both insightfulness and correctness
Paper: arxiv.org/abs/2503.11664
Details 👇
Leveraging RL with our reward mechanism, we push Qwen-Coder-2.5 7B to performance on par with much larger LLMs (>400B) on the BIRD dataset! 🤯
Model: huggingface.co/simone-papic...
Paper: huggingface.co/papers/2504....
Details 👇
RAG struggles with broad, multi-hop questions.
We surpass RAG by up to 20 absolute points in QA performance, even with extreme cache compression (64x smaller)!
Details 👇
It will be in Berlin on June 22nd, together with @sigmod2025.bsky.social
Submission deadline: 28 March 2025
www.novasworkshop.org
Our #COLING paper uncovers that tropes are used in 37% of the social posts debating immigration and vaccination
📄 coling-2025-proceedings.s3.us-east-1.amazonaws.com/main/pdf/202...
👇
We audited the program when it was still called Birdwatch and found both promising results and concerning manipulation risks. More details below.👇
By compressing the data in the KV cache, we squeeze more info into the context.
Presented at @emnlpmeeting.bsky.social, now on MIT Press:
FINCH: Prompt-guided Key-Value Cache Compression for LLMs (TACL 2024)
direct.mit.edu/tacl/article...
More details 👇
I'll curate but DM/reply w handle+some info welcome! Also follow @trl-research.bsky.social for updates 🤗
go.bsky.app/4SNSMRj
The graph links data from 77 fact-checking orgs across 36 countries.
🔗 SPARQL Endpoint: purl.org/net/cimplekg...
🔗 KG Explorer: purl.org/net/cimplekg...
🔗 Paper: hal.science/hal-04760374...
I'm seeking PhD and Post-doc candidates to join my research group in 2025 at EURECOM in the south of France.
- 3 new projects on LLMs
- Full-time positions with competitive salaries and benefits
- English-speaking environment
Interested? Ping me!
ACM #CIKM 2024! 🏆
Data voids are gaps in online information, which are often exploited to spread disinformation.
More details 👇
#CIKM2024 #DataVoids #Disinformation #KGs
I'm a professor in the Data Science department at EURECOM, France. 🎓
My research focuses on data management and LLMs to enhance information quality, including data cleaning and misinformation detection.
I'm here mostly for the research, but I occasionally comment on sports and arts.