In my research workflow, I directly added it as a submodule to my code repo. Now I can produce figures and tables, and have them magically uploaded to Overleaf just by pushing the repo.
No more renaming, keeping versions straight, or manual uploading 😇
Previously, I was always downloading CSVs, losing track of file versions, and slowly loading and merging them in Python.
👉 find the code here: gist.github.com/timbmg/6c2d6...
🚀 PeerQA is the solution: a dataset with questions from peer reviews and answers from the original authors. (1/🧵)
#NLProc
Ever wonder whether verbalized CoTs correspond to the internal reasoning process of the model?
We propose a novel parametric faithfulness approach, which erases information contained in CoT steps from the model parameters to assess CoT faithfulness.
arxiv.org/abs/2502.14829
Misinformation is a new weapon disrupting public debates, scientific discussions, and political decisions. How can we identify and counter misleading content?
(1/🧵)
From LLaMa to Gemma, get transparent ⭐️1-5 efficiency ratings.
Incredible work led by @sashamtl.bsky.social
huggingface.co/blog/sasha/a...
»PeerQA: A Scientific Question Answering Dataset from Peer Reviews« by Tim Baumgärtner (@timbmg.bsky.social), Ted Briscoe, Iryna Gurevych (@igurevych.bsky.social)
We collected the major criteria used in CogSci and other fields, and designed a survey to find out!
Access link: www.survey-xact.dk/collect
Code: 4S7V-SN4M-S536
Time: 5-10 mins