Probably not. In our study, even experts struggled to verify Reddit health claims using end-to-end systems.
We show why—and argue fact-checking should be a dialogue, with patients in the loop
arxiv.org/abs/2506.20876
🧵1/
Contrary to fallacious causal conclusions drawn from correlational studies, this experiment found a scripted chatbot increased correct #factChecking solutions compared to unassisted students (N = 156).
doi.org/10.1016/j.ch...
#edu #tech
In our new preprint, we find that LLMs are susceptible to biased reporting of clinical treatment benefits in abstracts—more so than human experts. 📄🔍 [1/7]
Full Paper: arxiv.org/abs/2502.07963
🧵👇
In some informal tests on non-code problems, it is really good, not o1-pro level but surprisingly capable (and incredibly small & fast!). Big advance.
My new post on substack goes into this more deeply as I personally struggle with how to make sense of all of this.
greypascal.substack.com/p/beyond-pre...
QuaLLM-Health: An Adaptation of an LLM-Based Framework for Quantitative Data Extraction from Online Health Discussions
https://arxiv.org/abs/2411.17967
ja.ma/4i9ghPC
Why do we demand superhuman performance from AI while normalizing human imperfection?
greypascal.substack.com/p/the-perfec...
Starting a list of oncology-related people. Please tell me more to add, or any similar lists. go.bsky.app/GKXp9Fy @n8pennell.bsky.social