David Nelson
@thedavenelson.bsky.social
Associate Director at Purdue's Center for Instructional Excellence. I share resources about teaching, SoTL, & the impact of AI on learning.
After several months using Copilot Pro on trial (normally $30/month per user), I would say its ability to search all your Microsoft content at once is the killer app. Its chatbot inference and output abilities are as bad as an airline phone bot that makes you speak every question aloud.
October 22, 2025 at 9:29 PM
This is an awesome small-scale look at how sycophantic LLMs lead learners astray in problem-solving. Next, I'd love to look at what type of sycophancy actually attracts students when they're given a choice of bots. arxiv.org/abs/2510.03667
Invisible Saboteurs: Sycophantic LLMs Mislead Novices in Problem-Solving Tasks
Sycophancy, the tendency of LLM-based chatbots to express excessive enthusiasm, agreement, flattery, and a lack of disagreement, is emerging as a significant risk in human-AI interactions. However, th...
arxiv.org
October 15, 2025 at 8:31 PM
Sycophancy in bots is inimical to AI in teaching and learning. When the bot wants to tell you that you are right, high dependence almost certainly means you will internalize incorrect knowledge. Love papers like this that explore sycophancy within a discipline arxiv.org/abs/2510.04721
BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
Large language models (LLMs) have recently shown strong performance on mathematical benchmarks. At the same time, they are prone to hallucination and sycophancy, often providing convincing but flawed ...
arxiv.org
October 14, 2025 at 6:24 PM
I'm skeptical of most third-party IT company analyses of other software, but this Scale AI paper outlines many of the limits of creating and evaluating "AI as tutor" studies. Too many edge cases. arxiv.org/abs/2510.02663
TutorBench: A Benchmark To Assess Tutoring Capabilities Of Large Language Models
As students increasingly adopt large language models (LLMs) as learning aids, it is crucial to build models that are adept at handling the nuances of tutoring: they need to identify the core needs of ...
arxiv.org
October 6, 2025 at 2:23 PM
Lots of guesses about student use of AI are turning out as expected: students use LLMs to solve or complete homework; they ignore LLM-posed questions meant to guide deeper thinking; and single prompts dominate interactions, with little dynamic engagement.

arxiv.org/abs/2509.08862
Investigating Student Interaction Patterns with Large Language Model-Powered Course Assistants in Computer Science Courses
Providing students with flexible and timely academic support is a challenge at most colleges and universities, leaving many students without help outside scheduled hours. Large language models (LLMs) ...
arxiv.org
September 15, 2025 at 10:34 PM
Want more studies on small-scale feedback mechanisms around disciplinary understandings. No need to have an LLM measure "everything." Start small, start with an area of need.
arxiv.org/abs/2508.14823
Using an LLM to Investigate Students' Explanations on Conceptual Physics Questions
Analyzing students' written solutions to physics questions is a major area in PER. However, gauging student understanding in college courses is bottlenecked by large class sizes, which limits assessme...
arxiv.org
August 21, 2025 at 8:37 PM
Purdue's AI Academy finished with 70+ instructors creating projects, plans, tools, or critical approaches around and in response to AI. I was particularly enthused when multiple participants said, "I thought I was gonna learn the tech, but I learned about learning."
August 19, 2025 at 6:33 PM
Reposted by David Nelson
Today on the podcast: Study Hall! @leaton01.bsky.social @michellemillerphd.bsky.social and @thedavenelson.bsky.social and I discuss three recent studies exploring the intersection of AI and teaching. Cognitive offloading, chatbot sycophancy, & more! intentionalteaching.buzzsprout.com/2069949/epis...
August 19, 2025 at 5:22 PM
Democratizing prompt for LLMs:

Read and review new terms of service for X company. Compare and contrast with previous versions. What should I be aware of? What might any consumer be wary of or concerned about?
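
If you'd rather run this against full documents than paste them into a chat window, here's a minimal sketch, assuming the OpenAI Python client (the helper name, model choice, and prompt framing are my own illustration):

import os
from openai import OpenAI

# Hypothetical helper: wraps the prompt above so any company's new and
# previous terms of service can be compared in one call.
def review_terms(company: str, new_tos: str, old_tos: str) -> str:
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    prompt = (
        f"Read and review the new terms of service for {company}. "
        "Compare and contrast with the previous version. "
        "What should I be aware of? What might any consumer be wary of "
        "or concerned about?\n\n"
        f"NEW TERMS:\n{new_tos}\n\nPREVIOUS TERMS:\n{old_tos}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # any capable chat model should work here
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content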
August 17, 2025 at 3:09 PM
I love this work. Biggest takeaways: AI agents "often fail at effectively guiding students toward mastery" and "students prioritize scores over feedback, leading to off-task behavior that can hinder growth." Authenticity and relatedness are needed for these efforts. More humans in the process.
August 5, 2025 at 4:35 PM
As models become tuned to specialized academic content, the gap between novice and expert ability to critically evaluate outputs will grow. This paper demonstrates the change in student error recognition from GPT-4 to o3. More longitudinal studies like this, please. arxiv.org/abs/2507.20995
VArsity: Can Large Language Models Keep Power Engineering Students in Phase?
This paper provides an educational case study regarding our experience in deploying ChatGPT Large Language Models (LLMs) in the Spring 2025 and Fall 2023 offerings of ECE 4320: Power System Analysis a...
arxiv.org
August 1, 2025 at 4:55 PM
It was relatively easy to keep up with the frontier AI models and a few open-source clones. Agents are like electrical appliance manufacturers in the early twentieth century. What is actually useful? What is a niche product?
arxiv.org/abs/2503.11733
LLM Agents for Education: Advances and Applications
Large Language Model (LLM) agents have demonstrated remarkable capabilities in automating tasks and driving innovation across diverse educational applications. In this survey, we provide a systematic ...
arxiv.org
July 17, 2025 at 3:56 PM
I've been focusing a lot recently on trust and perceptions of trust + sycophancy + bot capability. Appreciate this preliminary approach for gauging university-owned bots vs. ChatGPT: arxiv.org/abs/2505.10490
Campus AI vs Commercial AI: A Late-Breaking Study on How LLM As-A-Service Customizations Shape Trust and Usage Patterns
As the use of Large Language Models (LLMs) by students, lecturers and researchers becomes more prevalent, universities - like other organizations - are pressed to develop coherent AI strategies. LLMs ...
arxiv.org
July 15, 2025 at 7:03 PM
Increasingly convinced that simulation practice in clinical settings is one of the biggest "killer app" prospects for AI in education. Generate at scale, add nuance + complexity easily, personalize, etc.
iovs.arvojournals.org/article.aspx...
A Large Language Model-Based Digital Twin Patient System Enhances Clinical Questioning Skills in Medical Education: A Randomized Controlled Trial | IOVS | ARVO Journals
iovs.arvojournals.org
July 10, 2025 at 2:48 PM
I ran a small experiment with two syllabus chatbots: an overly sycophantic Ted Lasso version and a caustic Severus Snape version. Students really didn't like Snape's recalcitrance, but they also hated Lasso's bland, fawning style. We need more work on sycophancy and its effects; a sketch of the persona setup follows the link below.
arxiv.org/abs/2311.09410
When Large Language Models contradict humans? Large Language Models' Sycophantic Behaviour
Large Language Models have been demonstrating broadly satisfactory generative abilities for users, which seems to be due to the intensive use of human feedback that refines responses. Nevertheless, su...
arxiv.org
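
For anyone curious how such personas get wired up, a minimal sketch assuming an OpenAI-style chat-completions client (the persona wording and model choice are illustrative, not my exact prompts):

from openai import OpenAI

# Illustrative persona system prompts; the real experiment's wording differed.
PERSONAS = {
    "lasso": (
        "You are a relentlessly upbeat syllabus assistant. Praise every "
        "question, agree enthusiastically, and never push back."
    ),
    "snape": (
        "You are a curt, begrudging syllabus assistant. Answer accurately "
        "but tersely, as if the question were beneath you."
    ),
}

def ask_syllabus_bot(persona: str, question: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system", "content": PERSONAS[persona]},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content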
July 9, 2025 at 5:29 PM
He shared it at The Other Place, but I have to say here that @hamel.bsky.social's YouTube chapter generator code is a godsend to academics who want to parse through hour-long video talks, especially when they concern AI and many are saying the same things. Link in comments
July 6, 2025 at 11:40 PM
This is solid math education work. "LLMs have mastered a superficial solution process but do not make sense of word problems." That superficiality and the illusion of explanatory depth are among the biggest dangers to student users who might think "seems legit."

arxiv.org/abs/2506.24006
Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective
The progress of Large Language Models (LLMs) like ChatGPT raises the question of how they can be integrated into education. One hope is that they can support mathematics learning, including word-probl...
arxiv.org
July 1, 2025 at 4:32 PM
We've got decent work on sycophancy. This paper gets at the heart of the peril in student use: overconfidence in the tool AND in yourself will redirect the output toward your own conclusion. That makes it easier to dismiss the friction of error, and easier to outsource your thinking. arxiv.org/abs/2506.10297
"Check My Work?": Measuring Sycophancy in a Simulated Educational Context
This study examines how user-provided suggestions affect Large Language Models (LLMs) in a simulated educational context, where sycophancy poses significant risks. Testing five different LLMs from the...
arxiv.org
June 24, 2025 at 2:50 AM
Purdue professors in the late 60s were apparently either Mary Poppins or Sam Kinison
June 20, 2025 at 1:36 PM
This has been poked at in lots of places, so I won't, but to me it demonstrates the pitfalls of AI educational research when computer scientists omit learning scientists from their work, and vice versa.
arxiv.org/abs/2506.08872
Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task
This study explores the neural and behavioral consequences of LLM-assisted essay writing. Participants were divided into three groups: LLM, Search Engine, and Brain-only (no tools). Each completed thr...
arxiv.org
June 20, 2025 at 2:39 AM
Reposted by David Nelson
AI education is way too focused on LLMs and DNNs these days, excluding a lot of useful topics in AI that are non-ML (e.g., search, planning, knowledge, etc.). That's one of the motivations behind this series I started, www.informationga.in/blog/ai-expl..., which is a first-principles treatment of AI.
information gain
A place to gain a dose of AI information.
www.informationga.in
June 19, 2025 at 6:12 PM
Beyond LLM metrics or any gauge of effectiveness, the faculty I have worked with vibe first with the voice, syntax, and prosody of each bot.

DeepSeek spurred smiles and curiosity among many colleagues.

Big hat tip to: @natolambert.bsky.social

www.interconnects.ai/p/latest-ope...
The latest open artifacts (#10): More permissive licenses, everything as a reasoner, and from artifacts to agents
Artifacts Log 10.
www.interconnects.ai
May 29, 2025 at 5:44 PM
Would like more studies like this that quantify the impact of active AI use on exams. While this is only part of the puzzle, I think modifying learning outcomes beyond disciplinary recall benefits from detailed study of how gameable current assessments are. www.sciencedirect.com/science/arti...
Generative AI in Graduate Bioprocess Engineering Exams: Is Attention All Students Need?
State-of-the-art large language models (LLMs) can now answer conceptual textbook questions with near-perfect accuracy and perform complex equation der…
www.sciencedirect.com
May 29, 2025 at 4:21 PM
What I really appreciate about this study is the confirmation that those who value human dialogue in learning are more attuned to artificial efforts, while those who don't value it think of feedback largely in terms of clarity, not content. link.springer.com/chapter/10.1...
AI or Human? Evaluating Student Feedback Perceptions in Higher Education
Feedback plays a crucial role in learning by helping individuals understand and improve their performance. Yet, providing timely, personalized feedback in higher education presents a challenge due to ...
link.springer.com
May 9, 2025 at 8:00 PM
In 25 years of teaching, I have never once felt sadness or a sense of loss when a class ended. The students in my AI course have activated both feelings in me. I will truly miss their energy, insights, and critical thought about learning in the AI era.
May 2, 2025 at 7:54 PM