Jeroen Mahieu
jmahieu.bsky.social
Assistant professor at Utrecht University School of Economics || Economics of Entrepreneurship
The big black box for us as teachers remains how students are actually using LLMs and how we can help them use LLMs in a way that helps them *learn*. I currently see very few efforts in this direction. Tools like alldayta.com are a first step but not the best solution for this problem imo
All Day TA
All Day TA is an AI EdTech company focused on higher education that enables professors to build customized AI teaching assistants for their courses. Available 24/7, it provides students with instant, ...
alldayta.com
January 28, 2025 at 9:31 PM
Because of this lack of expert knowledge, learning *from* LLMs and learning *how* to generate high-quality output with LLMs is impossible. Students remain trapped in mediocre text that may *look* good but earns a “sufficient” grade at best
January 28, 2025 at 9:31 PM
Students lack expert knowledge, which is complementary to LLM output. Without such knowledge it is very hard to direct LLMs to consistently produce output that is better than mediocre. You see this very clearly in theory sections that require careful logical argumentation and “connecting the dots”
January 28, 2025 at 9:31 PM
Student effort is going down. LLMs typically produce text that looks good at first sight and might trick a non-expert into thinking they can do the job with less effort on their part. However, the actual quality of such first attempts is mediocre at best most of the time
January 28, 2025 at 9:31 PM
However, I see little improvement among the mediocre and good proposals despite > 2 years since the ChatGPT launch and model improvements. My guess is that this is due to several reasons:
January 28, 2025 at 9:31 PM
The quality of the worst proposals has improved, mainly in terms of writing. Nobody submits terrible text anymore. My standards on this have also increased; submitting text with grammar or spelling mistakes is no longer acceptable and will be penalised. It costs nothing to write w/o grammar mistakes, and students know this
January 28, 2025 at 9:31 PM
Reposted by Jeroen Mahieu
That’s a good article, but the figures quoted for ChatGPT vs. Google search are outdated, wrong & too high.

Best current energy estimate for a day of ChatGPT use is equiv. to driving an average car the length of a tennis court:

engineeringprompts.substack.com/p/does-chatg...
Does ChatGPT use 10x more energy than a standard Google search?
A journey down the rabbit hole of viral AI energy claims. It's probably true in relative terms, but that's not what matters.
engineeringprompts.substack.com
January 19, 2025 at 5:00 PM
Before, I would never have even considered searching for, hiring and training a TA (or two) for this project, given that it is too small and there is no funding. Now it costs me 10 dollars for the API and three hours to debug the code myself + some minor manual cleaning for the “special” cases. Bonkers
January 17, 2025 at 11:13 PM
E.g.: in less than half a day, I had GPT-4o code and run a script to extract and interpret text from unstructured scanned company PDFs in Dutch and French & return structured data based on the information from the docs (“give the gender of all the founders of the firm”). It would have taken an RA weeks
January 17, 2025 at 11:13 PM
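A rough sketch of what the "return structured data" step in such a script could look like. This is not the actual script: the prompt wording, the JSON shape, and the names `extract_founders` and `ask_model` are all assumptions made up for illustration. The model call is passed in as a plain function, so no specific API is implied, and the text is assumed to have been extracted from the scan already.

```python
import json
from typing import Callable

# Hypothetical prompt; the real wording and output schema are not known.
PROMPT_TEMPLATE = (
    "Below is text extracted from a scanned company document (Dutch or French).\n"
    "Give the gender of all the founders of the firm. Reply with JSON only, shaped as "
    '{{"founders": [{{"name": "...", "gender": "male|female|unknown"}}]}}.\n\n'
    "Document text:\n{document_text}"
)

ALLOWED_GENDERS = {"male", "female", "unknown"}


def extract_founders(document_text: str, ask_model: Callable[[str], str]) -> list[dict]:
    """Ask a model for founder genders and return validated records.

    `ask_model` is an assumed stand-in for whatever sends a prompt to the
    LLM and returns its text reply. Raises ValueError when the reply cannot
    be parsed, so the caller can set that document aside for manual cleaning.
    """
    reply = ask_model(PROMPT_TEMPLATE.format(document_text=document_text))
    try:
        founders = json.loads(reply)["founders"]
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        raise ValueError("model reply is not the expected JSON") from exc
    if not isinstance(founders, list):
        raise ValueError("'founders' is not a list")

    records = []
    for item in founders:
        if not isinstance(item, dict):
            raise ValueError("founder entry is not an object")
        gender = str(item.get("gender", "unknown")).strip().lower()
        if gender not in ALLOWED_GENDERS:
            gender = "unknown"  # anything unexpected is treated as unknown
        records.append({"name": str(item.get("name", "")).strip(), "gender": gender})
    return records
```

Documents whose reply raises `ValueError` would be the "special" cases left for manual cleaning.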
To conclude: one effect being “significant” and the other “not significant” is rarely enough to conclude a meaningful difference of differences.
January 1, 2025 at 1:02 PM
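A small numerical illustration of the point above. The estimates and standard errors are made-up numbers, the two estimates are assumed to be independent, and a normal approximation is used for the p-values. The function names are illustrative only.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_test(estimate: float, se: float) -> tuple[float, float]:
    """Return (z, p) for the null hypothesis that the true effect is zero."""
    z = estimate / se
    return z, two_sided_p(z)


def difference_test(est_a: float, se_a: float, est_b: float, se_b: float) -> tuple[float, float]:
    """Test whether two effects differ, assuming independent estimates."""
    se_diff = math.sqrt(se_a**2 + se_b**2)
    return z_test(est_a - est_b, se_diff)


# Hypothetical numbers: effect A = 25 (SE 10), effect B = 10 (SE 10).
z_a, p_a = z_test(25, 10)                        # p is about 0.012: "significant"
z_b, p_b = z_test(10, 10)                        # p is about 0.317: "not significant"
z_diff, p_diff = difference_test(25, 10, 10, 10)  # p is about 0.289

print(f"A: z={z_a:.2f}, p={p_a:.3f}")
print(f"B: z={z_b:.2f}, p={p_b:.3f}")
print(f"A - B: z={z_diff:.2f}, p={p_diff:.3f}")
```

Here A clears the 5% threshold and B does not, yet the difference between them is itself far from significant. The comparison that matters is the direct test of the difference.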