Lupino
@lupinoarts.mstdn.social.ap.brid.gy
Software Developer, Ex-Linguist, Something with creativity. Currently working on getting LaTeX output accessible.

[bridged from https://mstdn.social/@LupinoArts on the fediverse by https://fed.brid.gy/ ]
At best, Large Language Models are an implementation of Wernicke's Aphasia. #hallucination #AI
January 17, 2026 at 9:41 AM
There is hardly anything more Canadian than a speaker who says "thank you" every single time the audience applauds #39c3
December 28, 2025 at 1:04 PM
As always, an important talk by @kattascha at #39c3, very much worth watching
December 27, 2025 at 9:55 PM
Reposted by Lupino
According to Oxfam, it would cost 37 billion dollars per year to eliminate both acute and chronic hunger worldwide. Elon Musk alone currently has the means to do that for twenty years. Just to put things in proportion.
December 23, 2025 at 11:12 AM
Why is there a word "obigem" (German for "the above") but not "untigem" ("the below")...?
December 5, 2025 at 1:06 PM
The left is too well-behaved. Why not openly sound the horn for a hunt on the rich for once? It's not as if they had no choice; they would only have to sell all their assets, donate the proceeds, and they'd be out of the target group...
November 22, 2025 at 3:08 PM
Dangit, here we go again with the modding and waiting for update after update and the countless hours... #ksp has a spiritual successor: https://ahwoo.com/store/KPbAA1Au/kitten-space-agency #ksa
November 16, 2025 at 10:39 AM
Reposted by Lupino
Hey #Fedieltern, I'm looking for the BACK side of the Klett initial-sound chart (Anlauttabelle) "Deutschrad". This here is the front. Does anyone have the back and could send me a picture of it via DM? It has to be the one from Klett Verlag, not Zebra etc. - thanks for sharing! 🙏
November 16, 2025 at 9:10 AM
Great. The next version of the #pac (PDF #Accessibility Checker) is going to be "enhanced" with AI slop… I guess that only leaves us with veraPDF… #texlatex

https://pac.pdf-accessibility.org/de/ressourcen/ki-unterstuetzte-pruefungen
KI-unterstützte Prüfungen (AI-assisted checks)
pac.pdf-accessibility.org
November 3, 2025 at 11:48 AM
Reposted by Lupino
The translation of "just use our docker image" into human language is "we don't have good docs and we don't give a shit".
November 2, 2025 at 7:06 PM
Reposted by Lupino
Not sure whether I already posted this here:
I was interviewed about the challenges that "AI" poses for universities

https://www.youtube.com/watch?v=K4p5wllxrR4
October 6, 2025 at 8:21 AM
Sometimes I'm horrified by what my labour helps to create... #aislop
September 27, 2025 at 12:27 PM
The #Ruby world is in such deep shit right now, it's a shame. And all because of one xenophobic dickhead...
September 27, 2025 at 8:36 AM
Does anyone happen to know when tickets for #39c3 will become available? For vacation planning and such…
September 25, 2025 at 1:34 PM
Reposted by Lupino
John Oliver did 10 minutes on #Bernd das #brot and found astonishing similarities in character. 😂
[Video] Original post on berlin.social
berlin.social
September 22, 2025 at 12:44 PM
There is really never a wrong time to share this classic, but right now especially: https://www.youtube.com/watch?v=sApHLL9R67Y&rco=1
September 22, 2025 at 11:49 AM
From our popular series "no shit, Sherlock": "OpenAI admits AI hallucinations are mathematically inevitable, not just engineering flaws" https://www.computerworld.com/article/4059383/openai-admits-ai-hallucinations-are-mathematically-inevitable-not-just-engineering-flaws.html
OpenAI admits AI hallucinations are mathematically inevitable, not just engineering flaws
OpenAI, the creator of ChatGPT, acknowledged in its own research that large language models will always produce hallucinations due to fundamental mathematical constraints that cannot be solved through better engineering, marking a significant admission from one of the AI industry's leading companies.

The study, published on September 4 and led by OpenAI researchers Adam Tauman Kalai, Edwin Zhang, and Ofir Nachum alongside Georgia Tech's Santosh S. Vempala, provided a comprehensive mathematical framework explaining why AI systems must generate plausible but false information even when trained on perfect data.

"Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty," the researchers wrote in the paper. "Such 'hallucinations' persist even in state-of-the-art systems and undermine trust."

The admission carried particular weight given OpenAI's position as the creator of ChatGPT, which sparked the current AI boom and convinced millions of users and enterprises to adopt generative AI technology.

## OpenAI's own models failed basic tests

The researchers demonstrated that hallucinations stemmed from statistical properties of language model training rather than implementation flaws. The study established that "the generative error rate is at least twice the IIV misclassification rate," where IIV referred to "Is-It-Valid," and demonstrated mathematical lower bounds proving that AI systems will always make a certain percentage of mistakes, no matter how much the technology improves.

The researchers demonstrated their findings using state-of-the-art models, including those from OpenAI's competitors. When asked "How many Ds are in DEEPSEEK?", the DeepSeek-V3 model with 600 billion parameters "returned '2' or '3' in ten independent trials," while Meta AI and Claude 3.7 Sonnet performed similarly, "including answers as large as '6' and '7'."

OpenAI also acknowledged the persistence of the problem in its own systems. The company stated in the paper that "ChatGPT also hallucinates. GPT-5 has significantly fewer hallucinations, especially when reasoning, but they still occur. Hallucinations remain a fundamental challenge for all large language models."

OpenAI's own advanced reasoning models actually hallucinated more frequently than simpler systems. The company's o1 reasoning model "hallucinated 16 percent of the time" when summarizing public information, while the newer o3 and o4-mini models "hallucinated 33 percent and 48 percent of the time, respectively."

"Unlike human intelligence, it lacks the humility to acknowledge uncertainty," said Neil Shah, VP for research and partner at Counterpoint Technologies. "When unsure, it doesn't defer to deeper research or human oversight; instead, it often presents estimates as facts."

The OpenAI research identified three mathematical factors that made hallucinations inevitable: epistemic uncertainty when information appeared rarely in training data, model limitations where tasks exceeded current architectures' representational capacity, and computational intractability where even superintelligent systems could not solve cryptographically hard problems.

## Industry evaluation methods made the problem worse

Beyond proving hallucinations were inevitable, the OpenAI research revealed that industry evaluation methods actively encouraged the problem. An analysis of popular benchmarks, including GPQA, MMLU-Pro, and SWE-bench, found that nine out of ten major evaluations used binary grading that penalized "I don't know" responses while rewarding incorrect but confident answers. "We argue that language models hallucinate because the training and evaluation procedures reward guessing over acknowledging uncertainty," the researchers wrote.

Charlie Dai, VP and principal analyst at Forrester, said enterprises already faced challenges with this dynamic in production deployments. "Clients increasingly struggle with model quality challenges in production, especially in regulated sectors like finance and healthcare," Dai told Computerworld.

The research proposed "explicit confidence targets" as a solution, but acknowledged that fundamental mathematical constraints meant complete elimination of hallucinations remained impossible.

## Enterprises must adapt strategies

Experts believed the mathematical inevitability of AI errors demanded new enterprise strategies. "Governance must shift from prevention to risk containment," Dai said. "This means stronger human-in-the-loop processes, domain-specific guardrails, and continuous monitoring."

Current AI risk frameworks have proved inadequate for the reality of persistent hallucinations. "Current frameworks often underweight epistemic uncertainty, so updates are needed to address systemic unpredictability," Dai added.

Shah advocated for industry-wide evaluation reforms similar to automotive safety standards. "Just as automotive components are graded under ASIL standards to ensure safety, AI models should be assigned dynamic grades, nationally and internationally, based on their reliability and risk profile," he said.

Both analysts agreed that vendor selection criteria needed fundamental revision. "Enterprises should prioritize calibrated confidence and transparency over raw benchmark scores," Dai said. "AI leaders should look for vendors that provide uncertainty estimates, robust evaluation beyond standard benchmarks, and real-world validation."

Shah suggested developing "a real-time trust index, a dynamic scoring system that evaluates model outputs based on prompt ambiguity, contextual understanding, and source quality."

## Market already adapting

These enterprise concerns aligned with broader academic findings. Harvard Kennedy School research found that "downstream gatekeeping struggles to filter subtle hallucinations due to budget, volume, ambiguity, and context sensitivity concerns."

Dai noted that reforming evaluation standards faced significant obstacles. "Reforming mainstream benchmarks is challenging. It's only feasible if it's driven by regulatory pressure, enterprise demand, and competitive differentiation."

The OpenAI researchers concluded that their findings required industry-wide changes to evaluation methods. "This change may steer the field toward more trustworthy AI systems," they wrote, while acknowledging that their research proved some level of unreliability would persist regardless of technical improvements.

For enterprises, the message appeared clear: AI hallucinations represented not a temporary engineering challenge, but a permanent mathematical reality requiring new governance frameworks and risk management strategies.
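To make the bound quoted above concrete, it can be restated as a one-line inequality in LaTeX; the symbols err_gen and err_IIV are shorthand labels introduced here, not the paper's own notation:

\[
  \operatorname{err}_{\mathrm{gen}} \;\geq\; 2 \cdot \operatorname{err}_{\mathrm{IIV}}
\]

where err_IIV is a model's misclassification rate on the binary "Is-It-Valid" task and err_gen is its rate of generating invalid outputs.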
www.computerworld.com
September 22, 2025 at 9:02 AM
Remind me that we all condemn political assassinations the next time it's July 20th...
September 13, 2025 at 2:21 AM
I think ARTE's YouTube channel has been hacked...
September 8, 2025 at 6:32 AM
#Leipzig is a little #Paris, but its streets and squares are still sealed over all the same...
September 4, 2025 at 4:25 PM
Reposted by Lupino
German studies power:
Angsthase, Naschkatze, Schnapsdrossel, Frechdachs, Spaßvogel, Schluckspecht (roughly: scaredy-cat, sweet tooth, boozer, cheeky rascal, joker, heavy drinker).

Is there any published research on such nominalization patterns with animal terms?
September 2, 2025 at 9:18 AM
Yeah, yeah, Germany is "only" responsible for 2% of global CO2 emissions, but how many cars from German manufacturers are fouling the air worldwide? How many fossil-fuel power plants generate electricity with gas turbines made by Siemens...?
August 27, 2025 at 7:30 AM