Scott B
@scottbartlett.bsky.social
New Data: 67% of Professionals See AI as a Near-Term or Immediate Job Threat
Our latest AI Pulse survey, taken by listeners of The Artificial Intelligence Show, shows that a significant majority of professionals (two-thirds) view AI as an immediate or near-term career threat, even as they rapidly integrate it into their daily workflows.
sdbart.co
November 11, 2025 at 5:30 PM
#AI Meta returns to open source AI with Omnilingual ASR models that can transcribe 1,600+ languages natively
Meta has just released a new multilingual automatic speech recognition (ASR) system supporting 1,600+ languages — dwarfing OpenAI’s open source Whisper model, which supports just 99. Its architecture also allows developers to extend that support to thousands more: through a feature called zero-shot in-context learning, users can provide a few paired examples of audio and text in a new language at inference time, enabling the model to transcribe additional utterances in that language without any retraining.

In practice, this expands potential coverage to more than 5,400 languages — roughly every spoken language with a known script. It’s a shift from static model capabilities to a flexible framework that communities can adapt themselves. So while the 1,600 languages reflect official training coverage, the broader figure represents Omnilingual ASR’s capacity to generalize on demand, making it the most extensible speech recognition system released to date.

Best of all: it's been open sourced under a plain Apache 2.0 license — not a restrictive, quasi-open-source Llama license like the company's prior releases, which limited use by larger enterprises unless they paid licensing fees — meaning researchers and developers are free to take and implement it right away, for free, without restrictions, even in commercial and enterprise-grade projects.

Released on November 10 on Meta's website and GitHub, along with a demo space on Hugging Face and a technical paper, Meta’s Omnilingual ASR suite includes a family of speech recognition models, a 7-billion-parameter multilingual audio representation model, and a massive speech corpus spanning over 350 previously underserved languages. All resources are freely available under open licenses, and the models support speech-to-text transcription out of the box. “By open sourcing these models and dataset, we aim to break down language barriers, expand digital access, and empower communities worldwide,” Meta posted on its @AIatMeta account on X.

Designed for Speech-to-Text Transcription

At its core, Omnilingual ASR is a speech-to-text system. The models are trained to convert spoken language into written text, supporting applications like voice assistants, transcription tools, subtitles, oral archive digitization, and accessibility features for low-resource languages. Unlike earlier ASR models that required extensive labeled training data, Omnilingual ASR includes a zero-shot variant. This version can transcribe languages it has never seen before, using just a few paired examples of audio and corresponding text. That dramatically lowers the barrier for adding new or endangered languages, removing the need for large corpora or retraining.

Model Family and Technical Design

The Omnilingual ASR suite includes multiple model families trained on more than 4.3 million hours of audio from 1,600+ languages:

* wav2vec 2.0 models for self-supervised speech representation learning (300M–7B parameters)
* CTC-based ASR models for efficient supervised transcription
* LLM-ASR models combining a speech encoder with a Transformer-based text decoder for state-of-the-art transcription
* An LLM-ZeroShot ASR model, enabling inference-time adaptation to unseen languages

All models follow an encoder–decoder design: raw audio is converted into a language-agnostic representation, then decoded into written text.
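For a sense of the developer workflow, here is a minimal usage sketch. It assumes the released checkpoints are exposed through a standard Hugging Face-style ASR pipeline; the model identifier and file name are illustrative assumptions, not taken from Meta's documentation.

# Hedged sketch: the checkpoint name below is an illustrative assumption,
# not a confirmed identifier from Meta's release.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="facebook/omnilingual-asr-7b",  # hypothetical model ID
)

# Direct transcription for one of the ~1,600 officially supported languages.
print(asr("clip_swahili.wav")["text"])

# The zero-shot variant would additionally accept a handful of paired
# (audio, transcript) examples at inference time and condition on them
# in-context, so a new language needs no retraining or fine-tuning.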
Why the Scale Matters

While Whisper and similar models have advanced ASR capabilities for global languages, they fall short on the long tail of human linguistic diversity. Whisper supports 99 languages. Meta’s system:

* Directly supports 1,600+ languages
* Can generalize to 5,400+ languages using in-context learning
* Achieves character error rates (CER) under 10% in 78% of supported languages

Among those supported are more than 500 languages never previously covered by any ASR model, according to Meta’s research paper. This expansion opens new possibilities for communities whose languages are often excluded from digital tools.

Background: Meta’s AI Overhaul and a Rebound from Llama 4

The release of Omnilingual ASR arrives at a pivotal moment in Meta’s AI strategy, following a year marked by organizational turbulence, leadership changes, and uneven product execution. Omnilingual ASR is the first major open-source model release since the rollout of Llama 4, Meta’s latest large language model, which debuted in April 2025 to mixed and ultimately poor reviews, with scant enterprise adoption compared to Chinese open source model competitors. The failure led Meta founder and CEO Mark Zuckerberg to appoint Alexandr Wang, co-founder and former CEO of AI data supplier Scale AI, as Chief AI Officer, and to embark on an extensive and costly hiring spree that shocked the AI and business communities with eye-watering pay packages for top AI researchers.

In contrast, Omnilingual ASR represents a strategic and reputational reset. It returns Meta to a domain where the company has historically led — multilingual AI — and offers a truly extensible, community-oriented stack with minimal barriers to entry. The system’s support for 1,600+ languages and its extensibility to over 5,000 more via zero-shot in-context learning reassert Meta’s engineering credibility in language technology. Importantly, it does so through a free and permissively licensed release, under Apache 2.0, with transparent dataset sourcing and reproducible training protocols.

This shift aligns with broader themes in Meta’s 2025 strategy. The company has refocused its narrative around a “personal superintelligence” vision, investing heavily in infrastructure (including a September release of custom AI accelerators and Arm-based inference stacks) while downplaying the metaverse in favor of foundational AI capabilities. The return to public training data in Europe after a regulatory pause also underscores its intention to compete globally, despite privacy scrutiny.

Omnilingual ASR, then, is more than a model release — it’s a calculated move to reassert control of the narrative: from the fragmented rollout of Llama 4 to a high-utility, research-grounded contribution that aligns with Meta’s long-term AI platform strategy.

Community-Centered Dataset Collection

To achieve this scale, Meta partnered with researchers and community organizations in Africa, Asia, and elsewhere to create the Omnilingual ASR Corpus, a 3,350-hour dataset across 348 low-resource languages.
Contributors were local speakers who were compensated for their recordings, which were gathered in collaboration with groups like:

* African Next Voices: a Gates Foundation–supported consortium including Maseno University (Kenya), the University of Pretoria, and Data Science Nigeria
* Mozilla Foundation’s Common Voice, supported through the Open Multilingual Speech Fund
* Lanfrica / NaijaVoices, which created data for 11 African languages including Igala, Serer, and Urhobo

The data collection focused on natural, unscripted speech. Prompts were designed to be culturally relevant and open-ended, such as “Is it better to have a few close friends or many casual acquaintances? Why?” Transcriptions used established writing systems, with quality assurance built into every step.

Performance and Hardware Considerations

The largest model in the suite, omniASR_LLM_7B, requires ~17GB of GPU memory for inference, making it suitable for deployment on high-end hardware. Smaller models (300M–1B) can run on lower-power devices and deliver real-time transcription speeds. Performance benchmarks show strong results even in low-resource scenarios:

* CER
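The CER metric behind these benchmark claims is simple to compute: the Levenshtein edit distance between the reference and hypothesis transcripts, divided by the reference length. A minimal, self-contained sketch (the example strings are invented):

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance / reference length."""
    m, n = len(reference), len(hypothesis)
    # Dynamic-programming table for Levenshtein distance, one row at a time.
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n] / max(m, 1)

# A CER under 0.10 means fewer than one edit per ten reference characters.
print(cer("habari ya asubuhi", "habari za asubuhi"))  # 1 edit / 17 chars ≈ 0.059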
sdbart.co
November 11, 2025 at 1:00 PM
The Federal Reserve Finally Admits It: AI Is Impacting the Job Market
Federal Reserve Chair Jerome Powell just delivered a sobering warning: The US labor market is slowing down, and AI is a key reason why.
sdbart.co
November 10, 2025 at 2:30 PM
#AI Terminal-Bench 2.0 launches alongside Harbor, a new framework for testing agents in containers
The developers of Terminal-Bench, a benchmark suite for evaluating the performance of autonomous AI agents on real-world terminal-based tasks, have released version 2.0 alongside Harbor, a new framework for testing, improving and optimizing AI agents in containerized environments.

The dual release aims to address long-standing pain points in testing and optimizing AI agents, particularly those built to operate autonomously in realistic developer environments. With a more difficult and rigorously verified task set, Terminal-Bench 2.0 replaces version 1.0 as the standard for assessing frontier model capabilities. Harbor, the accompanying runtime framework, enables developers and researchers to scale evaluations across thousands of cloud containers and integrates with both open-source and proprietary agents and training pipelines.

“Harbor is the package we wish we had had while making Terminal-Bench," wrote co-creator Alex Shaw on X. "It’s for agent, model, and benchmark developers and researchers who want to evaluate and improve agents and models."

Higher Bar, Cleaner Data

Terminal-Bench 1.0 saw rapid adoption after its release in May 2025, becoming a default benchmark for AI-powered agents operating in developer-style terminal environments. These agents interact with systems through the command line, mimicking how developers work behind the graphical user interface. However, the benchmark's broad scope came with inconsistencies: the community identified several tasks as poorly specified or unstable due to external service changes.

Version 2.0 addresses those issues directly. The updated suite includes 89 tasks, each subjected to several hours of manual and LLM-assisted validation. The emphasis is on making tasks solvable, realistic, and clearly specified, raising the difficulty ceiling while improving reliability and reproducibility. A notable example is the download-youtube task, which was removed or refactored in 2.0 due to its dependence on unstable third-party APIs.

“Astute Terminal-Bench fans may notice that SOTA performance is comparable to TB1.0 despite our claim that TB2.0 is harder,” Shaw noted on X. “We believe this is because task quality is substantially higher in the new benchmark.”

Harbor: Unified Rollouts at Scale

Alongside the benchmark update, the team launched Harbor, a new framework for running and evaluating agents in cloud-deployed containers. Harbor supports large-scale rollout infrastructure, with compatibility for major providers like Daytona and Modal. Designed to generalize across agent architectures, Harbor supports:

* Evaluation of any container-installable agent
* Scalable supervised fine-tuning (SFT) and reinforcement learning (RL) pipelines
* Custom benchmark creation and deployment
* Full integration with Terminal-Bench 2.0

Harbor was used internally to run tens of thousands of rollouts during the creation of the new benchmark. It is now publicly available via harborframework.com, with documentation for testing and submitting agents to the public leaderboard.

Early Results: GPT-5 Leads in Task Success

Initial results from the Terminal-Bench 2.0 leaderboard show OpenAI's Codex CLI (command line interface), a GPT-5-powered variant, in the lead with a 49.6% success rate — the highest among all agents tested so far. Close behind are other GPT-5 variants and Claude Sonnet 4.5-based agents.
Top 5 Agent Results (Terminal-Bench 2.0):

* Codex CLI (GPT-5) — 49.6%
* Codex CLI (GPT-5-Codex) — 44.3%
* OpenHands (GPT-5) — 43.8%
* Terminus 2 (GPT-5-Codex) — 43.4%
* Terminus 2 (Claude Sonnet 4.5) — 42.8%

The close clustering among top models indicates active competition across platforms, with no single agent solving more than half the tasks.

Submission and Use

To test or submit an agent, users install Harbor and run the benchmark using simple CLI commands. Submissions to the leaderboard require five benchmark runs, and results can be emailed to the developers along with job directories for validation:

harbor run -d terminal-bench@2.0 -m "<model>" -a "<agent>" --n-attempts 5 --jobs-dir <jobs_dir>

Terminal-Bench 2.0 is already being integrated into research workflows focused on agentic reasoning, code generation, and tool use. According to co-creator Mike Merrill, a postdoctoral researcher at Stanford, a detailed preprint is in progress covering the verification process and design methodology behind the benchmark.

Aiming for Standardization

The combined release of Terminal-Bench 2.0 and Harbor marks a step toward more consistent and scalable agent evaluation infrastructure. As LLM agents proliferate in developer and operational environments, the need for controlled, reproducible testing has grown. These tools offer a potential foundation for a unified evaluation stack — supporting model improvement, environment simulation, and benchmark standardization across the AI ecosystem.
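Since each leaderboard entry aggregates five runs, the reported score is presumably a mean success rate over those attempts. A toy aggregation sketch follows; the jobs-directory layout and results.json schema here are invented for illustration, and Harbor's actual output format may differ:

import json
from pathlib import Path
from statistics import mean

def success_rate(run_dir: Path) -> float:
    # Hypothetical layout: one results.json per run, listing per-task pass/fail.
    results = json.loads((run_dir / "results.json").read_text())
    outcomes = [task["passed"] for task in results["tasks"]]
    return sum(outcomes) / len(outcomes)

runs = sorted(Path("jobs").glob("run-*"))  # the five required attempts
rates = [success_rate(r) for r in runs]
print(f"mean success over {len(rates)} runs: {mean(rates):.1%}")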
sdbart.co
November 10, 2025 at 1:00 PM
#DataInfrastructure #AI Ship fast, optimize later: Top AI engineers don't care about cost — they're prioritizing deployment
Across industries, rising compute expenses are often cited as a barrier to AI adoption — but leading companies are finding that cost is no longer the real constraint. The tougher challenges (and the ones top of mind for many tech leaders)? Latency, flexibility and capacity.

At Wonder, for instance, AI adds a mere few cents per order; the food delivery and takeout company is much more concerned with cloud capacity amid skyrocketing demand. Recursion, for its part, has been focused on balancing small and larger-scale training and deployment via on-premises clusters and the cloud; this has afforded the biotech company flexibility for rapid experimentation.

The companies’ in-the-wild experiences highlight a broader industry trend: for enterprises operating AI at scale, economics aren't the decisive factor — the conversation has shifted from how to pay for AI to how fast it can be deployed and sustained. AI leaders from the two companies recently sat down with VentureBeat’s CEO and editor-in-chief Matt Marshall as part of VB’s traveling AI Impact Series. Here’s what they shared.

Wonder: Rethink what you assume about capacity

Wonder uses AI to power everything from recommendations to logistics — yet, as of now, reported CTO James Chen, AI adds just a few cents per order. Chen explained that the technology component of a meal order costs 14 cents and the AI 2 to 3 cents, although the latter is “going up really rapidly,” to 5 to 8 cents. Still, that seems almost immaterial compared to total operating costs. Instead, the 100% cloud-native company’s main concern has been capacity amid growing demand.

Wonder was built on “the assumption” (which proved to be incorrect) that there would be “unlimited capacity,” so the team could move “super fast” and wouldn’t have to worry about managing infrastructure, Chen noted. But the company has grown quite a bit over the last few years, he said; as a result, about six months ago, “we started getting little signals from the cloud providers, ‘Hey, you might need to consider going to region two,’” because providers were running out of capacity for CPU or data storage at their facilities as demand grew. It was “very shocking” that they had to move to plan B earlier than anticipated. “Obviously it's good practice to be multi-region, but we were thinking maybe two more years down the road,” said Chen.

What's not economically feasible (yet)

Wonder built its own model to maximize its conversion rate, Chen noted; the goal is to surface new restaurants to relevant customers as much as possible. These are “isolated scenarios” where models are trained over time to be “very, very efficient and very fast.” Currently, the best bet for Wonder’s use case is large models, Chen noted. But in the long term, the company would like to move to small models that are hyper-customized to individuals (via AI agents or concierges) based on their purchase history and even their clickstream. “Having these micro models is definitely the best, but right now the cost is very expensive,” Chen noted. “If you try to create one for each person, it's just not economically feasible.”

Budgeting is an art, not a science

Wonder gives its devs and data scientists as much playroom as possible to experiment, and internal teams review the costs of use to make sure nobody turned on a model and “jacked up massive compute around a huge bill,” said Chen. The company is trying different things to offload to AI and operate within margins. “But then it's very hard to budget because you have no idea,” he said.
One of the challenging things is the pace of development; when a new model comes out, “we can’t just sit there, right? We have to use it.” Budgeting for the unknown economics of a token-based system is “definitely art versus science.”

A critical component in the software development lifecycle is preserving context when using large native models, he explained. When you find something that works, you can add it to your company’s “corpus of context” that gets sent with every request. That corpus is big, and it costs money each time. “Over 50%, up to 80% of your costs is just resending the same information back into the same engine again on every request,” said Chen. In theory, the more volume they handle, the lower the cost per unit should be. “I know when a transaction happens, I'll pay the X cent tax for each one, but I don't want to be limited to use the technology for all these other creative ideas."
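Chen's 50-to-80% figure is easy to sanity-check with a toy cost model; every number below is an invented assumption for illustration, not a Wonder figure:

# Back-of-the-envelope sketch of the resent-context effect Chen describes.
# Token counts and price are illustrative assumptions, not Wonder's figures.
PRICE_PER_1K_INPUT_TOKENS = 0.005   # assumed $ per 1,000 input tokens
CONTEXT_TOKENS = 6_000              # shared "corpus of context" sent every call
UNIQUE_TOKENS = 1_500               # tokens specific to each request

def input_cost(context_tokens: int, unique_tokens: int, price_per_1k: float) -> float:
    # Total input spend for one request, resent context included.
    return (context_tokens + unique_tokens) / 1_000 * price_per_1k

cost = input_cost(CONTEXT_TOKENS, UNIQUE_TOKENS, PRICE_PER_1K_INPUT_TOKENS)
context_share = CONTEXT_TOKENS / (CONTEXT_TOKENS + UNIQUE_TOKENS)
print(f"input cost per request: ${cost:.4f}")
print(f"share of input spend that is resent context: {context_share:.0%}")  # 80%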
The 'vindication moment' for Recursion

Recursion, for its part, has focused on meeting broad-ranging compute needs via a hybrid infrastructure of on-premise clusters and cloud inference. When initially looking to build out its AI infrastructure, the company had to go with its own setup, as “the cloud providers didn't have very many good offerings,” explained CTO Ben Mabey. “The vindication moment was that we needed more compute and we looked to the cloud providers and they were like, ‘Maybe in a year or so.’”

The company’s first cluster in 2017 incorporated Nvidia gaming GPUs (1080s, launched in 2016); it has since added Nvidia H100s and A100s, and uses a Kubernetes cluster that runs in the cloud or on-prem. Addressing the longevity question, Mabey noted: “These gaming GPUs are actually still being used today, which is crazy, right? The myth that a GPU's life span is only three years, that's definitely not the case. A100s are still top of the list, they're the workhorse of the industry.”

Best use cases on-prem vs. cloud; cost differences

More recently, Mabey’s team has been training a foundation model on Recursion’s image repository, which spans petabytes of data. This and other types of big training jobs have required a “massive cluster” and connected, multi-node setups. “When we need that fully-connected network and access to a lot of our data in a high parallel file system, we go on-prem,” he explained.

On the other hand, shorter workloads run in the cloud. Recursion’s method is to “pre-empt” GPUs and Google tensor processing units (TPUs) — that is, to use instances that can be interrupted when higher-priority tasks need the hardware. “Because we don't care about the speed in some of these inference workloads where we're uploading biological data, whether that's an image or sequencing data, DNA data,” Mabey explained. “We can say, ‘Give this to us in an hour,’ and we're fine if it kills the job.”

From a cost perspective, moving large workloads on-prem is “conservatively” 10 times cheaper, Mabey noted; over a five-year TCO, it's half the cost. On the other hand, for smaller storage needs, the cloud can be “pretty competitive” cost-wise. Ultimately, Mabey urged tech leaders to step back and determine whether they’re truly willing to commit to AI; cost-effective solutions typically require multi-year buy-ins. “From a psychological perspective, I've seen peers of ours who will not invest in compute, and as a result they're always paying on demand," said Mabey. "Their teams use far less compute because they don't want to run up the cloud bill. Innovation really gets hampered by people not wanting to burn money.”
sdbart.co
November 7, 2025 at 10:30 PM
New Report: 75 Percent of Companies Surveyed Already See Positive ROI from Generative AI
Companies are getting results with generative AI, according to a new report by The Wharton School of the University of Pennsylvania.  
sdbart.co
November 7, 2025 at 5:30 PM
Google’s New AI Tool: A Cool Concept but Not Ready for Showtime
Google Labs has launched a new experimental AI marketing tool called Pomelli, built to automate campaign creation while staying true to your brand's voice.
sdbart.co
November 7, 2025 at 2:30 PM
Nvidia’s $5 Trillion Milestone
Chip maker Nvidia has officially become the world's first $5 trillion company.
sdbart.co
November 6, 2025 at 5:30 PM
The Alarming Rise of Nudify Apps and the Inability to Stop Deepfakes
A disturbing new wave of AI applications, often called “nudify” apps, is triggering major concern as they spread rapidly across platforms such as Telegram and Discord.
sdbart.co
November 6, 2025 at 2:30 PM
#AI Forget Fine-Tuning: SAP’s RPT-1 Brings Ready-to-Use AI for Business Tasks
SAP aims to displace more general large language models with the release of its own foundational “tabular” model, which the company claims will reduce training requirements for enterprises. The model, called SAP RPT-1, is pre-trained with business and enterprise knowledge out of the box. SAP calls it a Relational Foundation Model, meaning it can make predictions based on relational databases even without fine-tuning or additional training.

Walter Sun, SAP's global head of AI, told VentureBeat in an interview that the value of the new model lies in its ability to perform various enterprise tasks, such as predictive analytics, out of the box. “Everyone knows about language models, and there’s a bunch of good ones that already exist,” Sun said. “But we trained the model on data on business transactions, basically Excel spreadsheets, and so we have a model that can do predictive analytics where the value is that it’s out of the box, meaning you don’t need to have specifics of a company to do tasks analogous to a language model.”

Sun said that right out of the gate, RPT-1 can essentially build out a business model for enterprises based on knowledge gained from SAP’s decades of data. Organizations can plug the model directly into applications, even without additional fine-tuning. RPT-1, SAP’s first large family of AI models, will be generally available in Q4 of 2025 and be deployed via SAP’s AI Foundation. While RPT-1 is currently available, the company stated that additional models will be made available soon, including an open-source, state-of-the-art model. SAP will also release a no-code playground environment for experimenting with the model.

Tabular models vs. LLMs

Tabular or relational AI models learn from spreadsheets, unlike LLMs, which learn from text and code. RPT-1 not only understands numbers and the relationships between different cells; it is also able to provide more structured and precise answers. When enterprises decide to use RPT-1, they can add more direction to the model through a bit of context engineering, since the model is semantically aware and learns based on how it is being used.

SAP researchers first proposed the idea that tabular models can both exhibit semantic awareness and learn from content in a paper published in June. The paper introduced ConTextTab, a context-aware pretraining approach that utilizes semantic signals, such as table headers or column types, to guide model training, enabling the model to build a relational structure with the data. It’s this architecture that makes the model work best for tasks with precise answers, such as financial or enterprise use cases. The RPT models build on the ConTextTab work, letting them learn structured business data, say from SAP’s knowledge graph, and then add more context through usage. SAP researchers tested ConTextTab against benchmarks, saying it “is competitive” with similar models like TabPFN and TabICL.

Industry-specific models continue to grow

Many enterprises prefer to fine-tune general LLMs like GPT-5 or Claude, essentially retraining the model to answer only questions relevant to their business. However, a shift toward industry-specific models has begun to take root. Sun said that his experience at a previous company, building a very narrow, highly customized AI model for sentiment analysis, influenced a lot of what makes RPT-1 different.
“It was a very customized model, a narrow model that takes specific feedback for specific products, but it wasn’t scalable,” Sun said. “When LLMs came about, that one model measures sentiment. But there are use cases that we can do that LLMs cannot do.” He said these use cases include predictions, such as determining when a shopper will return to a grocery store, which may involve numerical analysis along with an understanding of the shopper’s buying habits.

However, some LLMs have begun integrating into spreadsheets, and AI model providers encourage users to upload similar data to teach them context. Microsoft added new capabilities to Copilot, including the ability to work in Excel. Anthropic integrated its Claude model with Excel, complementing its Claude for Finance service. Chinese startup Manus also offers a data visualization tool that understands spreadsheets, and ChatGPT can create charts from uploaded spreadsheets and other data sources. However, SAP argues that RPT-1 does more than just read a spreadsheet; it should stand out among its competitors because it requires fewer additional pieces of information about a business to provide its responses.
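The article doesn't include a code interface for RPT-1, but the underlying idea (a pre-trained tabular model that predicts on a new table without task-specific training) can be sketched with TabPFN, one of the baselines ConTextTab was benchmarked against. A minimal example on a stand-in dataset:

# Illustrates the tabular-foundation-model idea with TabPFN, a baseline SAP
# benchmarked ConTextTab against -- this is not SAP's RPT-1 API.
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier  # pip install tabpfn

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# No gradient training happens here: "fit" conditions the pre-trained model
# on the table, and predictions come from a forward pass.
clf = TabPFNClassifier()
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))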
sdbart.co
November 6, 2025 at 1:00 PM
#Business #BusinessArtificialIntelligence Meet the Chinese Startup Using AI—and a Small Army of Workers—to Train Robots
AgiBot is using AI-powered robots to do new manufacturing tasks. Smarter machines may transform physical labor in China.
sdbart.co
November 5, 2025 at 7:45 PM
New Benchmark Shows AI Agents Perform Poorly When Automating Real Jobs
A new paper from the Center for AI Safety and Scale AI has introduced the Remote Labor Index (RLI), the first benchmark designed to measure how well AI agents can perform paid, remote jobs.
sdbart.co
November 5, 2025 at 5:31 PM
What Mercor’s $10B Valuation Could Mean for the Future of Work
A startup that connects AI labs with knowledge experts just quintupled its valuation to $10 billion after a massive $350 million Series C round.
sdbart.co
November 5, 2025 at 2:30 PM
#AI 98% of market researchers use AI daily, but 4 in 10 say it makes errors — revealing a major trust problem
Market researchers have embraced artificial intelligence at a staggering pace, with 98% of professionals now incorporating AI tools into their work and 72% using them daily or more frequently, according to a new industry survey that reveals both the technology's transformative promise and its persistent reliability problems.

The findings, based on responses from 219 U.S. market research and insights professionals surveyed in August 2025 by QuestDIY, a research platform owned by The Harris Poll, paint a picture of an industry caught between competing pressures: the demand to deliver faster business insights and the burden of validating everything AI produces to ensure accuracy. While more than half of researchers — 56% — report saving at least five hours per week using AI tools, nearly four in ten say they've experienced "increased reliance on technology that sometimes produces errors." An additional 37% report that AI has "introduced new risks around data quality or accuracy," and 31% say the technology has "led to more work re-checking or validating AI outputs."

The disconnect between productivity gains and trustworthiness has created what amounts to a grand bargain in the research industry: professionals accept time savings and enhanced capabilities in exchange for constant vigilance over AI's mistakes, a dynamic that may fundamentally reshape how insights work gets done.

How market researchers went from AI skeptics to daily users in less than a year

The numbers suggest AI has moved from experiment to infrastructure in record time. Among those using AI daily, 39% deploy it once per day, while 33% use it "several times per day or more," according to the survey conducted between August 15-19, 2025. Adoption is accelerating: 80% of researchers say they're using AI more than they were six months ago, and 71% expect to increase usage over the next six months. Only 8% anticipate their usage will decline.

“While AI provides excellent assistance and opportunities, human judgment will remain vital,” Erica Parker, Managing Director of Research Products at The Harris Poll, told VentureBeat. “The future is a teamwork dynamic where AI will accelerate tasks and quickly unearth findings, while researchers will ensure quality and provide high-level consultative insights.”

The top use cases reflect AI's strength in handling data at scale: 58% of researchers use it for analyzing multiple data sources, 54% for analyzing structured data, 50% for automating insight reports, 49% for analyzing open-ended survey responses, and 48% for summarizing findings. These tasks — traditionally labor-intensive and time-consuming — now happen in minutes rather than hours.

Beyond time savings, researchers report tangible quality improvements. Some 44% say AI improves accuracy, 43% report it helps surface insights they might otherwise have missed, 43% cite increased speed of insights delivery, and 39% say it sparks creativity. The overwhelming majority — 89% — say AI has made their work lives better, with 25% describing the improvement as "significant."

The productivity paradox: saving time while creating new validation work

Yet the same survey reveals deep unease about the technology's reliability. The list of concerns is extensive: 39% of researchers report increased reliance on error-prone technology, 37% cite new risks around data quality or accuracy, 31% describe additional validation work, 29% report uncertainty about job security, and 28% say AI has raised concerns about data privacy and ethics.
The report notes that "accuracy is the biggest frustration with AI experienced by researchers when asked on an open-ended basis." One researcher captured the tension succinctly: "The faster we move with AI, the more we need to check if we're moving in the right direction."

This paradox — saving time while simultaneously creating new work — reflects a fundamental characteristic of current AI systems, which can produce outputs that appear authoritative but contain what researchers call "hallucinations," or fabricated information presented as fact. The challenge is particularly acute in a profession where credibility depends on methodological rigor and where incorrect data can lead clients to make costly business decisions.

"Researchers view AI as a junior analyst, capable of speed and breadth, but needing oversight and judgment," said Gary Topiol, Managing Director at QuestDIY, in the report. That metaphor — AI as junior analyst — captures the industry's current operating model. Researchers treat AI outputs as drafts requiring senior review rather than finished products, a workflow that provides guardrails but also underscores the technology's limitations.

Why data privacy fears are the biggest obstacle to AI adoption in research

When asked what would limit AI use at work, researchers identified data privacy and security concerns as the greatest barrier, cited by 33% of respondents. This concern isn't abstract: researchers handle sensitive customer data, proprietary business information, and personally identifiable information subject to regulations like GDPR and CCPA. Sharing that data with AI systems — particularly cloud-based large language models — raises legitimate questions about who controls the information and whether it might be used to train models accessible to competitors.

Other significant barriers include time to experiment and learn new tools (32%), training (32%), integration challenges (28%), internal policy restrictions (25%), and cost (24%). An additional 31% cited lack of transparency in AI use as a concern, which could complicate explaining results to clients and stakeholders.

The transparency issue is particularly thorny. When an AI system produces an analysis or insight, researchers often cannot trace how the system arrived at its conclusion — a problem that conflicts with the scientific method's emphasis on replicability and clear methodology. Some clients have responded by including no-AI clauses in their contracts, forcing researchers to either avoid the technology entirely or use it in ways that don't technically violate contractual terms but may blur ethical lines.

"Onboarding beats feature bloat," Parker said in the report. "The biggest brakes are time to learn and train. Packaged workflows, templates, and guided setup all unlock usage faster than piling on capabilities."

Inside the new workflow: treating AI like a junior analyst who needs constant supervision

Despite these challenges, researchers aren't abandoning AI — they're developing frameworks to use it responsibly. The consensus model, according to the survey, is "human-led research supported by AI," where AI handles repetitive tasks like coding, data cleaning, and report generation while humans focus on interpretation, strategy, and business impact. About one-third of researchers (29%) describe their current workflow as "human-led with significant AI support," while 31% characterize it as "mostly human with some AI help."
Looking ahead to 2030, 61% envision AI as a "decision-support partner" with expanded capabilities including generative features for drafting surveys and reports (56%), AI-driven synthetic data generation (53%), automation of core processes like project setup and coding (48%), predictive analytics (44%), and deeper cognitive insights (43%).

The report describes an emerging division of labor where researchers become "Insight Advocates" — professionals who validate AI outputs, connect findings to stakeholder challenges, and translate machine-generated analysis into strategic narratives that drive business decisions. In this model, technical execution becomes less central to the researcher's value proposition than judgment, context, and storytelling. "AI can surface missed insights — but it still needs a human to judge what really matters," Topiol said in the report.

What other knowledge workers can learn from the research industry's AI experiment

The market research industry's AI adoption may presage similar patterns in other knowledge work professions where the technology promises to accelerate analysis and synthesis. The experience of researchers — early AI adopters who have integrated the technology into daily workflows — offers lessons about both opportunities and pitfalls.

First, speed genuinely matters. One boutique agency research lead quoted in the report described watching survey results accumulate in real time after fielding: "After submitting it for fielding, I literally watched the survey count climb and finish the same afternoon. It was a remarkable turnaround." That velocity enables researchers to respond to business questions within hours rather than weeks, making insights actionable while decisions are still being made rather than after the fact.

Second, the productivity gains are real but uneven. Saving five hours per week represents meaningful efficiency for individual contributors, but those savings can disappear if spent validating AI outputs or correcting errors. The net benefit depends on the specific task, the quality of the AI tool, and the user's skill in prompting and reviewing the technology's work.

Third, the skills required for research are changing. The report identifies future competencies including cultural fluency, strategic storytelling, ethical stewardship, and what it calls "inquisitive insight advocacy" — the ability to ask the right questions, validate AI outputs, and frame insights for maximum business impact. Technical execution, while still important, becomes less differentiating as AI handles more of the mechanical work.

The strange phenomenon of using technology intensively while questioning its reliability

The survey's most striking finding may be the persistence of trust issues despite widespread adoption. In most technology adoption curves, trust builds as users gain experience and tools mature. But with AI, researchers appear to be using tools intensively while simultaneously questioning their reliability — a dynamic driven by the technology's pattern of performing well most of the time but failing unpredictably.

This creates a verification burden that has no obvious endpoint. Unlike traditional software bugs that can be identified and fixed, AI systems' probabilistic nature means they may produce different outputs for the same inputs, making it difficult to develop reliable quality assurance processes.

The data privacy concerns — cited by 33% as the biggest barrier to adoption — reflect a different dimension of trust.
Researchers worry not just about whether AI produces accurate outputs but also about what happens to the sensitive data they feed into these systems. QuestDIY's approach, according to the report, is to build AI directly into a research platform with ISO/IEC 27001 certification rather than requiring researchers to use general-purpose tools like ChatGPT that may store and learn from user inputs.

"The center of gravity is analysis at scale — fusing multiple sources, handling both structured and unstructured data, and automating reporting," Topiol said in the report, describing where AI delivers the most value.

The future of research work: elevation or endless verification?

The report positions 2026 as an inflection point when AI moves from being a tool researchers use to something more like a team member — what the authors call a "co-analyst" that participates in the research process rather than merely accelerating specific tasks.

This vision assumes continued improvement in AI capabilities, particularly in areas where researchers currently see the technology as underdeveloped. While 41% currently use AI for survey design, 37% for programming, and 30% for proposal creation, most researchers consider these appropriate use cases, suggesting significant room for growth once the tools become more reliable or the workflows more structured.

The human-led model appears likely to persist. "The future is human-led, with AI as a trusted co-analyst," Parker said in the report. But what "human-led" means in practice may shift. If AI handles most analytical tasks and researchers focus on validation and strategic interpretation, the profession may come to resemble editorial work more than scientific analysis — curating and contextualizing machine-generated insights rather than producing them from scratch. "AI gives researchers the space to move up the value chain – from data gatherers to Insight Advocates, focused on maximising business impact," Topiol said in the report.

Whether this transformation marks an elevation of the profession or a deskilling depends partly on how the technology evolves. If AI systems become more transparent and reliable, the verification burden may decrease and researchers can focus on higher-order thinking. If they remain opaque and error-prone, researchers may find themselves trapped in an endless cycle of checking work produced by tools they cannot fully trust or explain.

The survey data suggests researchers are navigating this uncertainty by developing a form of professional muscle memory — learning which tasks AI handles well, where it tends to fail, and how much oversight each type of output requires. This tacit knowledge, accumulated through daily use and occasional failures, may become as important to the profession as statistical literacy or survey design principles.

Yet the fundamental tension remains unresolved. Researchers are moving faster than ever, delivering insights in hours instead of weeks, and handling analytical tasks that would have been impossible without AI. But they're doing so while shouldering a new responsibility that previous generations never faced: serving as the quality control layer between powerful but unpredictable machines and business leaders making million-dollar decisions.

The industry has made its bet. Now comes the harder part: proving that human judgment can keep pace with machine speed — and that the insights produced by this uneasy partnership are worth the trust clients place in them.
sdbart.co
November 5, 2025 at 1:00 PM
How OpenAI’s Autonomous AI Researcher Could Reshape the Economy
OpenAI’s leadership just pulled back the curtain on one of its most consequential updates of the year.
sdbart.co
November 4, 2025 at 5:31 PM
OpenAI Is Now a For-Profit Company, Paving the Way for a Possible $1 Trillion IPO
OpenAI has officially completed its long-anticipated restructuring, converting from a nonprofit into a traditional for-profit public benefit corporation.
sdbart.co
November 4, 2025 at 2:30 PM
#Business #BusinessBigTech Kara Swisher Would Rather Work for Sam Altman Than Mark Zuckerberg
But ideally, the journalist and podcast host would rather not work for tech CEOs at all.
sdbart.co
November 4, 2025 at 1:45 PM
#DataInfrastructure AI coding transforms data engineering: How dltHub's open-source Python library helps developers create data pipelines for AI in minutes
A quiet revolution is reshaping enterprise data engineering. Python developers are building production data pipelines in minutes using tools that would have required entire specialized teams just months ago.

The catalyst is dlt, an open-source Python library that automates complex data engineering tasks. The tool has reached 3 million monthly downloads and powers data workflows for over 5,000 companies across regulated industries including finance, healthcare and manufacturing. That technology is getting another solid vote of confidence today as dltHub, the Berlin-based company behind the open-source dlt library, raises $8 million in seed funding led by Bessemer Venture Partners.

What makes this significant isn't just adoption numbers. It's how developers are using the tool in combination with AI coding assistants to accomplish tasks that previously required infrastructure engineers, DevOps specialists and on-call personnel. The company is building a cloud-hosted platform that extends its open-source library into a complete end-to-end solution. The platform will allow developers to deploy pipelines, transformations and notebooks with a single command, without worrying about infrastructure. This represents a fundamental shift: data engineering no longer requires specialized teams and is becoming accessible to any Python developer.

"Any Python developer should be able to bring their business users closer to fresh, reliable data," Matthaus Krzykowski, dltHub's co-founder and CEO, told VentureBeat in an exclusive interview. "Our mission is to make data engineering as accessible, collaborative and frictionless as writing Python itself."

From SQL to Python-native data engineering

The problem the company set out to solve emerged from real-world frustrations, one core set of which comes from a fundamental clash between how different generations of developers work with data. Krzykowski noted that one generation of developers is grounded in SQL and relational database technology, while another is building AI agents with Python.

This divide reflects deeper technical challenges. SQL-based data engineering locks teams into specific platforms and requires extensive infrastructure knowledge. Python developers working on AI need lightweight, platform-agnostic tools that work in notebooks and integrate with LLM coding assistants. The dlt library changes this equation by automating complex data engineering tasks in simple Python code. "If you know what a function in Python is, what a list is, a source and resource, then you can write this very declarative, very simple code," Krzykowski explained.

The key technical breakthrough addresses schema evolution automatically. When data sources change their output format, traditional pipelines break. "DLT has mechanisms to automatically resolve these issues," Thierry Jean, founding engineer at dltHub, told VentureBeat. "So it will push data, and you can say, alert me if things change upstream, or just make it flexible enough and change the data and the destination in a way to accommodate these things."

Real-world developer experience

Hoyt Emerson, data consultant and content creator at The Full Data Stack, recently adopted the tool for a job with a challenge to solve: he needed to move data from Google Cloud Storage to multiple destinations, including Amazon S3 and a data warehouse. Traditional approaches would require platform-specific knowledge for each destination.
Emerson told VentureBeat that what he really wanted was a much more lightweight, platform-agnostic way to send data from one spot to another. "That's when DLT gave me the aha moment," Emerson said. He completed the entire pipeline in five minutes using the library's documentation, which made it easy to get up and running without issue.

The process gets even more powerful when combined with AI coding assistants. Emerson noted that he's using agentic AI coding principles and realized that the dlt documentation could be sent as context to an LLM to accelerate and automate his data work. With the documentation as context, Emerson was able to create reusable templates for future projects and used AI assistants to generate deployment configurations. "It's extremely LLM friendly because it's very well documented," he said.

The LLM-Native development pattern

This combination of well-documented tools and AI assistance represents a new development pattern. The company has optimized specifically for what it calls "YOLO mode" development, where developers copy error messages and paste them into AI coding assistants. "A lot of these people are literally just copying and pasting error messages and are trying the code editors to figure it out," Krzykowski said. The company takes this behavior seriously enough that it fixes issues specifically for AI-assisted workflows.

The results speak to the approach's effectiveness. In September alone, users created over 50,000 custom connectors using the library — a 20x increase since January, driven largely by LLM-assisted development.

Technical architecture for enterprise scale

The dlt design philosophy prioritizes interoperability over platform lock-in. The tool can deploy anywhere, from AWS Lambda to existing enterprise data stacks. It integrates with platforms like Snowflake while maintaining the flexibility to work with any destination. "We always believe that DLT needs to be interoperable and modular," Krzykowski explained. "It can be deployed anywhere. It can be on Lambda. It often becomes part of other people's data infrastructures."

Key technical capabilities include:

* Automatic schema evolution: handles upstream data changes without breaking pipelines or requiring manual intervention.
* Incremental loading: processes only new or changed records, reducing computational overhead and costs.
* Platform-agnostic deployment: works across cloud providers and on-premises infrastructure without modification.
* LLM-optimized documentation: structured specifically for AI assistant consumption, enabling rapid problem-solving and template generation.

The platform currently supports over 4,600 REST API data sources, with continuous expansion driven by user-generated connectors.

Competing against ETL giants with a code-first approach

The data engineering landscape splits into distinct camps, each serving different enterprise needs and developer preferences. Traditional ETL platforms like Informatica and Talend dominate enterprise environments with GUI-based tools that require specialized training but offer comprehensive governance features. Newer SaaS platforms like Fivetran have gained traction by emphasizing pre-built connectors and managed infrastructure, reducing operational overhead but creating vendor dependency. The open-source dlt library occupies a fundamentally different position: code-first, LLM-native infrastructure that developers can extend and customize, as the sketch below illustrates.
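For a sense of what that code-first position means in practice, here is a minimal sketch using dlt's documented pipeline and resource API; the destination, table name, and sample rows are arbitrary choices for illustration:

import dlt

@dlt.resource(table_name="orders", write_disposition="append")
def orders():
    # Any iterable of Python dicts works; dlt infers the schema on load.
    yield {"id": 1, "status": "shipped", "total": 42.50}
    # A new field simply becomes a new column -- automatic schema evolution.
    yield {"id": 2, "status": "pending", "total": 17.99, "coupon": "WELCOME"}

pipeline = dlt.pipeline(
    pipeline_name="shop",
    destination="duckdb",  # swappable for S3, Snowflake, etc. without code changes
    dataset_name="raw",
)
print(pipeline.run(orders()))  # prints load info: tables, rows, destination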
This positioning reflects the broader shift toward what the industry calls the composable data stack, where enterprises build infrastructure from interoperable components rather than monolithic platforms. More importantly, the intersection with AI creates new market dynamics. "LLMs aren't replacing data engineers," Krzykowski said. "But they radically expand their reach and productivity."

What this means for enterprise data leaders

For enterprises looking to lead in AI-driven operations, this development represents an opportunity to fundamentally rethink data engineering strategies. The immediate tactical advantages are clear: organizations can leverage existing Python developers instead of hiring specialized data engineering teams. Organizations that adapt their tooling and hiring approaches to leverage this trend may find significant cost and agility advantages over competitors still dependent on traditional, team-intensive data engineering.

The question isn't whether this shift toward democratized data engineering will occur. It's how quickly enterprises adapt to capitalize on it.
sdbart.co
November 4, 2025 at 1:00 PM
OpenAI’s Sora 2 Gets a Product Roadmap
OpenAI is pushing ahead with its new video generation app and model, Sora 2, detailing a feature-packed product roadmap that seems aimed at building the next massive social media feed.
sdbart.co
November 3, 2025 at 5:30 PM