Stefan Baack
@sbaack.com
Senior researcher studying data governance and AI training data. Mastodon: @tootbaack@infosec.exchange he/him
Reposted by Stefan Baack
“AI is fake and sucks” vs “AI is real and dangerous” is a Twitter argument. In reality I think the debate also has a lot of “AI is real but not for how you’re using it,” to “AI is fake and that is dangerous,” to “things are happening to real people because of AI hype and that should stop.”
December 6, 2024 at 7:29 AM
It ended well though. He got the job, and still has it. We met recently 😅
February 21, 2024 at 9:48 PM
I still remember when a friend asked for advice about getting a job I intended to apply for
February 21, 2024 at 9:07 AM
Long term, there should be less reliance on sources like Common Crawl and a bigger emphasis on training generative AI on datasets created and curated by people in equitable and transparent ways (10/10)
February 6, 2024 at 4:03 PM
A key issue is that filtered Common Crawl versions are not updated after their original publication to take feedback and criticism into account. Therefore, we need dedicated intermediaries tasked with filtering Common Crawl in transparent and accountable ways that are continuously updated (9/10)
February 6, 2024 at 4:03 PM
AI builders should put more effort into filtering Common Crawl and establish industry standards and best practices for end-user products to reduce potential harms when using Common Crawl or similar sources for training data (8/10)
February 6, 2024 at 4:03 PM
Both Common Crawl and AI builders can help make generative AI less harmful. Common Crawl should highlight the limitations and biases of its data, be more transparent and inclusive about its governance, and enforce more transparency by requiring AI builders to attribute their use of Common Crawl (7/10)
February 6, 2024 at 4:03 PM