luokai
@luok.ai
For more AI & Tech content, visit www.luok.ai
🍎Apple Die Hard Fan| 苹果骨灰粉
🤖GenAI Observer | GenAI观察者
👨🏻🎤Cutting Edge Tech Enthusiast | 科技爱好者
Built by Magenta, it maps 300 genres as stars in 3D using UMAP, then blends nearby prompts into realtime audio via Lyria RealTime. Fly, click, anchor—your path becomes a dynamic mashup. Deployed via AI Studio with Cloud Run proxying the Gemini key.
magenta.withgoogle.com/spacedj-anno...
Space DJ: Navigating a Musical Universe
Today, we’re excited to launch Space DJ, a web application from Magenta that turns music exploration into an interactive journey through a constellation of sou...
magenta.withgoogle.com
November 11, 2025 at 3:03 PM
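A minimal sketch of the "blend nearby prompts" idea, assuming inverse-distance weighting over the k nearest stars. The weighting scheme, the `blend_weights` helper, and the toy coordinates are illustrative assumptions, not Space DJ's actual code; in the real app the 3D positions come from UMAP and the blended prompts are sent to Lyria RealTime.

```python
import math

def blend_weights(position, stars, k=3, eps=1e-6):
    """Return {genre: weight} for the k stars nearest to `position`.

    Weights are inverse-distance and normalized to sum to 1 (an assumed
    scheme for illustration, not the app's confirmed method).
    """
    # Distance from the listener's position to every genre star.
    dists = sorted((math.dist(position, xyz), name) for name, xyz in stars.items())
    nearest = dists[:k]
    # Closer stars get larger weights; eps avoids division by zero.
    inv = {name: 1.0 / (d + eps) for d, name in nearest}
    total = sum(inv.values())
    return {name: w / total for name, w in inv.items()}

# Hypothetical 3D coordinates standing in for a UMAP projection.
stars = {
    "techno": (0.0, 0.0, 0.0),
    "house": (1.0, 0.0, 0.0),
    "bossa nova": (0.0, 5.0, 0.0),
    "bluegrass": (9.0, 9.0, 9.0),
}

weights = blend_weights((0.25, 0.0, 0.0), stars, k=2)
print(weights)
```

Flying closer to one star raises its share of the mix, which is one plausible reading of "your path becomes a dynamic mashup."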
Customization: themes + full editability.
Pick from 9 themes or create your own. Edit text, fonts, layout; drag-and-drop elements; reorder slides. Audience styles: professional vs casual—switch per deck.
GitHub: github.com/allweonedev/...
GitHub - allweonedev/presentation-ai: ALLWEONE® Open source AI presentation generator Gamma Alternative. Create professional slides with customizable themes and AI-generated content in minutes.
github.com
November 11, 2025 at 2:38 PM
Local models via LM Studio/Ollama keep costs low and data local.
Core idea: AI-first deck generation.
Enter a topic, set slide count, language, and style. Generate an outline, tweak it, then auto-build slides with live rendering and autosave. Fast from prompt to presentable.
November 11, 2025 at 2:38 PM
Open-source play.
Positioned as an open alternative to commercial ASR suites—bring your own stack, inspect weights, adapt to niche languages, avoid lock-in.
November 11, 2025 at 2:35 PM
Mixed-language + dialects.
Design bias toward code-switching and dialect variance; expect better robustness vs monolingual-only baselines. Great for global media workflows.
November 11, 2025 at 2:35 PM
Low-data onboarding.
New languages can be added with just a few paired examples—bootstrapping via shared multilingual representations and cross-lingual transfer.
November 11, 2025 at 2:35 PM
Backbone: Omnilingual wav2vec 2.0 (7B).
A multilingual speech representation you can fine-tune for ASR or repurpose for tasks like diarization, keyword spotting, or alignment.
November 11, 2025 at 2:35 PM
Model family.
Multiple ASR sizes (300M→7B) let you trade off speed vs accuracy. Lightweight models for edge or real-time; larger ones for studio-grade transcription.
November 11, 2025 at 2:35 PM
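To make the size trade-off concrete, here is a back-of-envelope estimate of weight memory at the two ends of the family. Only the 300M and 7B endpoints come from the post; the 2 bytes per parameter (fp16/bf16) is an assumption, and the figure ignores activations, caches, and runtime overhead.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes).

    bytes_per_param=2 assumes 16-bit weights; use 4 for fp32.
    """
    return num_params * bytes_per_param / 1e9

# Endpoints of the model family mentioned in the post.
sizes = {"300M": 300e6, "7B": 7e9}

for name, n in sizes.items():
    print(f"{name}: ~{weight_memory_gb(n):.1f} GB of weights at 16-bit")
```

Under these assumptions the smallest model fits comfortably on modest hardware, while the 7B model needs a GPU with well over 14 GB free once runtime overhead is included.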
Core idea: universal coverage.
Most ASR focuses on well-represented languages; Omnilingual targets long-tail speech too—dialects, low-resource data, mixed-language clips—so creators and communities aren’t left out.
GitHub: github.com/facebookrese...
GitHub - facebookresearch/omnilingual-asr: Omnilingual ASR Open-Source Multilingual Speech Recognition for 1600+ Languages
github.com
November 11, 2025 at 2:35 PM
It’s a suite of models spanning 300M–7B parameters that transcribe speech in 1,600+ languages, including 500 that no ASR system had covered before. Think internet-scale coverage, low-resource inclusivity, and modular components you can remix.
Blog: ai.meta.com/blog/omnilin...
Omnilingual ASR: Advancing Automatic Speech Recognition for 1,600+ Languages
We’re introducing Meta Omnilingual Automatic Speech Recognition, a suite of models providing automatic speech recognition capabilities for over 1,600 languages.
ai.meta.com
November 11, 2025 at 2:35 PM
If engagement-growth feedback drives chaos, what architectural change would you try first—caps on virality, decentralizing influence, or friction for hot takes?
Paper: arxiv.org/abs/2508.03385
Can We Fix Social Media? Testing Prosocial Interventions using Generative Social Simulation
Social media platforms have been widely linked to societal harms, including rising polarization and the erosion of constructive debate. Can these problems be mitigated through prosocial interventions?...
arxiv.org
November 11, 2025 at 2:32 PM
Even a minimal, synthetic platform reproduces three dysfunctions: partisan echo chambers, elite concentration, and amplified polarized voices. Results: most interventions only modestly improve—or even worsen—outcomes.
November 11, 2025 at 2:32 PM
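One common way to put a number on "elite concentration" is the Gini coefficient of follower counts. This is a sketch under that assumption: the post does not say which metric the paper uses, and the follower lists below are made up for illustration.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 = perfectly equal,
    (n - 1) / n = one account holds everything."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n

# Hypothetical follower counts on a toy platform.
equal_platform = [1, 1, 1, 1]
elite_platform = [0, 0, 0, 10]

print(gini(equal_platform), gini(elite_platform))
```

Tracking a measure like this before and after an intervention is one way to check whether it actually spreads attention more evenly.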
Even a minimal, synthetic platform reproduces three dysfunctions: partisan echo chambers, elite concentration, and amplified polarized voices. Results: most interventions only modestly improve—or even worsen—outcomes.
November 11, 2025 at 2:31 PM
Available on desktop for Gemini Pro and Ultra, with mobile rolling out in days. Privacy chatter is heating up, but the promise is fewer tabs, more answers.
Key stats
Sources: Gmail, Drive, Chat. Platforms: Desktop now, mobile soon. Audience: Pro & Ultra users.
November 6, 2025 at 3:12 AM
Core: Sound & polish.
Audio likely blends Foley, libraries, and mix/master; AI assists with scratch VO, temp SFX, or cleanup—finals handled by audio pros.
November 6, 2025 at 1:52 AM
Core: Iteration speed.
AI accelerates boards, animatics, and style frames. Directors explore more options early, then lock decisions before heavy shots start.
November 6, 2025 at 1:52 AM
Core: Role clarity.
Crediting “AI artists” formalizes responsibilities: prompt craft, reference curation, style control, and handoff to CG—accountability equals quality.
November 6, 2025 at 1:52 AM
Core: Hybrid pipeline.
Traditional modeling, lighting, animation, and compositing stay central. AI slots in for concepting, look dev, texture variants, and quick alt passes—speed without sacrificing control.
November 6, 2025 at 1:52 AM