#sparseattention
September 30, 2025 at 9:02 PM
FG‑Attn applies fine‑grained sparse attention to video diffusion, yielding an average 1.55× speedup (up to 1.65×) on a single NVIDIA H100 GPU when generating five‑second 480p clips. Read more: https://getnews.me/fg-attn-brings-fine-grained-sparse-attention-to-speed-video-diffusion/ #fgattn #sparseattention
September 24, 2025 at 8:07 AM
DeepSeek released its V3.2‑exp model on Monday, using sparse attention to halve API costs for long‑context tasks. The model weights are openly available on Hugging Face. Read more: https://getnews.me/deepseek-unveils-sparse-attention-model-to-halve-api-costs/ #deepseek #sparseattention
October 2, 2025 at 6:23 PM
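The cost argument behind claims like the one above is simple arithmetic: dense attention scores every query against every key, while sparse attention caps the number of keys each query attends to. A minimal illustrative sketch (not DeepSeek's actual sparse-attention mechanism; `attention_pair_count` and the 2,048-key budget are assumptions for the example):

```python
def attention_pair_count(seq_len, keys_per_query=None):
    """Count query-key score computations for one attention layer.

    Dense attention scores every query against every key (seq_len^2 pairs);
    sparse attention limits each query to a fixed key budget.
    """
    k = seq_len if keys_per_query is None else min(keys_per_query, seq_len)
    return seq_len * k

# At a 128k-token context with a hypothetical 2,048-key budget per query,
# sparse attention computes 62.5x fewer scores than dense attention.
dense = attention_pair_count(128_000)
sparse = attention_pair_count(128_000, 2_048)
```

The gap widens quadratically with context length, which is why the savings show up mainly in long-context workloads.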
FlashInfer v0.2 by @yzh119.bsky.social

FlashInfer is a library and kernel generator for Large Language Models that provides high-performance implementations of LLM GPU kernels such as FlashAttention, SparseAttention, PageAttention, Sampling, and more.
December 19, 2024 at 9:54 PM
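FlashInfer implements kernels like SparseAttention as fused CUDA code; as a frame of reference for what such a kernel computes, here is a minimal pure-Python sketch of single-query sparse attention over an explicit index set. This is illustrative only, not FlashInfer's API, and the name `keep_idx` is a hypothetical stand-in for the sparsity pattern:

```python
import math

def sparse_attention(q, keys, values, keep_idx):
    """Attend only to the keys listed in keep_idx (the sparse pattern).

    Skipped positions contribute no score, no softmax term, and no
    value accumulation, which is where the compute savings come from.
    """
    scale = 1.0 / math.sqrt(len(q))
    # Scaled dot-product scores, restricted to the kept positions.
    scores = [scale * sum(qi * ki for qi, ki in zip(q, keys[j])) for j in keep_idx]
    # Numerically stable softmax over the kept positions only.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted sum of the kept value vectors.
    out = [0.0] * len(values[0])
    for w, j in zip(weights, keep_idx):
        for d in range(len(out)):
            out[d] += w * values[j][d]
    return out
```

With `keep_idx` covering every position this reduces to dense attention; production kernels get their speed from fusing these steps and exploiting the sparsity pattern on the GPU, not from the pattern alone.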
ProxyAttn, a training‑free sparse attention method that uses representative heads, claims up to 10.3× faster raw attention and a 2.4× speed‑up in LLM pre‑fill. Read more: https://getnews.me/proxyattn-introduces-guided-sparse-attention-using-representative-heads/ #proxyattn #sparseattention
September 30, 2025 at 11:38 PM
DeepSeek unveils V3.2-exp model with sparse attention, slashing AI inference costs by up to 50%. A game-changer for long-context operations! #AI #DeepSeek #SparseAttention #TechInnovation Link: thedailytechfeed.com/deepseeks-v3...
September 30, 2025 at 3:42 PM
A new input‑aware sparse attention method cuts compute by up to 40% and enables real‑time co‑speech video generation with better lip‑sync. Submitted 2 Oct 2025. https://getnews.me/input-aware-sparse-attention-enables-real-time-co-speech-video-generation/ #sparseattention #realtime
October 6, 2025 at 6:25 AM