- Improved Qwen2.5-Math-7B's accuracy on the MATH benchmark from 58.8% to 90.0%
- Solved 53.3% of AIME problems, placing in the top 20% of participants
- Outperformed larger models on several key benchmarks
- Created a dataset of 747k math problems with verified solutions
- Each reasoning step pairs a natural-language explanation with Python code that validates it (see the sketch below)
- A self-evolution process spanning four rounds of mutual model improvement
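The validation idea is easy to demonstrate. Below is a minimal sketch in the spirit of code-augmented reasoning: a step is kept only if its attached Python snippet executes without error. The step format and the `validate_step` helper are illustrative assumptions, not the paper's actual implementation.

```python
import subprocess
import sys

def validate_step(code: str, timeout: float = 5.0) -> bool:
    """Accept a reasoning step only if its Python snippet executes cleanly."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout,
        )
        return proc.returncode == 0
    except subprocess.TimeoutExpired:
        return False

# One step of a solution: a prose explanation plus code that checks the claim.
step = {
    "explanation": "x^2 - 5x + 6 factors as (x - 2)(x - 3), so the roots are 2 and 3.",
    "code": "assert all(r*r - 5*r + 6 == 0 for r in (2, 3))",
}
print(validate_step(step["code"]))  # True -> keep the step; False -> discard it
```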
Key features of the new Ollama release:
$ OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q4_0 ollama serve
Added an experimental flag to set KV cache quantization to 4-bit, 8-bit, or 16-bit, which reduces VRAM requirements for longer context windows; the command above enables the 4-bit variant.
*Note: in the future, flash attention will be enabled by default where available, with KV cache quantization available on a per-model basis.*
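A back-of-the-envelope estimate shows where the savings come from. The model shape below (32 layers, 8 KV heads, head dimension 128, roughly an 8B-class model) and the per-element byte costs for the quantized formats are assumptions based on llama.cpp's block layouts, not Ollama internals:

```python
# Rough KV cache sizing for an assumed 8B-class model shape.
def kv_cache_bytes(ctx, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elem=2.0):
    # K and V each hold ctx * n_kv_heads * head_dim elements per layer.
    return 2 * n_layers * ctx * n_kv_heads * head_dim * bytes_per_elem

# Per-element costs: f16 is 2 bytes; q8_0 and q4_0 store 32-element blocks
# of 34 and 18 bytes respectively (data plus a per-block scale).
for name, bpe in [("f16", 2.0), ("q8_0", 34 / 32), ("q4_0", 18 / 32)]:
    gib = kv_cache_bytes(32_768, bytes_per_elem=bpe) / 2**30
    print(f"{name:>5}: {gib:.1f} GiB at 32k context")
# f16: 4.0 GiB, q8_0: 2.1 GiB, q4_0: 1.1 GiB -> q4_0 cuts KV VRAM roughly 3.6x
```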
Limitations:
- Difficulty handling dynamic data
- Dependence on the model's maximum context length
Advantages:
- Instant generation, with no document-retrieval delay
- Fewer errors thanks to the pre-computed KV cache (sketched after this list)
- A simpler architecture with no separate retrieval component
- Faster query processing
- Better accuracy from a unified, complete context
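For concreteness, here is a minimal sketch of the pre-computed-cache idea using Hugging Face transformers. The model name, the `docs` string, and the `answer` helper are illustrative assumptions, not code from the article: the knowledge base is encoded into a KV cache once, and every query reuses that cache instead of running retrieval.

```python
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder; any causal LM works
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.float16, device_map="auto"
)

# 1) Encode the entire knowledge base ONCE into a KV cache.
docs = "Reference document 1: ...\nReference document 2: ...\n"
doc_ids = tok(docs, return_tensors="pt").input_ids.to(model.device)
with torch.no_grad():
    doc_cache = model(doc_ids, use_cache=True).past_key_values

# 2) Answer each query by extending the cached context: no retrieval step.
def answer(question: str, max_new_tokens: int = 128) -> str:
    q_ids = tok(question, return_tensors="pt").input_ids.to(model.device)
    ids = torch.cat([doc_ids, q_ids], dim=-1)
    cache = copy.deepcopy(doc_cache)  # keep the shared cache pristine per query
    out = model.generate(ids, past_key_values=cache, max_new_tokens=max_new_tokens)
    return tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True)

print(answer("Question: What does document 1 say?\nAnswer:"))
```

Only the query and answer tokens are processed per request, which is where the speed win over retrieve-then-read pipelines comes from; the price is the dynamic-data and context-length limitations listed above.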
The breakthrough comes with the Willow chip, a powerful quantum processor that reduces error rates as it scales, a challenge scientists have struggled with for three decades. Google also plans to use Willow to train neural networks. Welcome to the future.