Lightnews — Scholar-powered news

Mariusz Kurman

@mkurman.bsky.social

34 followers 30 following 16 posts

AI Tech Lead @ Kruk SA | CEO @ MedIT Solutions | MD | Medcases.io app creator

Mariusz Kurman

@mkurman.bsky.social

Here is my experimental Llama 3.2 3B with o1-like thinking. It uses Thoughts only when needed, so don't be surprised when it skips them.

Enjoy!

Give some likes to make me feel better 😂

huggingface.co/mkurman/llam...

mkurman/llama-3.2-MEDIT-3B-o1 · Hugging Face

January 4, 2025 at 1:17 PM

Mariusz Kurman

@mkurman.bsky.social

storm.genie.stanford.edu - A great tool from Stanford for creating articles. For me, a stronger Gemini with Deep Thinking. Definitely worth trying!

January 1, 2025 at 5:34 PM

Mariusz Kurman

@mkurman.bsky.social

Deepseek MTP is something you should definitely look at

December 28, 2024 at 2:35 PM

Mariusz Kurman

@mkurman.bsky.social

Predicting the next token as a learning objective is insufficient for optimal LLM training.

December 28, 2024 at 1:35 AM

Mariusz Kurman

@mkurman.bsky.social

HDIC - How Do I Contribute?

A new technique we are working on seems to have a huge impact on language models' generative capabilities, allowing the layers to self-assess their contribution to the final prediction.

December 4, 2024 at 6:48 PM

Mariusz Kurman

@mkurman.bsky.social

RIP JetBrains subscription ☠️ After six years, it became too heavy to use as a daily IDE. I'm now on team VS Code.

December 4, 2024 at 10:04 AM

Mariusz Kurman

@mkurman.bsky.social

What research tools would you recommend for searching and analyzing scientific papers?

December 3, 2024 at 2:18 PM

Mariusz Kurman

@mkurman.bsky.social

We built a new small language model, SmolLM2-MedIT-Upscale-2B, based on SmolLM2-1.7B-Instruct from Hugging Face. The premise was simple: increasing the vector in attention layers would positively impact the model's capabilities.

What did we prove? 1/4