Zack Angelo
zackangelo.bsky.social
building ai inference @ mixlayer
one of the most slept on capabilities of newer AI models is the ability to call multiple tools in a single shot. here's the newest llama 70b running on mixlayer calling 4 tools (lookup weather in 3 cities and perform some arithmetic)
December 13, 2024 at 8:24 PM
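A rough sketch of what handling a multi-tool turn like the one in the post above could look like on the application side. Everything here is an assumption for illustration: the call format (a name plus a JSON string of arguments) is modeled on the common chat-completions convention, not on Mixlayer's actual API, and `get_weather`, `add`, and the canned temperatures are hypothetical stand-ins.

```python
import json

# Hypothetical tools. A real weather tool would call an external service;
# the temperatures below are made up for illustration.
def get_weather(city: str) -> dict:
    canned = {"Austin": 75, "Seattle": 52, "Boston": 41}
    return {"city": city, "temp_f": canned.get(city)}

def add(a: float, b: float) -> dict:
    return {"result": a + b}

TOOLS = {"get_weather": get_weather, "add": add}

def run_tool_calls(tool_calls: list) -> list:
    """Execute every tool call emitted in a single model turn, in order."""
    results = []
    for call in tool_calls:
        fn = TOOLS[call["name"]]
        args = json.loads(call["arguments"])  # arguments arrive as a JSON string
        results.append({"id": call["id"], "name": call["name"], "output": fn(**args)})
    return results

# Stand-in for what a model might emit in one response: four calls at once
# (weather in three cities plus some arithmetic).
model_turn = [
    {"id": "call_1", "name": "get_weather", "arguments": '{"city": "Austin"}'},
    {"id": "call_2", "name": "get_weather", "arguments": '{"city": "Seattle"}'},
    {"id": "call_3", "name": "get_weather", "arguments": '{"city": "Boston"}'},
    {"id": "call_4", "name": "add", "arguments": '{"a": 2, "b": 3}'},
]

results = run_tool_calls(model_turn)
for r in results:
    print(r["id"], r["name"], r["output"])
```

Each result keeps the call's `id` so the outputs can be matched back to the calls when they are returned to the model in the next turn.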
Want to play around with chain of thought and some other prompting techniques? I put up a few Mixlayer demos on Meta's Llama 3.1 8b in this blog post. www.mixlayer.com/blog/2024-12...
LLM Reasoning 101 - Mixlayer
Large Language Models (LLMs) can be made better at complex reasoning tasks through techniques like few-shot prompting and Chain of Thought (CoT) reasoning, which allow smaller models to match the perf...
www.mixlayer.com
December 11, 2024 at 4:53 PM
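For a feel of the two techniques named in the link card, here is a minimal sketch of a few-shot Chain of Thought prompt builder. It is not taken from the blog post: the worked examples, the `build_cot_prompt` helper, and the "Let's think step by step" cue are all assumptions chosen for illustration, and the resulting string could be sent to any text-completion model.

```python
# Hypothetical worked examples: each pairs a question with explicit reasoning.
FEW_SHOT_EXAMPLES = [
    {
        "question": "A shelf holds 3 boxes with 4 books each. How many books are there?",
        "reasoning": "Each box has 4 books and there are 3 boxes, so 3 * 4 = 12.",
        "answer": "12",
    },
    {
        "question": "Sam had 10 apples and gave away 4. How many are left?",
        "reasoning": "Sam starts with 10 and gives away 4, so 10 - 4 = 6.",
        "answer": "6",
    },
]

CUE = "Let's think step by step."

def build_cot_prompt(question: str, examples=FEW_SHOT_EXAMPLES) -> str:
    """Prepend worked examples (few-shot) that show their reasoning (CoT)."""
    parts = []
    for ex in examples:
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: {CUE} {ex['reasoning']} The answer is {ex['answer']}."
        )
    # End on the reasoning cue so the model continues with its own steps.
    parts.append(f"Q: {question}\nA: {CUE}")
    return "\n\n".join(parts)

prompt = build_cot_prompt("A train has 5 cars with 20 seats each. How many seats are there?")
print(prompt)
```

The examples do double duty: they show the model the answer format and nudge it to write out intermediate steps before committing to an answer.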
Crazy to think that a 1M token context window will be the norm soon.

Doesn't look like this model has made it onto HF yet (just a Space, no weights). Curious to learn more about the sparse attention mechanism.

qwenlm.github.io/blog/qwen2.5...
Extending the Context Length to 1M Tokens!
After the release of Qwen2.5, we heard the community’s demand for processing longer contexts. In recent months, we have made m...
qwenlm.github.io
November 18, 2024 at 3:45 PM
woke up in a 3am fit of terror last night bc I dreamt I left an 8x a100 gpu cluster running by accident 🫠
November 17, 2024 at 1:58 PM