Private LLM
@privatellm.ai
Local AI for Private, Uncensored Chat on iPhone, iPad, and Mac. No Cloud, No Tracking, No Logins.

https://privatellm.ai
Yes. LLM inference is memory-bound: capacity determines which models fit, and bandwidth determines how fast tokens are generated. For Macs, 64GB is a great sweet spot: you can run Llama 3.3 70B locally with GPT-4o-level reasoning. Rule of thumb: run the largest model your Mac can fit.
October 29, 2025 at 4:31 PM
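The "largest model that fits" rule can be turned into back-of-the-envelope math: weight storage is parameters times bits per weight divided by 8. The 1.2x headroom factor below for KV cache and runtime buffers is our own rough assumption, not a figure from Private LLM.

```python
def model_ram_gb(params_billion: float, bits_per_weight: float,
                 overhead: float = 1.2) -> float:
    """Rough RAM estimate for a quantized model: weight bytes plus ~20%
    headroom for KV cache and runtime buffers (the 1.2 factor is a guess)."""
    weight_gb = params_billion * bits_per_weight / 8  # bits -> bytes
    return weight_gb * overhead

# Llama 3.3 70B at 4-bit: 70 * 4 / 8 = 35 GB of weights, ~42 GB with headroom,
# which is why it fits comfortably on a 64GB Mac.
```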
Thanks for the shout-out! 🙌

Glad you’re enjoying Private LLM. The boost you’re seeing is because we’re not an MLX/llama.cpp wrapper like LM Studio or Ollama (slowllama?).

We quantize each model ourselves (OmniQuant/GPTQ) for Apple Silicon, so models run fast and reason better than naively quantized builds, even on low-RAM iPhones and Macs.
October 29, 2025 at 3:47 PM
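For intuition on what weight quantization means, here is a toy group-wise round-to-nearest 4-bit quantizer. This is only an illustration of the storage idea, not what the app ships: OmniQuant and GPTQ additionally optimize scales and rounding to minimize reconstruction error, which is where the quality gap over naive quantization comes from.

```python
import numpy as np

def quantize_4bit(w: np.ndarray, group_size: int = 32):
    """Toy group-wise asymmetric round-to-nearest 4-bit quantization.
    Each group of weights is mapped to 16 levels between its min and max."""
    g = w.reshape(-1, group_size)
    lo = g.min(axis=1, keepdims=True)
    scale = (g.max(axis=1, keepdims=True) - lo) / 15  # 16 levels for 4 bits
    scale = np.maximum(scale, 1e-12)                  # guard constant groups
    q = np.clip(np.round((g - lo) / scale), 0, 15).astype(np.uint8)
    return q, scale, lo

def dequantize(q: np.ndarray, scale: np.ndarray, lo: np.ndarray) -> np.ndarray:
    """Recover approximate weights; per-weight error is at most scale / 2."""
    return q * scale + lo
```

Round-to-nearest bounds the per-weight error at half a quantization step; error-minimizing methods like GPTQ trade a bit of per-weight error for lower overall layer-output error.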
We are delighted to hear that. Please let us know if there’s any particular model you’d like to see in the app.
September 1, 2025 at 5:45 AM
We just shipped an update. More coming soon
August 31, 2025 at 5:16 PM
OpenHands LM – coding-focused language model based on Qwen 2.5 Coder:

* 7B (iOS + macOS) – 8GB RAM or more
* 32B (macOS only) – 32GB RAM minimum

Handles bug fixing and code refactoring tasks. Trained on real GitHub issues via reinforcement learning.
April 23, 2025 at 8:01 PM
Meta-Llama 3.1 8B SurviveV3 (3-bit iOS / 4-bit macOS)

Wilderness survival assistant, offline. Knows how to build shelters, find water, navigate terrain, etc.

Runs on any iOS/Mac device with 8GB+ RAM, even off-grid.
April 23, 2025 at 8:01 PM
Llama 3.1 8B UltraMedical (3-bit iOS / 4-bit macOS)

Biomedical assistant for med students, researchers, and clinicians.
Answers board-exam style questions, explains research findings, and supports clinical reasoning — privately.

Runs on 8GB+ RAM.
April 23, 2025 at 8:01 PM
Perplexity’s R1 1776 Distill Llama 70B

Post-trained to eliminate refusal behavior on politically sensitive topics while preserving full reasoning ability.

Built to refuse censorship: open dialogue, independent thought, and the right to answer freely.

macOS only. Needs 48GB+ RAM.
April 23, 2025 at 8:01 PM
Amoral-Gemma3-1B-v2 & gemma-3-1b-it-abliterated

Uncensored 4-bit OmniQuant-quantized fine-tunes of Gemma 3 1B.
For users who want unrestricted conversations, roleplay, and truth-seeking without moral filters. Fast and small. iOS and macOS.
April 23, 2025 at 8:01 PM
Gemma 3 1B IT (4-bit QAT)

Instruction-tuned and multilingual. Full 32K context on iPhones with 6GB+ RAM.

Ideal for writing, Q&A, and summarization in 140+ languages.

Small enough to run on any supported iOS or Mac device.
April 23, 2025 at 8:01 PM
🛠️ We've fixed a pesky crash that was affecting some newer models on older versions of macOS like Sonoma.
February 17, 2025 at 10:18 PM
👀 Also, we've updated our lineup by adding support for both 3-bit and 4-bit OmniQuant-quantized versions of the EVA LLaMA 3.33 70B v0.1 model by @Nottlespike. Note that we've deprecated the previous version, EVA LLaMA 3.33 70B v0.0.
February 17, 2025 at 10:18 PM
For Apple Silicon Mac users with 64GB or more RAM, we still recommend using the 4-bit OmniQuant-quantized version of 70B models.
February 17, 2025 at 10:18 PM
💪 Power users, rejoice! The 5 new 3-bit OmniQuant-quantized 70B models on Mac from Private LLM v1.9.8 are here. These models consume around 5GB less RAM than their 4-bit counterparts, making them ideal for Apple Silicon Macs with 48GB of RAM.
February 17, 2025 at 10:18 PM
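The weight-only arithmetic behind the saving: dropping one bit per weight on a 70B model frees at most 70e9 bits, i.e. 8.75 GB. The observed ~5 GB is smaller, presumably because some tensors (e.g. embeddings) stay at higher precision and group-quantization metadata adds overhead — that explanation is our assumption, not a published detail.

```python
def weight_gb(params_billion: float, bits: float) -> float:
    """Weight storage alone, in GB: params * bits / 8 bits-per-byte."""
    return params_billion * bits / 8

# Naive upper bound on the 3-bit vs 4-bit saving for a 70B model:
saving = weight_gb(70, 4) - weight_gb(70, 3)  # 8.75 GB if every weight dropped a bit
```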
📏 Now, with Private LLM, you can see the context length right in the model quick switcher! This little upgrade makes a big difference, helping you choose the perfect model for your conversation or task at a glance.
February 17, 2025 at 10:18 PM
✍️ Unleash your creativity with the Gemma 2 iFable 9B model from iFable! This top-tier creative writing model works on iPad Pros with 16GB of RAM or any Apple Silicon Mac with 16GB+ RAM. No other local LLM app lets you run 9B or 14B models on iOS like Private LLM can.
February 17, 2025 at 10:18 PM
- Dolphin 3.0 Llama 3.1 8B - For iOS devices with 8GB or more RAM, like the iPhone 15 Pro or newer

These are currently the best uncensored LLMs that can fit in your pocket, no holds barred!
February 17, 2025 at 10:18 PM
- Dolphin 3.0 Llama 3.2 3B - For those with 6GB+ RAM on their iOS devices or any Apple Silicon Mac
- Dolphin 3.0 Qwen 2.5 0.5B, 1.5B, 3B - Compatible with nearly all modern iPhones (iPhone 12 or newer) and Macs
February 17, 2025 at 10:18 PM
🐬 Say hello to the uncensored freedom of Dolphin 3.0 models! From Cognitive Computations, these models are your ticket to unfiltered AI conversations.

- Dolphin 3.0 Llama 3.2 1B - Perfect for iPhones/iPads with 4GB+ RAM or any Apple Silicon Mac
February 17, 2025 at 10:18 PM
Thank you, @soldaini.net! And huge congratulations on launching Ai2 OLMoE - love what you’re doing for local AI!
February 17, 2025 at 9:04 PM