Lightnews — Scholar-powered news

ethicalabs.ai

@ethicalabs.bsky.social

Forget counting how many "r"s are in "raspberry": we're enabling federated fine-tuning of Small Language Models (135M, 270M) on a #RaspberryPi with flower.ai 🍓 github.com/ethicalabs-a... #MLSky #LLM #FederatedLearning

GitHub - ethicalabs-ai/BlossomTuneLLM: Federated Supervised Fine-Tuning for Small Language Models (SLMs)

github.com

August 24, 2025 at 11:09 AM

ethicalabs.ai

@ethicalabs.bsky.social

"Obesity Risk Predictor" is a tool designed to help identify health risks based on lifestyle habits. The app lets you compare the performance of three models (Random Forest, #LightGBM, and #XGBoost) on the same dataset.

huggingface.co/spaces/ethic...

#MachineLearning #DataScience #MLSky

ObesityRiskPredictor - a Hugging Face Space by ethicalabs

Classification with Random Forest, LightGBM and XGBoost.

huggingface.co

August 23, 2025 at 1:58 PM

ethicalabs.ai

@ethicalabs.bsky.social

This piece argues that the true value of AI/ML isn't in creating AGI or replacing humans, but in building tools that empower us: medium.com/@massimo.sca...

By amplifying human potential, technology can drive collective flourishing 🌏

#AI #FutureOfWork #EthicalAI #Innovation #Technology

Empowered, Not Replaced: The Only Sustainable Path for Technology

The rise of AI isn’t about replacing humanity; it’s about amplifying it

medium.com

August 16, 2025 at 11:05 AM

Reposted by ethicalabs.ai

Shahab Bakhtiari

@shahabbakht.bsky.social

This paper is making the rounds: arxiv.org/abs/2506.21734

A tiny (27M) brain-inspired model, trained on just 1,000 samples, outperforms o3-mini-high on reasoning tasks.

#MLSky 🧠🤖

August 3, 2025 at 2:01 AM

ethicalabs.ai

@ethicalabs.bsky.social

Introducing Completionist, an open-source command-line tool that automates synthetic text dataset generation.

👉 Check out Completionist on #GitHub: github.com/ethicalabs-a...

#LLMs #GenerativeAI #DataEngineering #FineTuning #OpenSource #Python #SyntheticData #RAG

GitHub - ethicalabs-ai/completionist: Command-line tools for Synthetic Datasets Generation

github.com

August 2, 2025 at 12:58 PM

ethicalabs.ai

@ethicalabs.bsky.social

Building and Sharing a Multimodal ViT Model for Skin Lesion Analysis: From Proof of Concept to Hugging Face App 🤗 #huggingface #MLSky #opensource #python #vit #transformers #medicalai #visionmodel #skincancer

medium.com/@massimo.sca...

Building and Sharing a Multimodal ViT Model for Skin Lesion Analysis

From Proof of Concept to Huggingface App 🤗

medium.com

August 2, 2025 at 12:36 PM

ethicalabs.ai

@ethicalabs.bsky.social

Rather than chasing benchmark supremacy or scaling wars, Kurtis E1.1 focuses on understanding, sustainability, and practical impact, especially in areas like mental health support and safer human-AI interaction

huggingface.co/blog/mrs83/k...

#MLSky #EthicalAI #LLM

Kurtis-E1.1: Supervised Fine-tuning of Qwen2.5-3B-Instruct with Flower.ai & Hugging Face

A Blog post by Massimo Roberto Scamarcia on Hugging Face

huggingface.co

April 2, 2025 at 5:49 PM

ethicalabs.ai

@ethicalabs.bsky.social

To developers: Build opt-in systems.
To policymakers: Legislate data transparency.
To artists: Unionize.
To users: Demand ethical tools.

#EthicalAI #MLSky

March 30, 2025 at 6:28 PM

ethicalabs.ai

@ethicalabs.bsky.social

Just built an offline voice assistant for macOS:
🎤 Whisper STT (MLX)
🧠 LLM via #Ollama
🗣️ XTTSv2 TTS
🌍 Optional translation
No cloud. No tracking. No vibe coding — all handcrafted.
Demo here 🎥 www.youtube.com/watch?v=8-1P...

#OnDeviceAI #LLM #Privacy #TTS #STT

Kurtis-E1-MLX-Voice-Agent

YouTube video by Massimo Scamarcia

www.youtube.com

March 24, 2025 at 11:52 PM

ethicalabs.ai

@ethicalabs.bsky.social

Testing Kurtis E1 beyond its training scope—AI ethics, decentralization, even philosophy. No hallucinations, just structured reasoning. Is this emergent? You decide.

Read more: medium.com/@massimo.sca... #LLM #AI #EthicalAI

Testing Kurtis E1

Can a Small, Fine-Tuned Model Generalize Beyond Its Training Scope?

medium.com

March 5, 2025 at 11:39 PM

Reposted by ethicalabs.ai

Scott McGrath

@smcgrath.phd

🧪 The Arc Institute's Evo2 models DNA like an LLM models language, predicting mutations, gene function, and evolutionary signals. With 40B parameters trained on 128K genomes, it hints at AI-driven biological discovery. 🧬💻 #MLSky
Link to the paper: https://arcinstitute.org/manuscripts/Evo2

AI can now model and design the genetic code for all domains of life with Evo 2 | Arc Institute

Arc Institute develops the largest AI model for biology to date in collaboration with NVIDIA, bringing together Stanford University, UC Berkeley, and UC San Francisco researchers

arcinstitute.org

February 24, 2025 at 3:15 PM

ethicalabs.ai

@ethicalabs.bsky.social

🌀 Ouroboros: Small models drive recursive #LLM self-refinement for synthetic dataset generation. On-device AI shaping smarter futures! #EdgeAI #OpenSourceWeek #DeepSeekR1 #Ollama medium.com/@massimo.sca...

🌀 Ouroboros: Recursive LLM Refinement with Small Models

From Synthetic Data to Recursive Self-Refinement: An On-Device Experiment

medium.com

February 22, 2025 at 9:37 PM

ethicalabs.ai

@ethicalabs.bsky.social

🚀 NVIDIA Minitron: Efficient LLM Compression!

arxiv.org/pdf/2408.11796

Minitron uses pruning + distillation to create smaller, high-performance models

🔑 Highlights:
- Teacher Correction: Adapts models to new data
- Structured #Pruning: 2.7x faster inference
- #Distillation: Uses 40x fewer tokens

arxiv.org

February 19, 2025 at 9:07 PM

Reposted by ethicalabs.ai

AI & ML News

@ai-news.at.thenote.app

OpenAI scrubs diversity commitment web page from its site

OpenAI has eliminated a page on its website that used to express its commitment to diversity, equity, and inclusion. The URL “https://openai.com/commitment-to-dei/” now redirects to “https://openai.com/building-dynamic-te…

#ai #news #openai

OpenAI has eliminated a page on its website that used to express its commitment to diversity, equity, and inclusion. The URL “https://openai.com/commitment-to-dei/” now redirects to “https://openai.com/building-dynamic-teams/,” a page that talks about people with “different backgrounds” with no use of the word “diversity.” The previous page stated that the company’s “investment in diversity, equity and inclusion” […]

techcrunch.com

February 14, 2025 at 7:42 PM

ethicalabs.ai

@ethicalabs.bsky.social

Arcee AI and AngelQ just launched KidRails for #LLMs, an open-source framework for safe, age-appropriate #AI responses for children

hypepotamus.com/startup-news...

Setting new standards in security, transparency, and responsibility 🌸 #EthicalAI #ML

This Nashville Startup Is Protecting Kids Online with Smarter, Safer AI - Hypepotamus

Nashville-based Angel Q (previously known as Angel Kids AI), got its start building a safer browser option for kids to access the internet. But browsers are not the only way of searching for […]

hypepotamus.com

February 14, 2025 at 7:44 PM

ethicalabs.ai

@ethicalabs.bsky.social

Pleias is a large language model trained exclusively on open data. It was developed using the Common Corpus, a dataset that addresses the need for high-quality compliant training data in AI development. huggingface.co/blog/Pclangl...

#opensourcellm #opendata #commoncorpus #llm #ai #ml

They Said It Couldn’t Be Done

A Blog post by Pierre-Carl Langlais on Hugging Face

huggingface.co

February 14, 2025 at 7:39 PM

Reposted by ethicalabs.ai

Machine Learning

@machinelearning.bsky.social

A naive way to generate synthetic fine-tuning data is to feed prompts to a model, collect its output, and use that as the fine-tuning set. Synthetic data is cheap, so we can afford to be more choosy. By generating several responses to each prompt, we can select the one that best suits our purposes. #AI #ML

Active Inheritance, A Smarter Way to Train Models with Synthetic Data

The practice of fine-tuning models on synthetic data is becoming well established. But synthetic training data, even if it represents the training...

www.deeplearning.ai

February 6, 2025 at 12:53 PM
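
The selection idea in the post above (generate more than one response per prompt, keep the one that best fits the goal) can be sketched roughly as follows. This is a minimal illustration, not the method from the linked article: `best_of_n`, `build_dataset`, `toy_generate`, and `toy_score` are hypothetical names, the generator is a canned stand-in for a real model call, and the scoring rule (count of distinct words) is an arbitrary placeholder for whatever property you actually want to select for.

```python
import itertools
from typing import Callable, Dict, List


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    n: int = 4,
) -> str:
    """Sample n candidate responses and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda response: score(prompt, response))


def build_dataset(
    prompts: List[str],
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    n: int = 4,
) -> List[Dict[str, str]]:
    """Build prompt/response pairs, keeping only the best candidate per prompt."""
    return [
        {"prompt": p, "response": best_of_n(p, generate, score, n)}
        for p in prompts
    ]


# Hypothetical stand-in for a model: cycles through canned answers.
_canned = itertools.cycle(
    [
        "It is ML.",
        "Federated learning trains a shared model across devices without pooling raw data.",
        "Training on many devices.",
    ]
)


def toy_generate(prompt: str) -> str:
    return next(_canned)


# Hypothetical scoring rule: prefer responses with more distinct words.
def toy_score(prompt: str, response: str) -> float:
    return float(len(set(response.lower().split())))


dataset = build_dataset(["What is federated learning?"], toy_generate, toy_score, n=3)
print(dataset)
```

In practice `generate` would wrap a sampling call to a language model and `score` would measure the target property (for example a toxicity or length metric), so the fine-tuning set is steered toward that property at the cost of n generations per prompt.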
