ethicalabs.ai
ethicalabs.bsky.social
Building practical, ethical and sustainable AI/ML solutions https://www.ethicalabs.ai/
Forget counting how many "r"s are in "raspberry": we're enabling federated fine-tuning of Small Language Models (135M, 270M) on a #RaspberryPi with flower.ai 🍓 github.com/ethicalabs-a... #MLSky #LLM #FederatedLearning
GitHub - ethicalabs-ai/BlossomTuneLLM: Federated Supervised Fine-Tuning for Small Language Models (SLMs)
github.com
August 24, 2025 at 11:09 AM
"Obesity Risk Predictor" is a tool designed to help identify health risks based on lifestyle habits. The app lets you compare the performance of 3 different models (Random Forest, #LightGBM, and #XGBoost) on the same dataset.

huggingface.co/spaces/ethic...

#MachineLearning #DataScience #MLSky
ObesityRiskPredictor - a Hugging Face Space by ethicalabs
Classification with Random Forest, LightGBM and XGBoost.
huggingface.co
August 23, 2025 at 1:58 PM
This piece argues that the true value of AI/ML isn't in creating AGI or replacing humans, but in building tools that empower us: medium.com/@massimo.sca...

By amplifying human potential, technology can drive collective flourishing 🌏

#AI #FutureOfWork #EthicalAI #Innovation #Technology
Empowered, Not Replaced: The Only Sustainable Path for Technology
The rise of AI isn’t about replacing humanity; it’s about amplifying it
medium.com
August 16, 2025 at 11:05 AM
Reposted by ethicalabs.ai
This paper is making the rounds: arxiv.org/abs/2506.21734

A tiny (27M) brain-inspired model trained on just 1,000 samples outperforms o3-mini-high on reasoning tasks.

#MLSky 🧠🤖
August 3, 2025 at 2:01 AM
Introducing Completionist, an open-source command-line tool that automates synthetic text dataset generation.

👉 Check out Completionist on #GitHub: github.com/ethicalabs-a...

#LLMs #GenerativeAI #DataEngineering #FineTuning #OpenSource #Python #SyntheticData #RAG
GitHub - ethicalabs-ai/completionist: Command-line tools for Synthetic Datasets Generation
github.com
August 2, 2025 at 12:58 PM
Building and Sharing a Multimodal ViT Model for Skin Lesion Analysis: From Proof of Concept to Hugging Face App 🤗 #huggingface #MLSky #opensource #python #vit #transformers #medicalai #visionmodel #skincancer

medium.com/@massimo.sca...
Building and Sharing a Multimodal ViT Model for Skin Lesion Analysis
From Proof of Concept to Huggingface App 🤗
medium.com
August 2, 2025 at 12:36 PM
Rather than chasing benchmark supremacy or scaling wars, Kurtis E1.1 focuses on understanding, sustainability, and practical impact, especially in areas like mental health support and safer human-AI interaction

huggingface.co/blog/mrs83/k...

#MLSky #EthicalAI #LLM
Kurtis-E1.1: Supervised Fine-tuning of Qwen2.5-3B-Instruct with Flower.ai & Hugging Face
A Blog post by Massimo Roberto Scamarcia on Hugging Face
huggingface.co
April 2, 2025 at 5:49 PM
To developers: Build opt-in systems.
To policymakers: Legislate data transparency.
To artists: Unionize.
To users: Demand ethical tools.

#EthicalAI #MLSky
March 30, 2025 at 6:28 PM
Just built an offline voice assistant for macOS:
🎤 Whisper STT (MLX)
🧠 LLM via #Ollama
🗣️ XTTSv2 TTS
🌍 Optional translation
No cloud. No tracking. No vibe coding — all handcrafted.
Demo here 🎥 www.youtube.com/watch?v=8-1P...

#OnDeviceAI #LLM #Privacy #TTS #STT
Kurtis-E1-MLX-Voice-Agent
YouTube video by Massimo Scamarcia
www.youtube.com
March 24, 2025 at 11:52 PM
Testing Kurtis E1 beyond its training scope—AI ethics, decentralization, even philosophy. No hallucinations, just structured reasoning. Is this emergent? You decide.

Read more: medium.com/@massimo.sca... #LLM #AI #EthicalAI
Testing Kurtis E1
Can a Small, Fine-Tuned Model Generalize Beyond Its Training Scope?
medium.com
March 5, 2025 at 11:39 PM
Reposted by ethicalabs.ai
🧪 The Arc Institute's Evo2 models DNA like an LLM models language, predicting mutations, gene function, and evolutionary signals. With 40B parameters trained on 128K genomes, it hints at AI-driven biological discovery. 🧬💻 #MLSky
Link to the paper: https://arcinstitute.org/manuscripts/Evo2
AI can now model and design the genetic code for all domains of life with Evo 2 | Arc Institute
Arc Institute develops the largest AI model for biology to date in collaboration with NVIDIA, bringing together Stanford University, UC Berkeley, and UC San Francisco researchers
arcinstitute.org
February 24, 2025 at 3:15 PM
🌀 Ouroboros: Small models drive recursive #LLM self-refinement for synthetic dataset generation. On-device AI shaping smarter futures! #EdgeAI #OpenSourceWeek #DeepSeekR1 #Ollama medium.com/@massimo.sca...
🌀 Ouroboros: Recursive LLM Refinement with Small Models
From Synthetic Data to Recursive Self-Refinement: An On-Device Experiment
medium.com
February 22, 2025 at 9:37 PM
🚀 NVIDIA Minitron: Efficient LLM Compression!

arxiv.org/pdf/2408.11796

Minitron uses pruning + distillation to create smaller, high-performance models

🔑 Highlights:
- Teacher Correction: Adapts models to new data
- Structured #Pruning: 2.7x faster inference
- #Distillation: Uses 40x fewer tokens
arxiv.org
February 19, 2025 at 9:07 PM
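To make the distillation half of that post concrete, here is a minimal, generic sketch of a logit-distillation loss: the student is penalized by the KL divergence between the teacher's and the student's softened output distributions. This is an illustration of the general technique, not the recipe from the linked paper; the function names and the `temperature` parameter are assumptions introduced for this example.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: List[float],
    student_logits: List[float],
    temperature: float = 1.0,
) -> float:
    """KL(teacher || student) over one token's vocabulary distribution.

    Hypothetical helper for illustration: real training code would
    average this over every token position in a batch.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 1.2, 0.3]
    print(distillation_loss(teacher, student, temperature=2.0))
```

The loss is zero when the student matches the teacher exactly and grows as the two distributions diverge, which is what lets a smaller model learn from the teacher's full output distribution rather than from hard labels alone.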
Reposted by ethicalabs.ai
OpenAI scrubs diversity commitment web page from its site

OpenAI has eliminated a page on its website that used to express its commitment to diversity, equity, and inclusion. The URL “https://openai.com/commitment-to-dei/” now redirects to “https://openai.com/building-dynamic-te…

#ai #news #openai
OpenAI has eliminated a page on its website that used to express its commitment to diversity, equity, and inclusion. The URL “https://openai.com/commitment-to-dei/” now redirects to “https://openai.com/building-dynamic-teams/,” a page that talks about people with “different backgrounds” with no use of the word “diversity.” The previous page stated that the company’s “investment in diversity, equity and inclusion” […]
techcrunch.com
February 14, 2025 at 7:42 PM
Arcee AI and AngelQ just launched KidRails for #LLMs, an open-source framework for safe, age-appropriate #AI responses for children

hypepotamus.com/startup-news...

Setting new standards in security, transparency, and responsibility 🌸 #EthicalAI #ML
This Nashville Startup Is Protecting Kids Online with Smarter, Safer AI - Hypepotamus
Nashville-based Angel Q (previously known as Angel Kids AI), got its start building a safer browser option for kids to access the internet. But browsers are not the only way of searching for […]
hypepotamus.com
February 14, 2025 at 7:44 PM
Pleias is a large language model trained exclusively on open data. It was developed using the Common Corpus, a dataset that addresses the need for high-quality compliant training data in AI development. huggingface.co/blog/Pclangl...

#opensourcellm #opendata #commoncorpus #llm #ai #ml
They Said It Couldn’t Be Done
A Blog post by Pierre-Carl Langlais on Hugging Face
huggingface.co
February 14, 2025 at 7:39 PM
Reposted by ethicalabs.ai
A naive way to generate synthetic fine-tuning data is to feed prompts to a model, collect its output, and use that as the fine-tuning set. Synthetic data is cheap, so we can afford to be more choosy. By generating several responses to each prompt, we can select the one that best suits our purposes. #AI #ML
Active Inheritance, A Smarter Way to Train Models with Synthetic Data
The practice of fine-tuning models on synthetic data is becoming well established. But synthetic training data, even if it represents the training...
www.deeplearning.ai
February 6, 2025 at 12:53 PM
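The selection idea in that repost can be sketched as best-of-n sampling: draw several candidate responses per prompt, score each, and keep only the top one for the fine-tuning set. The sketch below is a minimal illustration under stated assumptions, not the method from the linked article: `toy_generate` is a hypothetical stand-in for a real model call, and `lexical_diversity` is a placeholder scoring function chosen only to keep the example self-contained.

```python
import random
from typing import Callable, List, Tuple


def lexical_diversity(text: str) -> float:
    """Placeholder score: fraction of distinct words in the response."""
    words = text.split()
    return len(set(words)) / len(words) if words else 0.0


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str], float],
    n: int = 4,
) -> str:
    """Sample n responses for one prompt and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)


def build_dataset(
    prompts: List[str],
    generate: Callable[[str], str],
    score: Callable[[str], float],
    n: int = 4,
) -> List[Tuple[str, str]]:
    """Pair each prompt with its selected response."""
    return [(p, best_of_n(p, generate, score, n)) for p in prompts]


def toy_generate(prompt: str) -> str:
    """Hypothetical generator; a real pipeline would call a model here."""
    templates = [
        "yes yes yes yes",
        "it depends on the context",
        "maybe maybe not",
    ]
    return random.choice(templates)


if __name__ == "__main__":
    for prompt, response in build_dataset(
        ["Is synthetic data useful?"], toy_generate, lexical_diversity
    ):
        print(prompt, "->", response)
```

Swapping in a different `score` function (for example, a toxicity or bias metric) changes which property the selected data steers the fine-tuned model toward, without touching the sampling loop.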