Ankit Maloo
@ankit.bsky.social
AI/ML Research at clioapp.ai
Gen AI hitting a wall. The case against:

Ilya says current training methods are hitting a wall because we have used up all the data available in the public domain. So there is no more data left to scale on and improve the models.
But what if we could unearth new sources of data?
December 21, 2024 at 10:23 PM
o3 seems incredibly exciting. My hypothesis is they are using MCTS at test time to arrive at an answer. Better than a beam search or lookahead search.
Seems like we will slowly move away from “generic models” and get specializations in math, STEM, writing, etc.
December 20, 2024 at 10:32 PM
You have two distinct datasets with minimal overlap: Dataset A (examples of tiny stories) and Dataset B (examples of recipes).
The goal is to develop a language model, or a setup of multiple models, that can generate relevant content based on the input prompt.
How would you do it?
#ai #llm
November 28, 2024 at 10:20 PM
New research:

Adding new domain knowledge to a small 20M parameter language model.

A fun experiment to understand where LoRA or full finetuning can work vs. where to train models from scratch.

Blog: medium.com/@ankit_94177...
Paper: arxiv.org/abs/2409.17171
Cross-Domain Content Generation with Domain-Specific Small Language Models
Generating domain-specific content using small language models poses challenges, especially when dealing with multiple distinct datasets with minimal overlap. In this study, we explore methods to enab...
November 25, 2024 at 8:24 PM
The latest Oppenheimer trailer looks so fab.

And great sound mixing. We can finally hear the dialogue over the bg music in a Nolan movie (even if it’s only in the trailer).
May 8, 2023 at 1:25 PM
This looks so similar to twitter.
February 28, 2023 at 9:16 PM