Sujee Maniyam
@sujee.dev
Developer Advocate @ Nebius | ML/Data Engineer | Open Source contributor | Technical Instructor | Author | Speaker


portfolio: https://bit.ly/sujee-dev3
2️⃣ Open-Source RAG Pipeline with Docling + Data Prep Kit + Milvus + Open LLMs
- Session: qconsf.com/training/nov...
- Repo: github.com/sujee/data-p...

We ran open source models on tokenfactory.nebius.com - DM if you want credits to try it out.

Collaborate on Discord: discord.gg/bk5fcvNJVZ
November 21, 2025 at 9:23 PM
Little backstory:
davenielsen.bsky.social and I started Allycat (github.com/The-AI-Alli...) as a demo project at @aialliance.bsky.social. Since then it has been adopted by many and is getting contributions from others. This is how open source works.
GitHub - The-AI-Alliance/allycat: Chat with your website using LLMs
October 10, 2025 at 7:46 PM
Tech-stack:
- Web crawl
- Docling for extracting content from downloaded files (HTML, PDF, etc.)
- Index and store in vector database
- llama-index framework
- open source LLMs (Qwen3, GLM, GPT-OSS, DeepSeek) powered by Nebius AI Studio

You will walk away with working code you can build on.
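
To make the flow above concrete, here is a toy sketch of the index-and-retrieve step using only the Python standard library. It is not the Allycat code: `ToyVectorIndex`, `embed`, and `build_prompt` are hypothetical names, the bag-of-words "embedding" stands in for a real embedding model, the in-memory list stands in for a vector database, and the sample pages are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words term count.
    return Counter(tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorIndex:
    # Hypothetical in-memory stand-in for a vector database.
    def __init__(self):
        self.rows = []

    def add(self, chunk):
        self.rows.append((chunk, embed(chunk)))

    def search(self, query, top_k=2):
        q = embed(query)
        ranked = sorted(self.rows, key=lambda r: cosine(q, r[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, context_chunks):
    # Assemble the text that would be sent to an LLM.
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up sample pages standing in for crawled and extracted content.
pages = [
    "Allycat lets you chat with your website using LLMs.",
    "Milvus is a vector database for similarity search.",
    "Docling extracts text from PDF and HTML documents.",
]
index = ToyVectorIndex()
for page in pages:
    index.add(page)

question = "Which tool extracts text from PDF files?"
top = index.search(question, top_k=1)
prompt = build_prompt(question, top)
```

In a real build, each stand-in would be swapped for the matching tool from the list, and `prompt` would be sent to a hosted open source LLM.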
October 10, 2025 at 7:46 PM
Tech-stack:
- Docling for extracting data from PDFs
- Data Prep Kit for cleaning (remove sensitive PII, hate speech, and spam)
- Milvus as vector database
- llama-index as the framework
- open source LLMs (Qwen3, GLM, GPT-OSS, DeepSeek) powered by Nebius AI Studio
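
As a rough illustration of what the cleaning step in this stack does, here is a standard-library sketch that redacts PII and drops spam before indexing. It does not use the Data Prep Kit API: the regexes are deliberately crude, and `SPAM_MARKERS`, `redact_pii`, `looks_like_spam`, and `clean` are hypothetical names chosen for this example.

```python
import re

# Crude illustrative patterns; real PII detection is far more thorough.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

# Hypothetical keyword list standing in for a real spam classifier.
SPAM_MARKERS = ("buy now", "click here", "free money")


def redact_pii(text):
    # Replace emails first, then phone-like digit runs.
    text = EMAIL_RE.sub("<EMAIL>", text)
    return PHONE_RE.sub("<PHONE>", text)


def looks_like_spam(text):
    lowered = text.lower()
    return any(marker in lowered for marker in SPAM_MARKERS)


def clean(chunks):
    # Drop spam chunks, then redact PII in what remains.
    return [redact_pii(c) for c in chunks if not looks_like_spam(c)]


# Made-up sample chunks, as if extracted from PDFs.
chunks = [
    "Contact jane.doe@example.com or +1 (555) 010-9999 for details.",
    "Click here for FREE MONEY!!!",
    "Milvus stores the embeddings for retrieval.",
]
cleaned = clean(chunks)
```

Cleaning before indexing keeps sensitive or low-quality text out of the vector database, so it can never be retrieved into a prompt.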
October 6, 2025 at 9:32 PM
- RAISE Summit 2025, Paris: sujee.github.io/ai-events-re...

- WandB Fully Connected 2025: sujee.github.io/ai-events-re...
August 21, 2025 at 9:30 PM