Tristan Watkins
tristanwatkins.com
@tristanwatkins.com
Microsoft technology generalist at Advania UK, with deep specialism in Identity, Security + Compliance. Windows security remains focal, with recent depth in AI (and a resurrection of latent SharePoint Enterprise Search skeelz). https://tristanwatkins.com
Wow. I can't believe it's live. I waited forever for Azure Code Signing (the earlier name for this) to emerge in preview after the deprecation of the Device Guard Signing Service left us needing it, but it didn't arrive for years and I eventually lost sight of it. Glad it's here at last, I guess.
March 12, 2025 at 5:55 PM
FWIW, in the December shipmas post about o3 they said o3-mini was scheduled for release in January, but I take your broader point
January 31, 2025 at 5:10 PM
Understood. You could start here: github.com/microsoft/ke.... We normally build with Azure AI Search which includes services like Document Intelligence
GitHub - microsoft/kernel-memory: RAG architecture: index and query any data using LLM and natural language, track sources, show citations, asynchronous memory patterns.
January 13, 2025 at 6:30 PM
Finding the right chunk size (and the related delimiter(s) to parse on) can take some experimentation. Ultimately there is a whole art to this that is common to the world of traditional indexing technologies, but your data may lend itself to something cheap/cheerful like this.
January 13, 2025 at 6:20 PM
There are frameworks/services to help with it if you have the appetite to dive in. Otherwise you can find delimiters to parse on (I find HTML is more reliable than Markdown), treat each delimited section as a chunk, verify those chunks fit within your embeddings service's input limit, then store the resulting embeddings in your vector DB.
January 13, 2025 at 6:17 PM
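A minimal sketch of the delimiter approach from the post above, using only the Python standard library. The choice of heading tags as delimiters, the `MAX_TOKENS` value, and the word-count stand-in for a tokenizer are all assumptions for illustration; the embedding call and vector DB write are left out because they depend on the service you pick.

```python
from html.parser import HTMLParser

# Hypothetical limit: check your embeddings service's documented maximum input size.
MAX_TOKENS = 8000


class SectionSplitter(HTMLParser):
    """Collects text into chunks, starting a new chunk at each delimiter tag."""

    def __init__(self, delimiters=("h1", "h2", "h3")):
        super().__init__()
        self.delimiters = set(delimiters)
        self.chunks = [[]]

    def handle_starttag(self, tag, attrs):
        # Open a new chunk at a delimiter, unless the current chunk is still empty.
        if tag in self.delimiters and self.chunks[-1]:
            self.chunks.append([])

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks[-1].append(text)


def chunk_html(html, delimiters=("h1", "h2", "h3")):
    """Return one text chunk per delimited section of the HTML."""
    splitter = SectionSplitter(delimiters)
    splitter.feed(html)
    splitter.close()
    return [" ".join(parts) for parts in splitter.chunks if parts]


def rough_token_count(text):
    # Crude stand-in: whitespace-separated words. A real tokenizer will count differently.
    return len(text.split())


def oversized(chunks, max_tokens=MAX_TOKENS):
    """Return the chunks that exceed the (assumed) embedding input limit."""
    return [c for c in chunks if rough_token_count(c) > max_tokens]


if __name__ == "__main__":
    sample = "<h2>Intro</h2><p>Hello world.</p><h2>Setup</h2><p>Install it.</p><p>Run it.</p>"
    chunks = chunk_html(sample)
    print(chunks)
    print(oversized(chunks, max_tokens=3))
```

Any chunk that `oversized` flags would need a finer delimiter or a secondary split before it goes to the embeddings service.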
This would get messy with overlap, but for something like this I would first test without overlap, since each section should have some completeness/coherence to it.
January 13, 2025 at 5:33 PM
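To make the overlap trade-off concrete, here is a rough sketch with overlap as an optional parameter that defaults to off, so the no-overlap baseline can be tested first. The function name `with_overlap` and the word-based overlap unit are illustrative assumptions, not part of any particular framework.

```python
def with_overlap(chunks, overlap_words=0):
    """Prefix each chunk with the last `overlap_words` words of the previous chunk.

    With overlap_words=0 (the default) the chunks are returned unchanged.
    """
    if overlap_words <= 0:
        return list(chunks)
    result = []
    previous_tail = []
    for chunk in chunks:
        words = chunk.split()
        result.append(" ".join(previous_tail + words))
        # Take the tail from the original chunk so overlap does not compound.
        previous_tail = words[-overlap_words:]
    return result


if __name__ == "__main__":
    sections = ["a b c", "d e", "f"]
    print(with_overlap(sections))
    print(with_overlap(sections, overlap_words=2))
```

With overlap switched on, each section starts with text from the previous section, which is where the messiness comes from when the sections are already self-contained.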