Lightnews — Scholar-powered news

Ravi V

@aidelve.bsky.social

It would be great to get access to the recordings, if any are being made. There is a lot of content we would all love to have access to.

November 6, 2025 at 4:49 AM

Ravi V

@aidelve.bsky.social

Butterfruit milkshake, as it's referred to in Bangalore.

June 11, 2025 at 9:03 AM

Ravi V

@aidelve.bsky.social

Thank you. We developed an app for a client that processes documents and presents an analysis. We used traditional RAG (chunking, vectorization, reranking). The feedback we received was that the analysis lacked depth. This post clears up a lot about why late interaction matters and not multi vectors.

February 27, 2025 at 4:39 AM
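
For readers unfamiliar with the term in the post above, late interaction (as used in ColBERT-style retrieval) keeps one vector per token and scores a document by summing, for each query token, its best match among the document tokens. The sketch below shows only that scoring rule on toy vectors. The function names and the example numbers are illustrative assumptions, not taken from the post being replied to.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token vector, take the
    highest similarity to any document token vector, then sum."""
    if not doc_vecs:
        return 0.0
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy example: two query token vectors, two document token vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
document = [[1.0, 0.0], [0.5, 0.5]]
score = maxsim_score(query, document)
```

Because every query token picks its own best-matching document token, a long document is not squeezed into a single pooled vector before comparison.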

Ravi V

@aidelve.bsky.social

I will put both into NotebookLM and see if it can put it together!

January 22, 2025 at 5:06 PM

Ravi V

@aidelve.bsky.social

Perplexity has been the go-to tool for about 90% of queries for the last few months. Especially if I want a quick explanation of a concept or if I want to check for alternatives.
Has saved me a ton of effort.

December 27, 2024 at 2:39 PM

Ravi V

@aidelve.bsky.social

We have been writing JS for 10+ years on server and client. Still waiting for the "lots of runtime errors" to happen.
Validation libs on the API layer keep it sane on the server.
On the client side, with frameworks in place, it's not difficult to enforce consistency.
We did try TS but never saw the point.

December 12, 2024 at 2:49 PM

Ravi V

@aidelve.bsky.social

Next step will be the query mechanisms.
Considering MongoDB's full-text search as well.
So search would be on text, vectors, and graphs.
Given the unstructured nature of the data we think multiple query mechanisms would be a good approach.
More thoughts later as we discover things.
(6/6)

December 10, 2024 at 11:11 AM
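
The thread does not say how results from the text, vector, and graph searches would be merged. One common option, swapped in here purely for illustration, is reciprocal rank fusion (RRF): each backend returns a ranked list of emailIds, and an id's fused score is the sum of 1 / (k + rank) over the lists it appears in. The backend names, the sample ids, and the constant k = 60 are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: Dict[str, List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of ids into one list, best first.

    rankings maps a backend name (hypothetical: "text", "vector", "graph")
    to that backend's ranked emailIds.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, email_id in enumerate(ranked_ids, start=1):
            scores[email_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is stable.
    return sorted(scores, key=lambda email_id: (-scores[email_id], email_id))


fused = reciprocal_rank_fusion({
    "text": ["e1", "e2", "e3"],
    "vector": ["e2", "e1"],
    "graph": ["e2", "e4"],
})
```

RRF only needs ranks, so it sidesteps the problem that full-text scores, vector distances, and graph matches are not on a comparable scale.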

Ravi V

@aidelve.bsky.social

Step 5
Use the relationships and entities to populate the GraphDB.
We are going with Neo4j for now. We could move it to Weaviate, but right now the thinking is that we will require a full-featured GraphDB.
Both Weaviate and Neo4j store the emailId generated by MongoDB to ensure traceability.
(5/n)

December 10, 2024 at 11:06 AM
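
A minimal sketch of how extracted relationships might be turned into parameterized Cypher statements that carry the emailId for traceability. The `Entity` label, the `RELATES_TO` relationship type, and the property names are hypothetical placeholders; the thread does not describe the actual graph model.

```python
from typing import Dict, List, Tuple

# Hypothetical schema: label, relationship type, and property names are
# illustrative only, not the thread author's Neo4j model.
MERGE_TRIPLE = (
    "MERGE (h:Entity {name: $head}) "
    "MERGE (t:Entity {name: $tail}) "
    "MERGE (h)-[r:RELATES_TO {type: $relation, emailId: $emailId}]->(t)"
)


def triples_to_statements(
    email_id: str, triples: List[Tuple[str, str, str]]
) -> List[Tuple[str, Dict[str, str]]]:
    """Pair the MERGE query with one parameter dict per (head, relation, tail)."""
    statements = []
    for head, relation, tail in triples:
        params = {"head": head, "relation": relation, "tail": tail, "emailId": email_id}
        statements.append((MERGE_TRIPLE, params))
    return statements


statements = triples_to_statements("abc123", [("Alice", "works_at", "Acme")])
```

Each (query, params) pair could then be handed to a Neo4j driver session. Using MERGE rather than CREATE keeps repeated runs over the same email from duplicating nodes.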

Ravi V

@aidelve.bsky.social

Step 4
Create vector embeddings from the Mongo data (email and attachment text) and store them in Weaviate.
Again, there are a lot of choices in chunking and embedding models. The Weaviate documentation is helpful in understanding them. More on the choices to be made later.
(4/n)

December 10, 2024 at 11:02 AM
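
As one example of the chunking choices mentioned above, here is a minimal sketch of fixed-size word windows with overlap, where each chunk keeps the source emailId. The window size, the overlap, and the field names are assumptions; the thread does not say which chunking strategy was chosen.

```python
from typing import Dict, List


def chunk_text(text: str, email_id: str, size: int = 200, overlap: int = 40) -> List[Dict[str, object]]:
    """Split text into overlapping word windows, each tagged with its emailId."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: List[Dict[str, object]] = []
    for start in range(0, len(words), step):
        window = words[start:start + size]
        chunks.append({"emailId": email_id, "position": len(chunks), "text": " ".join(window)})
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_text(sample, "abc123", size=4, overlap=1)
```

Each chunk dict could be embedded and stored as one vector object, with the emailId linking it back to the MongoDB record.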

Ravi V

@aidelve.bsky.social

Step 3
Use a spaCy/REBEL pipeline to extract
1. Entities
2. Relationships
Experimenting with different entity and relationship extraction models. Will publish the findings later.
Update Mongo with the extracted text, entities, relationships, and tables.
(3/n)

December 10, 2024 at 10:58 AM
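
A rough sketch of what the Mongo update in this step might look like, expressed as a filter and a `$set` update document. The field names under `extracted`, the shape of the entity and relationship records, and the assumption that the emailId is the document's `_id` are all illustrative; the thread does not show the actual schema.

```python
from typing import Any, Dict, List, Tuple


def build_extraction_update(
    email_id: str,
    text: str,
    entities: List[Tuple[str, str]],
    relationships: List[Tuple[str, str, str]],
    tables: List[List[List[str]]],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return a (filter, update) pair for one email document."""
    filter_doc = {"_id": email_id}  # assumption: emailId is the Mongo _id
    update_doc = {
        "$set": {
            "extracted": {
                "text": text,
                "entities": [{"text": t, "label": label} for t, label in entities],
                "relationships": [
                    {"head": h, "type": r, "tail": t} for h, r, t in relationships
                ],
                "tables": tables,  # each table as rows of cell strings
            }
        }
    }
    return filter_doc, update_doc


filter_doc, update_doc = build_extraction_update(
    "abc123",
    "Alice works at Acme.",
    entities=[("Alice", "PERSON"), ("Acme", "ORG")],
    relationships=[("Alice", "works_at", "Acme")],
    tables=[],
)
```

The pair could be passed to a MongoDB client's single-document update call. Keeping everything under one `extracted` field makes it easy to re-run extraction with a different model and overwrite the result.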

Ravi V

@aidelve.bsky.social

Step 2
For each attachment we use a Docling/spaCy pipeline to extract
1. Text
2. Tables
3. Images (planned)
Right now, converting XLS to PDF and extracting tables using spaCy Layout seems to be working well.
(2/n)

December 10, 2024 at 10:54 AM
