Lightnews — Scholar-powered news

Thalassa

@th.alassa.pink

yeah i’m thinking this would work better as a two step process. maybe this is useful huggingface.co/jinaai/Reade...

jinaai/ReaderLM-v2 · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

December 14, 2025 at 12:45 AM

Philipp Krenn

@xeraa.mastodon.social.ap.brid.gy

I spy a #jinaai model on elastic cloud 🙃

December 13, 2025 at 1:19 AM

Sung Kim

@sungkim.bsky.social

Paper: arxiv.org/abs/2512.04032
Model: huggingface.co/jinaai/jina-...
Blog: jina.ai/news/jina-vl...

December 9, 2025 at 1:29 AM

Philipp Krenn

@xeraa.net

a more complete picture of how this all fits together, and where JinaAI will also fit in

October 9, 2025 at 2:01 PM

Philipp Krenn

@xeraa.mastodon.social.ap.brid.gy

kicking off #elasticon NYC. I’ll put the most relevant context (😉) in this thread; it will be a busy day
starting with joining forces with #jinaai https://www.elastic.co/blog/elastic-jina-ai 🤗

October 9, 2025 at 1:39 PM

Tim Kellogg

@timkellogg.me

multi-vector (late interaction) search like ColBERT also works, because it handles the predicate logic in cheaper latent space, but storage costs are a lot higher because, well, it’s multi-vector

(fwiw Qdrant and a few other vector DBs support multi-vectors)

huggingface.co/jinaai/jina-...

jinaai/jina-colbert-v2 · Hugging Face

August 31, 2025 at 11:07 AM
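To make the storage trade-off in the post above concrete, here is a minimal sketch of late-interaction (MaxSim) scoring in the style of ColBERT. The two-dimensional token vectors are made up for illustration, and the helper names (`normalize`, `maxsim_score`) are hypothetical, not part of any library. A multi-vector document stores one vector per token, which is where the extra storage comes from.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take its best
    match among the document token vectors, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy token embeddings (assumed values, not output of a real model).
query = [normalize(v) for v in [[1.0, 0.0], [0.0, 1.0]]]
doc_a = [normalize(v) for v in [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]]  # 3 stored vectors
doc_b = [normalize(v) for v in [[1.0, 1.0]]]                          # 1 stored vector

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a, score_b)
```

Here `doc_a` matches each query token exactly and scores higher, but it also stores three vectors where `doc_b` stores one.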

codegrokker.bsky.social

@codegrokker.bsky.social

🔍 Diving into vector search this week! Learned how to turn text into meaningful embeddings using fastembed and Qdrant.

Using jinaai/jina-embeddings-v2-small-en to generate 512-dimensional normalized vectors for semantic search 🧮 #llmzoomcamp

July 1, 2025 at 4:39 PM
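Because the vectors mentioned above are normalized, cosine similarity reduces to a plain dot product, which is what makes ranking cheap. The sketch below shows this with three-dimensional toy vectors standing in for the 512-dimensional embeddings. It does not call fastembed or Qdrant, and the document names and values are assumptions for illustration.

```python
import math


def l2_normalize(vec):
    """Return a unit-length copy of the vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Full cosine similarity, dividing by both vector norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy stand-ins for model output (assumed values).
query = l2_normalize([3.0, 4.0, 0.0])
docs = {
    "doc_close": l2_normalize([6.0, 8.0, 1.0]),
    "doc_far": l2_normalize([0.0, 1.0, 5.0]),
}

# On unit vectors the dot product alone gives the cosine ranking.
ranked = sorted(docs, key=lambda name: dot(query, docs[name]), reverse=True)
print(ranked)
```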

デイリーHuggingFaceトレンド

@huggingfacetrends.bsky.social

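Today's Hugging Face trend

jinaai/jina-embeddings-v4
This repository provides Jina Embeddings v4, a general-purpose embedding model developed by Jina AI.
The model is multimodal and multilingual, and specializes in retrieving complex documents that contain visual elements such as charts, tables, and illustrations.
It can also be used broadly for text matching and code-related tasks.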

jinaai/jina-embeddings-v4 · Hugging Face

July 1, 2025 at 10:20 AM

emmuzoo.bsky.social

@emmuzoo.bsky.social

Experimented with embedding models using fastembed. 🧠✨
Compared jinaai/jina-embeddings-v2-small-en vs BAAI/bge-small-en — same query, different similarities!
Model choice matters. 🔍
#LLMZoomcamp #DataTalksClub

June 30, 2025 at 4:09 PM

William J.B. Mattingly

@wjbmattingly.bsky.social

New multimodal embedding model from JinaAI has me really excited. This is a very large model, but may be really useful for certain projects. I'll be testing it out with LUX multimodal data in the next week.

Model: huggingface.co/jinaai/jina-...

June 26, 2025 at 7:45 PM

gaganarora.bsky.social

@gaganarora.bsky.social

Hence there are multiple embedding models available which we can assess to fit our needs.

Example for Unimodal which I used in this course - jinaai/jina-embeddings-v2-small-en

Multimodal models like CLIP can embed both text and images into the same space.

June 25, 2025 at 8:09 PM

Michael Günther

@michael-g-u.bsky.social

We released a new model: jina-embeddings-v4
- multilingual text-to-text and text-to-image search w/o modality gap
- also visual docs (e.g. pdfs, maps) - trained on a wider scope than DSE, ColPali, etc.
+ MRL, late interaction, etc.
🤗 huggingface.co/jinaai/jina-...
📄 arxiv.org/abs/2506.18902

jinaai/jina-embeddings-v4 · Hugging Face

June 25, 2025 at 2:53 PM
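The "MRL" item above refers to Matryoshka Representation Learning, where a model is trained so that a prefix of the embedding is still usable on its own. A minimal sketch of the consumer side follows: keep the leading components, then re-normalize. The function name `truncate_embedding` and the four-dimensional vector are hypothetical, and the post does not say which truncated sizes jina-embeddings-v4 supports.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize to unit length.

    This only makes sense for models trained with an MRL-style objective;
    for other models the leading dimensions carry no special meaning.
    """
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero prefix")
    return [x / norm for x in head]


# Toy unit-length embedding (assumed values, not real model output).
full = [0.5, 0.5, 0.5, 0.5]
short = truncate_embedding(full, 2)
print(short)
```

The shorter vector halves storage in this toy case, at the cost of whatever accuracy the dropped dimensions contributed.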

Sumit

@reachsumit.com

jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval

Jina AI introduces a 3.8B parameter multimodal embedding model that unifies text and image representations.

📝 arxiv.org/abs/2506.18902
👨🏽‍💻 huggingface.co/jinaai/jina-...

jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval

We introduce jina-embeddings-v4, a 3.8 billion parameter multimodal embedding model that unifies text and image representations through a novel architecture supporting both single-vector and multi-vec...

arxiv.org

June 24, 2025 at 4:23 AM

Michael Günther

@michael-g-u.bsky.social

New Multi-Modal Reranking Model (e.g. for text-to-image retrieval): jina.ai/news/jina-re...

Supports Multiple Languages and Dynamic Resolution (up to 4K)

🤗 huggingface.co/jinaai/jina-...

jina-reranker-m0: Multilingual Multimodal Document Reranker

Introducing jina-reranker-m0, our new multilingual multimodal reranker for retrieving visual documents, with SOTA performance on multilingual long documents and code searching tasks.

jina.ai

April 8, 2025 at 2:23 PM

Sumit

@reachsumit.com

ReaderLM-v2: Small Language Model for HTML to Markdown and JSON

Jina AI introduces a 1.5B parameter model that processes documents up to 512K tokens, transforming messy HTML into clean Markdown or JSON with high accuracy.

📝 arxiv.org/abs/2503.01151
👨🏽‍💻 huggingface.co/jinaai/Reade...

jinaai/ReaderLM-v2 · Hugging Face

March 4, 2025 at 6:27 AM
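Since the model's input is raw HTML, it can help to strip obvious noise before conversion so fewer tokens are spent on scripts and styles. The sketch below is one possible pre-cleaning step using only the standard library. The regular expressions and the `clean_html` helper are assumptions for illustration, not the official preprocessing from the model card, and the cleaned string would then be passed to the model in a separate step that is not shown.

```python
import re

# Patterns for content that rarely contributes to the readable page.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)
STYLE_RE = re.compile(r"<style\b[^>]*>.*?</style\s*>", re.IGNORECASE | re.DOTALL)
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def clean_html(html):
    """Remove scripts, styles, and comments, then collapse whitespace."""
    for pattern in (SCRIPT_RE, STYLE_RE, COMMENT_RE):
        html = pattern.sub("", html)
    return re.sub(r"\s+", " ", html).strip()


raw = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><!-- ad --><p>Hello</p><script>track()</script></body></html>"
)
cleaned = clean_html(raw)
print(cleaned)
```

Regex-based cleaning is brittle on malformed markup. A real pipeline would more likely use an HTML parser for this step.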

Ian Wootten 🛸

@iwootten.bsky.social

Instructions for running locally here:

huggingface.co/jinaai/Reade...

jinaai/ReaderLM-v2 · Hugging Face

January 31, 2025 at 4:13 PM

デイリーHuggingFaceトレンド

@huggingfacetrends.bsky.social

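Today's Hugging Face trend

jinaai/ReaderLM-v2
This repository publishes ReaderLM-v2, a 1.5-billion-parameter language model developed by Jina AI.
ReaderLM-v2 specializes in converting HTML to Markdown or JSON more accurately and with support for long documents, and can be used for text extraction and parsing tasks.
It is also available via an API and cloud services.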

jinaai/ReaderLM-v2 · Hugging Face

January 23, 2025 at 10:16 AM

デイリーHuggingFaceトレンド

@huggingfacetrends.bsky.social

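Today's Hugging Face trend

jinaai/ReaderLM-v2
This repository publishes ReaderLM-v2, a 1.5-billion-parameter language model developed by Jina AI.
It aims to convert HTML to Markdown or JSON with high accuracy, supports multiple languages, and specializes in text extraction and conversion.
It can also be used via an API, Colab, and cloud environments.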

jinaai/ReaderLM-v2 · Hugging Face

January 21, 2025 at 10:15 AM

デイリーHuggingFaceトレンド

@huggingfacetrends.bsky.social

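Today's Hugging Face trend

jinaai/ReaderLM-v2
This repository covers ReaderLM-v2, a 1.5-billion-parameter language model developed by Jina AI.
ReaderLM-v2 specializes in converting HTML to Markdown or JSON; it is multilingual and has improved handling of longer contexts.
Instructions for using it via the API and Colab are also provided.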

jinaai/ReaderLM-v2 · Hugging Face

January 19, 2025 at 10:14 AM

沫沫

@test7855.bsky.social

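Sharing an open-source small model built specifically for converting HTML to Markdown and JSON

1. Handles long text and supports complex formatting such as tables, nested lists, and LaTeX formulas

2. Fairly stable, with no repetition or looping problems

3. Supports 29 languages, including English, Chinese, Japanese, Korean, French, Spanish, Portuguese, German, Italian, Russian, Vietnamese, Thai, and Arabic

Suited to scenarios that need batch processing of web pages or automated extraction of web page data

Model: huggingface.co/jinaai/ReaderLM-v2

#網頁轉Markdown #網頁轉JSON #ReaderLMv2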

January 17, 2025 at 11:23 AM

デイリーHuggingFaceトレンド

@huggingfacetrends.bsky.social

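Today's Hugging Face trend

jinaai/ReaderLM-v2
This repository publishes ReaderLM-v2, a 1.5-billion-parameter language model developed by Jina AI.
ReaderLM-v2 specializes in converting HTML to Markdown or JSON, supports multiple languages, and can process long documents more accurately.
Usage instructions are provided for the Hugging Face Transformers library and for access via the API.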

jinaai/ReaderLM-v2 · Hugging Face

January 17, 2025 at 10:15 AM

Jim RB

@jbohnslav.bsky.social

jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images
arxiv: arxiv.org/abs/2412.08802
HF: huggingface.co/jinaai/jina-...

jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images

Contrastive Language-Image Pretraining (CLIP) is a highly effective method for aligning images and texts in a shared embedding space. These models are widely used for tasks such as cross-modal informa...

arxiv.org

December 17, 2024 at 3:20 PM

Sumit

@reachsumit.com

JINA-CLIP-V2: Multilingual Multimodal Embeddings for Text and Images

Jina AI presents an improved CLIP model that combines multilingual text and image understanding with efficient embedding compression.

📝 arxiv.org/abs/2412.08802
👨🏽‍💻 huggingface.co/jinaai/jina-...

jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images

Contrastive Language-Image Pretraining (CLIP) is a highly effective method for aligning images and texts in a shared embedding space. These models are widely used for tasks such as cross-modal informa...

arxiv.org

December 13, 2024 at 4:07 AM

Tom Aarsen

@tomaarsen.com

🧩 Integrated in Sentence Transformers as well as Jina's API.
🏦 CC BY-NC 4.0 non-commercial license - you can contact Jina for commercial use.

Very nice work by the Jina team 💪 They have an updated technical report coming soon too!
Check out the model here: huggingface.co/jinaai/jina-...

jinaai/jina-clip-v2 · Hugging Face

November 25, 2024 at 9:43 AM
