#jinaai
yeah, i’m thinking this would work better as a two-step process. maybe this is useful: huggingface.co/jinaai/Reade...
jinaai/ReaderLM-v2 · Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
December 14, 2025 at 12:45 AM
I spy a #jinaai model on elastic cloud 🙃
December 13, 2025 at 1:19 AM
December 9, 2025 at 1:29 AM
a more complete picture of how this all fits together, and where JinaAI will also fit in
October 9, 2025 at 2:01 PM
kicking off #elasticon NYC. I’ll put the most relevant context (😉) in this thread — it will be a busy day
starting with joining forces with #jinaai https://www.elastic.co/blog/elastic-jina-ai 🤗
October 9, 2025 at 1:39 PM
multi-vector (late interaction) search like ColBERT also works, because it handles the predicate logic in a cheaper latent space, but storage costs are a lot higher because, well, it’s multi-vector

(fwiw Qdrant and a few other vector DBs support multi-vectors)

huggingface.co/jinaai/jina-...
jinaai/jina-colbert-v2 · Hugging Face
August 31, 2025 at 11:07 AM
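For context on the post above: ColBERT-style late interaction scores a document by summing, over query tokens, each token's best match (MaxSim) against the document's token vectors — the multi-vector storage cost buys this token-level matching. A toy NumPy sketch (vectors are made up, not real model output):

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """ColBERT-style late interaction: for each query token embedding,
    take its best (max) dot product against all document token
    embeddings, then sum over query tokens."""
    # (num_query_tokens, num_doc_tokens) similarity matrix
    sims = query_vecs @ doc_vecs.T
    return float(sims.max(axis=1).sum())

# Toy example: 2 query tokens, 3 doc tokens, 4-dim embeddings
q = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
d = np.array([[0.9, 0.1, 0.0, 0.0],
              [0.0, 0.8, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
print(round(maxsim_score(q, d), 2))  # → 1.7 (0.9 + 0.8)
```

Note the storage implication: every document keeps one vector per token instead of one vector total, which is why the post flags the cost.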
🔍 Diving into vector search this week! Learned how to turn text into meaningful embeddings using fastembed and Qdrant.

Using jinaai/jina-embeddings-v2-small-en to generate 512-dimensional normalized vectors for semantic search 🧮 #llmzoomcamp
July 1, 2025 at 4:39 PM
Today's Hugging Face trends

jinaai/jina-embeddings-v4
This repository provides "Jina Embeddings v4," a general-purpose embedding model developed by Jina AI.
The model is multimodal and multilingual, and specializes in retrieval over complex documents containing visual elements such as graphs, tables, and illustrations.
It can also be used broadly for text matching and code-related tasks.
jinaai/jina-embeddings-v4 · Hugging Face
July 1, 2025 at 10:20 AM
Experimented with embedding models using fastembed. 🧠✨
Compared jinaai/jina-embeddings-v2-small-en vs BAAI/bge-small-en — same query, different similarities!
Model choice matters. 🔍
#LLMZoomcamp #DataTalksClub
June 30, 2025 at 4:09 PM
New multimodal embedding model from JinaAI has me really excited. This is a very large model, but may be really useful for certain projects. I'll be testing it out with LUX multimodal data in the next week.

Model: huggingface.co/jinaai/jina-...
June 26, 2025 at 7:45 PM
Hence there are multiple embedding models available that we can assess to fit our needs.

Example for Unimodal which I used in this course - jinaai/jina-embeddings-v2-small-en

Multimodal models like CLIP can embed both text and images into the same space.
June 25, 2025 at 8:09 PM
We released a new model: jina-embeddings-v4
- multilingual text-to-text and text-to-image search w/o modality gap
- also visual docs (e.g. pdfs, maps) - trained on a wider scope than DSE, ColPali, etc.
+ MRL, late interaction, etc.
🤗 huggingface.co/jinaai/jina-...
📄 arxiv.org/abs/2506.18902
jinaai/jina-embeddings-v4 · Hugging Face
June 25, 2025 at 2:53 PM
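The "+ MRL" in the post above refers to Matryoshka Representation Learning: the model's single-vector outputs can be truncated to a prefix and renormalized, trading some retrieval quality for much less storage. A toy sketch of the truncation step (the 2048 → 256 dimensions are illustrative, and the random vector is a stand-in for a real embedding):

```python
import numpy as np

def truncate_mrl(embedding: np.ndarray, dim: int) -> np.ndarray:
    """Matryoshka-style truncation: keep the first `dim` components,
    then L2-renormalize so dot-product similarity still works."""
    head = embedding[:dim]
    return head / np.linalg.norm(head)

# Toy unit vector standing in for a single-vector embedding output
rng = np.random.default_rng(42)
full = rng.standard_normal(2048)
full /= np.linalg.norm(full)

small = truncate_mrl(full, 256)  # 8x less storage per vector
print(small.shape)  # → (256,)
```

Because MRL-trained models pack the most important information into the leading dimensions, truncation degrades gracefully rather than arbitrarily.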
jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval

Jina AI introduces a 3.8B parameter multimodal embedding model that unifies text and image representations.

📝 arxiv.org/abs/2506.18902
👨🏽‍💻 huggingface.co/jinaai/jina-...
June 24, 2025 at 4:23 AM
New Multi-Modal Reranking Model (e.g. for text-to-image retrieval): jina.ai/news/jina-re...

Supports Multiple Languages and Dynamic Resolution (up to 4K)

🤗 huggingface.co/jinaai/jina-...
jina-reranker-m0: Multilingual Multimodal Document Reranker
Introducing jina-reranker-m0, our new multilingual multimodal reranker for retrieving visual documents, with SOTA performance on multilingual long documents and code searching tasks.
April 8, 2025 at 2:23 PM
ReaderLM-v2: Small Language Model for HTML to Markdown and JSON

Jina AI introduces a 1.5B parameter model that processes documents up to 512K tokens, transforming messy HTML into clean Markdown or JSON with high accuracy.

📝 arxiv.org/abs/2503.01151
👨🏽‍💻 huggingface.co/jinaai/Reade...
jinaai/ReaderLM-v2 · Hugging Face
March 4, 2025 at 6:27 AM
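A common pre-processing step before feeding raw HTML to a model like ReaderLM-v2 is stripping scripts, styles, and comments so the token budget goes to content markup. A minimal sketch with Python's standard library (the regex patterns are my own illustration, not Jina's official helper):

```python
import re

def clean_html(html: str) -> str:
    """Drop <script>/<style> blocks and HTML comments so the model
    sees only content markup; collapse runs of whitespace."""
    html = re.sub(r"<script\b[^>]*>.*?</script>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    html = re.sub(r"<style\b[^>]*>.*?</style>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    return re.sub(r"\s{2,}", " ", html).strip()

raw = "<html><script>var x=1;</script><!-- ad --><p>Hello  world</p></html>"
print(clean_html(raw))  # → <html><p>Hello world</p></html>
```

The cleaned string would then go into the model's HTML-to-Markdown/JSON prompt.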
Instructions for running locally here:

huggingface.co/jinaai/Reade...
jinaai/ReaderLM-v2 · Hugging Face
January 31, 2025 at 4:13 PM
Today's Hugging Face trends

jinaai/ReaderLM-v2
This repository releases ReaderLM-v2, a 1.5-billion-parameter language model developed by Jina AI.
ReaderLM-v2 specializes in converting HTML to Markdown or JSON more accurately and with support for long documents, and can be used for text extraction and parsing tasks.
It is also available via API and cloud services.
jinaai/ReaderLM-v2 · Hugging Face
January 23, 2025 at 10:16 AM
Today's Hugging Face trends

jinaai/ReaderLM-v2
This repository publishes ReaderLM-v2, a 1.5-billion-parameter language model developed by Jina AI.
It aims at high-accuracy conversion of HTML to Markdown or JSON, supports multiple languages, and specializes in text extraction and transformation.
It can also be used via API, Colab, and cloud environments.
jinaai/ReaderLM-v2 · Hugging Face
January 21, 2025 at 10:15 AM
Today's Hugging Face trends

jinaai/ReaderLM-v2
This repository concerns ReaderLM-v2, a 1.5-billion-parameter language model developed by Jina AI.
ReaderLM-v2 specializes in converting HTML to Markdown or JSON, is multilingual, and has improved long-context processing.
Instructions for using it via the API and Colab are also provided.
jinaai/ReaderLM-v2 · Hugging Face
January 19, 2025 at 10:14 AM
Open-source share: a small model specialized in converting HTML to Markdown and JSON

1. Handles long texts and supports complex formats such as tables, nested lists, and LaTeX formulas

2. Good stability, with no repetition or looping issues

3. Supports 29 languages, including English, Chinese, Japanese, Korean, French, Spanish, Portuguese, German, Italian, Russian, Vietnamese, Thai, Arabic, and more

Well suited for batch-processing web pages or automated web data extraction

Model: huggingface.co/jinaai/ReaderLM-v2

#HTMLtoMarkdown #HTMLtoJSON #ReaderLMv2
January 17, 2025 at 11:23 AM
Today's Hugging Face trends

jinaai/ReaderLM-v2
This repository publishes ReaderLM-v2, a 1.5-billion-parameter language model developed by Jina AI.
ReaderLM-v2 specializes in converting HTML to Markdown or JSON, supports multiple languages, and handles longer documents with greater accuracy.
Instructions are provided for use with the Hugging Face Transformers library and via the API.
jinaai/ReaderLM-v2 · Hugging Face
January 17, 2025 at 10:15 AM
JINA-CLIP-V2: Multilingual Multimodal Embeddings for Text and Images

Jina AI presents an improved CLIP model that combines multilingual text and image understanding with efficient embedding compression.

📝 arxiv.org/abs/2412.08802
👨🏽‍💻 huggingface.co/jinaai/jina-...
December 13, 2024 at 4:07 AM
🧩 Integrated in Sentence Transformers as well as Jina's API.
🏦 CC BY-NC 4.0 non-commercial license - you can contact Jina for commercial use.

Very nice work by the Jina team 💪 They have an updated technical report coming soon too!
Check out the model here: huggingface.co/jinaai/jina-...
jinaai/jina-clip-v2 · Hugging Face
November 25, 2024 at 9:43 AM