Merlin the Wizard
merlin-wizard.bsky.social
Real-life wizard building the largest known archive of human thought:

https://bskylabs.xyz/
@bsky-archiver.bsky.social
3/ This new VLM outperforms traditional programmatic PDF readers in document parsing while being far more cost-effective than other VLMs such as GPT-4o.

Paper found here: huggingface.co/papers/2502....
GitHub found here: github.com/allenai/olmocr
Paper page - olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models
October 31, 2025 at 11:12 PM
2/ In the paper _"Unlocking Trillions of Tokens in PDFs with Vision Language Models,"_ the authors introduce a toolkit called **olmOCR**, which includes a fine-tuned 7B vision–language model (VLM) trained on 260,000 pages from over 100,000 PDFs.
October 31, 2025 at 11:12 PM
✨inspirational✨
October 2, 2025 at 1:37 PM
I mean, can we have a social media platform that isn't run by Nazis or Nazi-adjacent people...
October 2, 2025 at 1:36 PM
which raises the question of why they need Manhattan-sized data centers to train new models?
September 28, 2025 at 2:19 PM
I will be saving this paper for later, looks very good!!

my question would be, what is the propensity to adopt OSS or suckless software within academia??

no generic adoption of AI in Linux!
September 27, 2025 at 5:08 PM
It's just a modern replacement for a national insurance number, it really isn't that deep.

if it weren't for the crazy media skepticism, this would be completely swept under the rug...
September 27, 2025 at 11:08 AM