Chris Pal
@chrisjpal.bsky.social
Professor,
Mila, Polytechnique Montreal, DIRO, UdeM
Distinguished Scientist, ServiceNow Research
https://sites.google.com/view/christopher-pal
Reposted by Chris Pal
2025 BERT is NeoBERT! We have fully pre-trained a next-generation encoder for 2.1T tokens with the latest advances in data, training, and architecture. This is a heroic effort from my PhD student, Lola Le Breton, in collaboration with Quentin Fournier and Mariam El Mezouar (1/n)
February 28, 2025 at 4:30 PM
Reposted by Chris Pal
🎉 Excited to introduce BigDocs!
An open, transparent multimodal dataset designed for:
📄 Documents
🌐 Web content
🖥️ GUI understanding
👨‍💻 Code generation from images
We’re also launching BigDocs-Bench:
➡️ Document, Web, GUI Visual reasoning
➡️ Converting images into JSON, Markdown, LaTeX, SVG, and more!
December 10, 2024 at 6:34 PM
LLMs have a lot of potential for science, but scientists can be particularly sensitive to factuality, nuances, and hallucinations. The new ScholarQABench benchmark in this paper looks pretty useful for the community to monitor progress on LLMs for science. arxiv.org/html/2411.14199
November 25, 2024 at 1:20 AM