Lightnews — Scholar-powered news

Roopal Garg

@roopalgarg.bsky.social

880 followers 880 following 5 posts

Multimodal Multi-lingual research at Google DeepMind for Gemini post-training.
#NLProc #Multimodal


Pinned

Roopal Garg @roopalgarg.bsky.social · Nov 21

📢 Excited to unveil our latest research, ImageInWords (IIW)! 🚀 We're pushing the boundaries of image descriptions with a new seeded, sequential, human-in-the-loop approach producing SoTA, articulate, hyper-detailed descriptions.

arXiv: arxiv.org/abs/2405.02793
#NLProc #ComputerVision #Multimodal

ImageInWords: Unlocking Hyper-Detailed Image Descriptions

Despite the longstanding adage "an image is worth a thousand words," generating accurate hyper-detailed image descriptions remains unsolved. Trained on short web-scraped image text, vision-language mo...

arxiv.org

Reposted by Roopal Garg

Jeff Dean

@jeffdean.bsky.social

🥁Introducing Gemini 2.5, our most intelligent model with impressive capabilities in advanced reasoning and coding.

Now integrating thinking capabilities, 2.5 Pro Experimental is our most performant Gemini model yet. It’s #1 on the LM Arena leaderboard. 🥇

March 25, 2025 at 5:25 PM

Roopal Garg

@roopalgarg.bsky.social

folks working on one or more of the following

🖼️ Image Descriptions to improve Image-Text alignment
AND/OR
💬 Multi/Cross Lingual image-text understanding/generation
AND/OR
🌏 Geo-Cultural representation and learning

Please DM if you are willing to discuss the current state/challenges/future-work.

November 25, 2024 at 6:57 AM

Reposted by Roopal Garg

Marc Lanctot

@sharky6000.bsky.social

New starter pack! go.bsky.app/GZ4hZzu

October 28, 2024 at 9:43 AM

Roopal Garg

@roopalgarg.bsky.social

We had a great experience presenting our work on ImageInWords to the community at #EMNLP2024. Thank you everyone for stopping by! 🙏 Looking forward to future work and seeing image descriptions as a foundational multi-modal task! @emnlpmeeting.bsky.social @deep-mind.bsky.social #NLProc #Multimodal

November 23, 2024 at 10:53 PM

Reposted by Roopal Garg

ACL

@aclmeeting.bsky.social

All the ACL chapters are here now: @aaclmeeting.bsky.social @emnlpmeeting.bsky.social @eaclmeeting.bsky.social @naaclmeeting.bsky.social #NLProc

November 19, 2024 at 3:48 AM

Reposted by Roopal Garg

Robert Riachi

@robertriachi.bsky.social

hello new followers! we’re actively hiring on our generative media team in Mountain View: boards.greenhouse.io/deepmind/job...

we work on image, video, audio, etc… come work with us if you’re interested! apply asap :)

Research Engineer, GenMedia

Mountain View, California, US

boards.greenhouse.io

November 22, 2024 at 6:08 AM

Roopal Garg

@roopalgarg.bsky.social

ImageInWords: Unlocking Hyper-Detailed Image Descriptions

arxiv.org

November 21, 2024 at 12:26 AM
