Ankit
ankits0052.bsky.social
AI Research Enthusiast | Multimedia Analysis | LLMs
Associate director at Accenture.
PhD in LTI CMU, prev: Google, Bosch, MERL, ARM
Looking for the next breakthrough that will lead to AGI - understanding why LLMs actually work
ankitshah009.github.io
3. SuperAGI is a dev-first open source autonomous AI agent framework to build, manage & run useful autonomous agents. You can run concurrent agents seamlessly and extend agent capabilities with tools.
March 13, 2025 at 6:51 AM
2. AG2 is an open-source framework for building and coordinating multiple AI agents using LLMs, supporting tool use and human-in-the-loop interaction.

It simplifies agent creation, communication, and workflow management through pre-built patterns and configurable options.
March 13, 2025 at 6:51 AM
3. A top alignment rate of 84.3% was measured on the ASIMOV Benchmark using generated constitutions, outperforming both no-constitution baselines and human-written constitutions.
March 13, 2025 at 5:11 AM
A framework to automatically generate robot constitutions from real-world data to steer a robot's behavior using Constitutional AI mechanisms.
March 13, 2025 at 5:11 AM
2. The ASIMOV benchmark is a large-scale, comprehensive collection of datasets for evaluating and improving the semantic safety of foundation models that serve as robot brains. Its data covers undesirable situations drawn from real-world visual scenes, for better robot scene understanding.
March 13, 2025 at 5:11 AM
New Dataset & Benchmarks:
1. ASIMOV Dataset for measuring safety implications of robotic actions in real-world scenarios.
March 13, 2025 at 5:11 AM
Limitations of Gemini Robotics-ER stated in the report include difficulty with spatial relationships across long videos and some way still to go on fine-grained robot control.
March 13, 2025 at 5:10 AM
ERQA (Embodied Reasoning Question Answering) is the benchmark introduced for embodied reasoning in VLMs. It has over 400 MCVQs covering spatial and action reasoning, trajectory reasoning, state estimation, task reasoning and more. It's similar to existing VLM benchmarks.
March 13, 2025 at 5:10 AM
The Gemini Robotics-ER VLM enables spatial understanding, trajectory prediction, precise pointing and multi-view. It lays the groundwork for real-world robotics applications via zero-shot and few-shot adaptation for perception, planning and code generation to control robot embodiments.
March 13, 2025 at 5:09 AM
They also introduced a new dataset & framework for robot constitutions👇

New Models:
Gemini Robotics taps into Gemini's world understanding to generalize to novel situations and solve a wide variety of tasks out of the box, including tasks it has never seen before in training.
March 13, 2025 at 5:09 AM
10. Idea Refinement:

"Take my rough concept—[e.g., 'a platform for decentralized education']—and explore similar ideas on X and the web. Provide a 500-word report on existing implementations, potential challenges, and 5 actionable next steps for development."
March 2, 2025 at 2:22 AM
9. Image-Inspired Writing:

"Search X for the 5 most shared images related to [theme, e.g., climate change impacts] in the last week. For each, write a 200-word fictional vignette inspired by the image, and ask if I’d like you to generate a complementary image."
March 2, 2025 at 2:22 AM
8. Debate Prep:

"Find the 10 most influential X posts on [controversial issue, e.g., universal basic income] from the past month. Summarize each stance, then draft two 300-word opposing arguments I can refine for a debate script."
March 2, 2025 at 2:22 AM
7. Historical Context:

"Research the evolution of [concept, e.g., cryptocurrency regulation] over the past 5 years using X posts and web sources. Create a timeline with 10 key events and a 700-word narrative explaining their significance for a blog post."
March 2, 2025 at 2:21 AM
6. Document Breakdown:

"Analyze this uploaded PDF—[assume user uploads a research paper]—and extract its main arguments, methodology, and conclusions. Then, write a 400-word critique assessing its strengths and gaps, suggesting 3 follow-up research questions."
March 2, 2025 at 2:21 AM
5. Creative Brainstorming:

"Generate 15 unique story ideas based on emerging trends in [field, e.g., biotechnology]. For each, provide a one-sentence premise, a potential protagonist, and a key conflict, drawing from current web and X conversations."
March 2, 2025 at 2:20 AM