That’s the vision.
🔗 arxiv.org/abs/2503.21889
📝 tinyurl.com/3utdbn97
Thanks to @joanrod.bsky.social, @perouz.bsky.social, @spandanagella.bsky.social and all co-authors!
#AI #VLM #WorkflowAutomation #Sketch2Flow #arXiv
• Models struggle most with handwritten & whiteboard sketches
• UI screenshots are easiest
• End-to-end generation beats decomposed pipelines
• Finetuning on diverse sketch data is key to generalization
📈 Finetuned open models outperform proprietary ones:
Qwen2.5-VL-7B → FlowSim: 0.614
GPT-4o → FlowSim: 0.786
𝐐𝐰𝐞𝐧𝟐.𝟓-𝐕𝐋-𝟕𝐁 (𝐟𝐢𝐧𝐞𝐭𝐮𝐧𝐞𝐝) → 𝐅𝐥𝐨𝐰𝐒𝐢𝐦: 𝟎.𝟗𝟓𝟕
• Synthetic (Graphviz)
• Manual (hand-drawn)
• Whiteboard
• Digital
• UI screenshots
These were paired with structured JSON workflow outputs for training and evaluation.
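As a rough illustration of that pairing, here is a minimal sketch of one training record: a sketch image, its style, and the JSON workflow target. The field names (`image_path`, `style`, `steps`, `inputs`) and the style labels are assumptions for illustration, not the schema used in the paper.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical labels for the five sketch styles listed above.
SKETCH_STYLES = ("synthetic", "manual", "whiteboard", "digital", "ui_screenshot")


@dataclass
class Step:
    """One workflow step; the fields are illustrative assumptions."""
    name: str
    inputs: dict = field(default_factory=dict)


@dataclass
class Sample:
    """A sketch image paired with its structured workflow target."""
    image_path: str
    style: str
    workflow: list  # list of Step objects

    def __post_init__(self):
        # Reject style labels outside the assumed set.
        if self.style not in SKETCH_STYLES:
            raise ValueError(f"unknown sketch style: {self.style}")

    def target_json(self) -> str:
        # Serialize the workflow as the JSON string a model would be trained to emit.
        return json.dumps({"steps": [asdict(s) for s in self.workflow]}, sort_keys=True)
```

A record built this way can be round-tripped with `json.loads(sample.target_json())` to check that the target is valid JSON before it is used for training or evaluation.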
Workflow automation is powerful—but authoring flows is still complex, even with low-code tools.
💫𝐒𝐭𝐚𝐫𝐅𝐥𝐨𝐰 explores a simpler interface: 𝐣𝐮𝐬𝐭 𝐝𝐫𝐚𝐰 𝐢𝐭.
Imagine sketching a workflow on a whiteboard and getting a runnable flow in return.
* Build balanced training datasets for real-world tasks
* Learn how to handle data imbalance
* Get insights on how to design for at-scale deployment
arxiv.org/abs/2501.04652
* One retriever for many use cases
* Works across languages! 🌍
* Handles structured data like workflows
* Lightweight & fast for production
* Generalizes to new domains & tasks
Multi-task instruction fine-tuning FTW! Our approach beats both BM25 and strong off-the-shelf encoder models across all retrieval tasks (in-distribution and out-of-distribution).
* RAG needs domain-specific knowledge
* Multiple apps = multiple retrievers = 💰
* Different types of data (steps, tables, fields, ...)
If this sounds exciting, follow us! We’ve got more papers and insights on the way—don’t miss out! 🚀
1. Outlining the workflow structure
2. Populating inputs for each step
Each sub-task is easier to solve and test, boosting the system’s modularity and maintainability.
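A minimal sketch of that two-stage split, assuming each stage is a plain callable that can be swapped or stubbed. The function names, the stub outputs, and the output shape are hypothetical, not the authors' implementation; in a real system each callable would wrap a model call.

```python
from typing import Callable

Outline = list[str]        # ordered step names
Inputs = dict[str, str]    # input name -> value for one step


def generate_workflow(
    request: str,
    outline_fn: Callable[[str], Outline],
    populate_fn: Callable[[str, str], Inputs],
) -> dict:
    """Stage 1 outlines the structure; stage 2 fills in inputs for each step."""
    step_names = outline_fn(request)
    steps = [{"name": name, "inputs": populate_fn(request, name)} for name in step_names]
    return {"steps": steps}


# Hypothetical stand-ins for the two stages, useful for testing each in isolation.
def stub_outline(request: str) -> Outline:
    return ["lookup_user", "send_email"]


def stub_populate(request: str, step_name: str) -> Inputs:
    if step_name == "lookup_user":
        return {"query": request}
    return {"to": "user.email"}
```

Because the stages only meet in `generate_workflow`, each one can be evaluated or replaced on its own, which is the modularity the post describes.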