Arize AI
@arize.bsky.social
Arize is an AI engineering platform focused on evaluation and observability. It helps engineers develop, evaluate, and observe AI applications and agents.
By popular demand, we're covering 🎓 LLM-as-a-judge 101 🎓 in our next workshop! RSVP: luma.com/tmipn699

Learn how to design your eval from scratch -- including what to measure, which model to use, how to prompt effectively, and how to improve your eval.
November 11, 2025 at 4:44 PM
Thanks to Mastra for organizing the first conference for TypeScript AI developers! If you missed it, don't fret: we have an upcoming event with Mastra at GitHub HQ in SF on building and evaluating TypeScript agents: luma.com/l3n0qg61
November 11, 2025 at 2:10 AM
We're excited to be headed to @databricksinc.bsky.social Data + AI Summit next week!

Aparna has a session on building & evaluating self-improving agents with Arize, Databricks MLflow, & Mosaic AI.

Info here: www.databricks.com/dataaisummit...
June 4, 2025 at 8:10 PM
Reposted by Arize AI
🐳
@arize.bsky.social OSS Prompt Playground
@arize-phoenix.bsky.social gets DeepSeek support! Now you can compare outputs of all the top-tier reasoning models.

Which LLM provider would you like to see next? Let us know on GitHub!

github.com/Arize-ai/pho...
May 29, 2025 at 3:09 PM
🚀 Get ready to learn from a powerhouse group of speakers at Shack15 in SF on June 25. Builders, researchers, and leaders from @anthropic.com @microsoft.com @llamaindex.bsky.social (+ many more).

Get tickets: arize.com/observe-2025
June 3, 2025 at 2:34 AM
Reposted by Arize AI
Learn to prompt better
May 7, 2025 at 7:26 PM
Chicago event alert: We’re bringing Chicago’s AI builders together at Google’s office for hands-on sessions focused on advancing LLM-powered agents. 🚀

Join us May 19. Space is limited!

Register: lu.ma/d6mo5zxs
April 29, 2025 at 5:23 PM
Reposted by Arize AI
I'll be speaking at Arize:Observe at SHACK15 on June 25! Looking forward to exploring what’s next for AI agents & assistants. More details on my session to come. @arize.bsky.social

arize.com/observe-2025
April 14, 2025 at 3:01 PM
Join us 6/25 in SF for a full-day event focused on agent reliability and evaluation.

Hear from the people building the next generation of AI systems. It's a conference by engineers, for engineers.

Most of our speakers are on the site. 👀

Register: arize.com/observe-2025/
April 17, 2025 at 9:07 PM
Demo your app at this year's Observe! Fill out a short application by 4/30 to be considered for our Demo Den. Great opportunity to showcase your work to the AI community in SF.

Apply here: docs.google.com/forms/d/e/1F...
March 28, 2025 at 9:11 PM
Automate LLM performance with Arize and NVIDIA NeMo. 😎

More on that here: arize.com/blog/arize-n...
Self-Improving Agents: Automating LLM Performance Optimization using Arize and NVIDIA NeMo
The Arize integration of NVIDIA NeMo empowers AI teams with an automated, self-improving AI data flywheel to enhance LLM performance.
arize.com
March 18, 2025 at 8:17 PM
Reposted by Arize AI
For all my NYC friends! 🗽🍎

We're hosting in-person office hours tomorrow, all about LLM and Agent Evals.

Join for the free snacks/drinks, stay for the heated discussions about the validity of Pokemon-based model evaluations ⚡️🐀
LLM Evals Office Hours with Arize · Luma
Join us for an open coworking session focused on LLM and Agent Evaluations! Whether you're actively working on evaluation strategies or just exploring the…
lu.ma
March 18, 2025 at 6:20 PM
Reposted by Arize AI
🤖 Building agents, but not sure how to measure their performance?

Our newest blog post on @hf.co has you covered!

This post shows you how to use @arize-phoenix.bsky.social to trace and evaluate your smolagents.

Credit to @srichavali.bsky.social and @aymeric-roucher.bsky.social
February 28, 2025 at 5:19 PM
Managing state and memory in LLM applications is one of the biggest challenges in AI development. From conversation history to semantic switching, choosing the right approach can make or break user experience, cost, and performance.

Our latest: arize.com/blog/memory-...
Memory and State in LLM Applications
What memory really means in LLM applications, how it relates to state management, and an overview of different approaches.
arize.com
February 26, 2025 at 7:42 PM
Join us in SF on June 25 for the must-attend AI observability and evaluation event of the year!

Past speakers have included top builders and researchers driving AI innovation and tackling its most important challenges. Be a part of the conversation shaping AI’s future.

arize.com/observe-2025/
Observe 2025
Join top AI researchers and builders on June 25 in San Francisco. Explore the latest in AI observability, evaluation, and agent reliability.
arize.com
February 6, 2025 at 4:57 PM
Reposted by Arize AI
Another full house at GitHub with @arize.bsky.social!
January 16, 2025 at 2:45 AM
Quick guide to the EU AI Act for AI teams. A few things we break down in here are the risk categories (the core of the Act), navigating transparency requirements, newly drafted guidelines for general-purpose AI, and more. 👇

arize.com/blog/quick-g...
Quick Guide to the EU AI Act for AI Teams
If you have any interaction with the EU market, the EU AI Act probably applies to you. This guide will help unpack the risk categories.
arize.com
January 16, 2025 at 8:43 PM
New tutorial out on agentic RAG, using @arize-phoenix.bsky.social to help you understand what's happening under the hood.

In this example, we also use @llamaindex.bsky.social to simplify query engine creation for structured and unstructured data.

www.youtube.com/watch?v=1_73...
Understanding Agentic RAG
YouTube video by Arize AI
www.youtube.com
December 20, 2024 at 10:18 PM
Integrating LLM evaluations into CI/CD pipelines can ensure reliable, consistent AI + also help you automate experimental results. Here's a tutorial with an example using @arize-phoenix.bsky.social

arize.com/blog/how-to-...
How to Add LLM Evaluations to CI/CD Pipelines
Learn how Continuous Integration and Continuous Deployment (CI/CD) can be used to evaluate large language models (LLMs) effectively.
arize.com
December 16, 2024 at 7:08 PM
If you're in the Bay Area, come out next month to our agents bootcamp @github.com HQ in SF. Hosting with @llamaindex.bsky.social + @groqinc.bsky.social 🙂

For anyone working on chatbots, virtual assistants, or complex decision-making systems.

lu.ma/agent-tracing
Developer Bootcamp: Advanced Techniques in Agent Tracing and Performance Evaluation · Luma
Join us in San Francisco for an exciting evening on building, refining, and deploying intelligent AI agents. Whether you’re working on chatbots, virtual…
lu.ma
December 12, 2024 at 10:46 PM