AgentOps
AgentOps is an observability platform for AI agents, providing monitoring, debugging, and evaluation to help developers optimize agent performance.
Monitoring and debugging tools for agent apps
AgentOps is an observability platform for AI agents, providing monitoring, debugging, and evaluation to help developers optimize agent performance.
Phoenix is an open-source observability and evaluation tool for LLM and agent applications, supporting online tracing and offline diagnosis.
DeepEval is an open-source evaluation framework for LLM applications. It provides rich evaluation metrics and tools, supporting unit testing and integration testing to help developers build reliable LLM applications.
Ragas is a framework for evaluating RAG (Retrieval Augmented Generation) systems. It provides various evaluation metrics including faithfulness, answer relevance, context precision, helping developers optimize RAG application performance.
Helicone is an open-source proxy and observability platform for LLM applications, offering request tracing, caching, and cost analytics.
Langfuse is an open-source observability platform for LLM applications, supporting tracing, evaluation, prompt versioning, and cost analytics.
TruLens is an open-source tool for evaluating and tracking LLM apps. It provides specialized evaluation for RAG applications including context relevance, groundedness, and answer relevance.