47 dependents
| Description | Downloads/month |
|---|---|
| DeepTeam is a framework to red team LLMs and LLM systems. | 56K |
| RESTAI, so many 'A's and 'I's, so little time... | 17K |
| A multi-backend evaluation framework for LLM, RAG, and agentic systems. | 4K |
| IaNL: Infrastructure as Natural Language — AI agent that provisions AWS resource... | 4K |
| Framework for prototyping of LLM-based applications | 4K |
| PNNL Auto Multi-Agent AI: Dynamic multi-agent system for building applications | 3K |
| Benchmark framework for AI coding assistants | 3K |
| HoloDeck - Experimentation-driven agent experimentation and deployment | 3K |
| The testing platform for AI teams. Bring engineers, PMs, and domain experts toge... | 2K |
| A powerful and flexible framework for building, orchestrating, and deploying mul... | 2K |
| Open source observability and auto-evals for AI agents | 1K |
| Evaluation framework for CoreAI. | 1K |
| OpenTelemetry GenAI Utils | 1K |
| Service to examine data processing pipelines (e.g., machine learning or deep lea... | 1K |
| Additional packages (components, document stores and the likes) to extend the ca... | 902 |
| A comprehensive evaluation framework for AI systems | 803 |
| RAG Evaluation Framework using Ragas metrics and MLflow tracking | 698 |
| llama-index callbacks deepeval integration | 572 |
| DeepEval integration adapter for Metrics Computation Engine | 440 |
| Building DB, asking natural language questions through agents, and evaluate | 413 |
| Project GenEval: A Unified Evaluation Framework for Generative AI Applications | 376 |
| | 374 |
| Probing Stylistic Appropriation in Large Language Models (PSALM): An LLM-as-a-Ju... | 361 |
| Automatic Prompt Optimization Framework | 314 |
| A Python framework designed for both generating and evaluating hints. | 301 |
| free google results | 255 |
| This library is to search the best parameters across different steps of the RAG ... | 221 |
| Pre-deploy CI/CD QA dashboard for evaluating LLM and AI-agent outputs | 213 |
| XRAG: eXamining the Core - Benchmarking Foundational Component Modules in Advanc... | 204 |
| Toolkit for Active Learning in Generative Tasks | 203 |
| A comprehensive framework for evaluating Large Language Models with built-in sup... | 148 |
| MetaBeeAI LLM Pipeline for PDF processing and data extraction | 132 |
| HuGME is an easy-to-use LLM assessment framework, which explicitly rates the Hun... | 129 |
| | 125 |
| A powerful web content fetcher and processor | 117 |
| Python framework for building AI agents with tool integration, multi-agent workf... | 111 |
| A MCP server project | 103 |
| Uniform access layer for LLMs | 102 |
| Similarity evaluator using Deepevalve | 99 |
| RAGMeter is a universal evaluation toolkit designed to assess the performance of... | 93 |
| Add your description here | 85 |
| Validation framework with feedback loop | 80 |
| CLI tool to evaluate LLM factuality on MMLU benchmark. | 67 |
| My first RAGAAS package and test | 62 |
| A package for evaluating text files. | 5 |
| Nordlys is an AI lab building a Mixture of Models. This repository contains the ... | 4 |
| Python framework for building AI agents with tool integration, multi-agent workf... | 1 |