Dependents of evaluate

280 dependents

Package	Description	Downloads/month
lm-eval	A framework for few-shot evaluation of language models.	1.4M
autogluon-multimodal	Fast and Accurate ML in 3 Lines of Code	859K
setfit	Efficient few-shot learning with Sentence Transformers	214K
pytdc	Therapeutics Commons (TDC): Multimodal Foundation for Therapeutic Science	127K
databricks-genai	Interact with the Databricks Generative AI APIs in Python	71K
amd-quark	AMD Quark is a comprehensive cross-platform toolkit designed to simplify and enh...	65K
mtmtrain		36K
libwebarena	This is an unofficial, use-at-your-own-risk port of the webarena benchmark, for...	34K
autotrain-advanced	🤗 AutoTrain Advanced	34K
unitxt	🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, ...	33K
libvisualwebarena	This is an unofficial, use-at-your-own-risk port of the visualwebarena benchmar...	33K
axolotl	Go ahead and axolotl questions	20K
lmms-eval	One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio T...	16K
tensorrt-llm	TensorRT LLM provides users with an easy-to-use Python API to define Large Langu...	16K
span-marker	SpanMarker for Named Entity Recognition	15K
nvidia-lm-eval	A framework for evaluating language models - packaged by NVIDIA	14K
scandeval	The robust European language model benchmark.	13K
dataquality	Python SDK for Galileo's NLP and CV Studio.	11K
ms-opencompass	OpenCompass is an LLM evaluation platform, supporting a wide range of models (LL...	10K
domino-code-assist		9K
autorag	AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evalu...	9K
wisent-evaluators	Benchmark evaluators split from wisent	8K
lamini	Build on large language models faster	8K
opensportslib	OpenSportsLib is the professional library, designed for advanced video understan...	7K
bellek	My digital memory	6K
calibrate-agent	An open-source evaluation framework for voice agents	6K
llm-eval-toolkit	LLM Evaluation Framework	6K
flexeval		6K
opencompass	OpenCompass is an LLM evaluation platform, supporting a wide range of models (Ll...	5K
dllm-reason	DAG-guided discrete diffusion language models for reasoning	5K
euroeval	The robust European language model benchmark.	5K
docling-metrics-text	Text metrics	4K
convokit	ConvoKit is a toolkit for extracting conversational features and analyzing socia...	4K
ultimate-utils	Brando's Ultimate Utils for Science, Machine Learning, and AI	4K
deepchopper	Genomic Language Model Mitigates Chimera Artifacts in Nanopore Direct RNA Sequen...	4K
unimernet	UniMERNet: A Universal Network for Real-World Mathematical Expression Recognitio...	3K
aitraining	Advanced Machine Learning Training Platform - IN DEVELOPMENT	3K
dashai	DashAI: an interactive platform for training, evaluating and deploying AI models	3K
clarinpl-embeddings	Embeddings: State-of-the-art Text Representations for Natural Language Processin...	3K
evals	Evals is a framework for evaluating LLMs and LLM systems, and an open-source reg...	3K
biochatter	Backend library for conversational AI in biomedicine	3K
text2story		3K
holodeck-ai	HoloDeck - Experimentation-driven agent experimentation and deployment	3K
open-dataflow	Modern Data Centric AI system for Large Language Models	3K
urartu	An open-source NLP framework that offers high-level wrappers designed for effort...	2K
jury	Comprehensive NLP Evaluation System	2K
bioamla	A Python package for audio analysis and machine learning-based audio classificat...	2K
safe-mol	A single model for all your molecular design tasks	2K
pyatp	a iflytek ailab library ...	2K
ares-ai	Automated Evaluation of RAG Systems	2K