Dependents of rouge-score

197 dependents

Package	Description	Downloads/month
lm-eval	A framework for few-shot evaluation of language models.	1.4M
databricks-genai	Interact with the Databricks Generative AI APIs in Python	71K
tensorflow-model-analysis	Model analysis tools for TensorFlow	64K
evalscope	A streamlined and customizable framework for efficient large model (LLM, VLM, AI...	45K
autotrain-advanced	🤗 AutoTrain Advanced	34K
mellea	mellea is a library for writing generative programs	33K
string2string	String-to-String Algorithms for Natural Language Processing	30K
t5	Code for the paper "Exploring the Limits of Transfer Learning with a Unified Tex...	28K
lighteval	Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backend...	22K
lit-nlp	The Learning Interpretability Tool: Interactively analyze ML models to understan...	21K
datarobot-moderations	DataRobot Monitoring and Moderation framework	16K
nvidia-lm-eval	A framework for evaluating language models - packaged by NVIDIA	14K
scandeval	The robust European language model benchmark.	13K
runsight-core	Runsight Agent OS Core Engine	12K
py-openjudge	OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards	11K
ms-opencompass	OpenCompass is an LLM evaluation platform, supporting a wide range of models (LL...	10K
domino-code-assist		9K
autorag	AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evalu...	9K
dfcx-scrapi	A high level scripting API for bot builders, developers, and maintainers.	8K
orcheo	A full-stack, agentic workflow programming platform. Built for vibe-coding and w...	7K
instructlab	InstructLab Core package. Use this to chat with a model and execute the Instruc...	7K
granite-io	Python framework which enables you to transform how a user calls or infers an IB...	6K
llm-eval-toolkit	LLM Evaluation Framework	6K
poster2json	Convert scientific posters (PDF/images) to structured JSON metadata using Large ...	6K
opencompass	OpenCompass is an LLM evaluation platform, supporting a wide range of models (Ll...	5K
euroeval	The robust European language model benchmark.	5K
langcheck	Simple, Pythonic building blocks to evaluate LLM applications.	5K
dynamiq	Dynamiq is an orchestration framework for agentic AI and LLM applications	5K
crfm-helm	Holistic Evaluation of Language Models (HELM) is an open source Python framework...	4K
promptflow-evals	Build high-quality LLM apps - from prototyping, testing to production deployment...	4K
aitraining	Advanced Machine Learning Training Platform - IN DEVELOPMENT	3K
biochatter	Backend library for conversational AI in biomedicine	3K
nexa-gauge	A graph-based toolkit for evaluating LLM and RAG outputs with repeatable quality...	3K
holodeck-ai	HoloDeck - Experimentation-driven agent experimentation and deployment	3K
ellora	Automated Hyperparameter Optimization Platform for Efficient LLM Fine-Tuning	3K
ai-evaluation	We help GenAI teams maintain high accuracy for their models in production.	3K
auto-lora	ADyFT(Auto Dynamic Fine Tuning) automates parameter-efficient fine-tuning of Lar...	3K
futureagi	We help GenAI teams maintain high accuracy for their models in production.	2K
jury	Comprehensive NLP Evaluation System	2K
langfair	LangFair is a Python library for conducting use-case level LLM bias and fairness...	2K
lm-polygraph	Uncertainty Estimation Toolkit for Transformer Language Models	2K
truthtorchlm	TruthTorchLM is an open-source library designed to assess truthfulness in langua...	2K
translate-package	Contain functions and classes to efficiently train a sequence to sequence to tra...	2K
flagai	FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensibl...	1K
autorag-research	Automate your RAG research.	1K
pharia-studio-sdk		1K
vec2text	Convert embedding vectors back to text	1K
radeval	RadEval: A framework for radiology text evaluation	1K
whisper-eval-serbian	An evaluation framework for Serbian Whisper models.	1K
koai	Korean AI Project	1K