141 dependents
| Package | Description | Downloads/month |
|---|---|---|
| | A framework for few-shot evaluation of language models. | 1.4M |
| | A Neural Framework for MT Evaluation | 272K |
| | Models and examples built with TensorFlow | 145K |
| | Models and examples built with TensorFlow | 68K |
| | Model analysis tools for TensorFlow | 64K |
| | A streamlined and customizable framework for efficient large model (LLM, VLM, AI... | 45K |
| | String-to-String Algorithms for Natural Language Processing | 30K |
| | Code for the paper "Exploring the Limits of Transfer Learning with a Unified Tex... | 28K |
| | FAIR Sequence Modeling Toolkit 2 | 23K |
| | Open Source Neural Machine Translation and (Large) Language Models in PyTorch | 23K |
| | Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backend... | 22K |
| | The Learning Interpretability Tool: Interactively analyze ML models to understan... | 21K |
| | Open-source, AI-enhanced CAT tool with multi-LLM support, translation memory, gl... | 18K |
| | One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio T... | 16K |
| | A framework for evaluating language models - packaged by NVIDIA | 14K |
| | The robust European language model benchmark. | 13K |
| | Find informative examples to efficiently (human)-evaluate NLG models. | 12K |
| | OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards | 11K |
| | OpenCompass is an LLM evaluation platform, supporting a wide range of models (LL... | 10K |
| | AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evalu... | 9K |
| | A full-stack, agentic workflow programming platform. Built for vibe-coding and w... | 7K |
| | Comprehensive LLM evaluation at scale: A production-ready framework for evaluati... | 6K |
| | LLM Evaluation Framework | 6K |
| | | 6K |
| | OpenCompass is an LLM evaluation platform, supporting a wide range of models (Ll... | 5K |
| | The robust European language model benchmark. | 5K |
| | A simple, consistent, and extendable module for IndicTrans2 compatible with HF m... | 5K |
| | Multi-metric evaluation toolkit supporting MT, ASR, TTS, SimulST, VC, and Parali... | 5K |
| | Dynamiq is an orchestration framework for agentic AI and LLM applications | 5K |
| | Flexible RAG (Retrieval-Augmented Generation) framework for building AI applicat... | 4K |
| | Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Stream... | 3K |
| | DashAI: an interactive platform for training, evaluating and deploying AI models | 3K |
| | Evals is a framework for evaluating LLMs and LLM systems, and an open-source reg... | 3K |
| | HoloDeck - Experimentation-driven agent experimentation and deployment | 3K |
| | Automated Hyperparameter Optimization Platform for Efficient LLM Fine-Tuning | 3K |
| | ADyFT(Auto Dynamic Fine Tuning) automates parameter-efficient fine-tuning of Lar... | 3K |
| | OpenSTBench is an evaluation toolkit centered on translation and speech translat... | 2K |
| | SimulEval: A Flexible Toolkit for Automated Machine Translation Evaluation | 2K |
| | Uncertainty Estimation Toolkit for Transformer Language Models | 2K |
| | Modern LLM model evaluation for Transformers, SGLang, vLLM, TensorRT-LLM, llama.... | 2K |
| | Contain functions and classes to efficiently train a sequence to sequence to tra... | 2K |
| | FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensibl... | 1K |
| | Automate your RAG research. | 1K |
| | | 1K |
| | convert embedding vectors back to text | 1K |
| | An evaluation framework for Serbian Whisper models. | 1K |
| | Evaluate simultaneous speech/text translation systems (shortform and longform) w... | 1K |
| | A transcoding library using LLMs. | 1K |
| | This library provides a comprehensive suite of metrics to evaluate the performan... | 1K |
| | Easily benchmark Machine Learning models on selected tasks and datasets | 1K |