141 dependents
Package Description Downloads/month
A framework for few-shot evaluation of language models. 1.4M
A Neural Framework for MT Evaluation 272K
Models and examples built with TensorFlow 145K
Models and examples built with TensorFlow 68K
Model analysis tools for TensorFlow 64K
A streamlined and customizable framework for efficient large model (LLM, VLM, AI... 45K
String-to-String Algorithms for Natural Language Processing 30K
google-research t5 Code for the paper "Exploring the Limits of Transfer Learning with a Unified Tex... 28K
FAIR Sequence Modeling Toolkit 2 23K
Open Source Neural Machine Translation and (Large) Language Models in PyTorch 23K
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backend... 22K
The Learning Interpretability Tool: Interactively analyze ML models to understan... 21K
Open-source, AI-enhanced CAT tool with multi-LLM support, translation memory, gl... 18K
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio T... 16K
A framework for evaluating language models - packaged by NVIDIA 14K
The robust European language model benchmark. 13K
Find informative examples to efficiently (human)-evaluate NLG models. 12K
OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards 11K
OpenCompass is an LLM evaluation platform, supporting a wide range of models (LL... 10K
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evalu... 9K
A full-stack, agentic workflow programming platform. Built for vibe-coding and w... 7K
Comprehensive LLM evaluation at scale: A production-ready framework for evaluati... 6K
LLM Evaluation Framework 6K
(no description) 6K
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Ll... 5K
The robust European language model benchmark. 5K
A simple, consistent, and extendable module for IndicTrans2 compatible with HF m... 5K
Multi-metric evaluation toolkit supporting MT, ASR, TTS, SimulST, VC, and Parali... 5K
Dynamiq is an orchestration framework for agentic AI and LLM applications 5K
Flexible RAG (Retrieval-Augmented Generation) framework for building AI applicat... 4K
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Stream... 3K
DashAI: an interactive platform for training, evaluating and deploying AI models 3K
Evals is a framework for evaluating LLMs and LLM systems, and an open-source reg... 3K
HoloDeck - Experimentation-driven agent experimentation and deployment 3K
Automated Hyperparameter Optimization Platform for Efficient LLM Fine-Tuning 3K
ADyFT(Auto Dynamic Fine Tuning) automates parameter-efficient fine-tuning of Lar... 3K
OpenSTBench is an evaluation toolkit centered on translation and speech translat... 2K
SimulEval: A Flexible Toolkit for Automated Machine Translation Evaluation 2K
Uncertainty Estimation Toolkit for Transformer Language Models 2K
Modern LLM model evaluation for Transformers, SGLang, vLLM, TensorRT-LLM, llama.... 2K
Contain functions and classes to efficiently train a sequence to sequence to tra... 2K
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensibl... 1K
Automate your RAG research. 1K
(no description) 1K
convert embedding vectors back to text 1K
An evaluation framework for Serbian Whisper models. 1K
Evaluate simultaneous speech/text translation systems (shortform and longform) w... 1K
A transcoding library using LLMs. 1K
This library provides a comprehensive suite of metrics to evaluate the performan... 1K
Easily benchmark Machine Learning models on selected tasks and datasets 1K