197 dependents
Package Description Downloads/month
A framework for few-shot evaluation of language models. 1.4M
Interact with the Databricks Generative AI APIs in python 71K
Model analysis tools for TensorFlow 64K
A streamlined and customizable framework for efficient large model (LLM, VLM, AI... 45K
🤗 AutoTrain Advanced 34K
mellea is a library for writing generative programs 33K
String-to-String Algorithms for Natural Language Processing 30K
google-research t5: Code for the paper "Exploring the Limits of Transfer Learning with a Unified Tex... 28K
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backend... 22K
The Learning Interpretability Tool: Interactively analyze ML models to understan... 21K
DataRobot Monitoring and Moderation framework 16K
A framework for evaluating language models - packaged by NVIDIA 14K
The robust European language model benchmark. 13K
Runsight Agent OS Core Engine 12K
OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards 11K
OpenCompass is an LLM evaluation platform, supporting a wide range of models (LL... 10K
9K
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evalu... 9K
A high level scripting API for bot builders, developers, and maintainers. 8K
A full-stack, agentic workflow programming platform. Built for vibe-coding and w... 7K
InstructLab Core package. Use this to chat with a model and execute the Instruc... 7K
Python framework which enables you to transform how a user calls or infers an IB... 6K
LLM Evaluation Framework 6K
Convert scientific posters (PDF/images) to structured JSON metadata using Large ... 6K
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Ll... 5K
The robust European language model benchmark. 5K
Simple, Pythonic building blocks to evaluate LLM applications. 5K
Dynamiq is an orchestration framework for agentic AI and LLM applications 5K
Holistic Evaluation of Language Models (HELM) is an open source Python framework... 4K
Build high-quality LLM apps - from prototyping, testing to production deployment... 4K
Advanced Machine Learning Training Platform - IN DEVELOPMENT 3K
Backend library for conversational AI in biomedicine 3K
A graph-based toolkit for evaluating LLM and RAG outputs with repeatable quality... 3K
HoloDeck - Experimentation-driven agent experimentation and deployment 3K
Automated Hyperparameter Optimization Platform for Efficient LLM Fine-Tuning 3K
We help GenAI teams maintain high-accuracy for their Models in production. 3K
ADyFT(Auto Dynamic Fine Tuning) automates parameter-efficient fine-tuning of Lar... 3K
We help GenAI teams maintain high-accuracy for their Models in production. 2K
Comprehensive NLP Evaluation System 2K
LangFair is a Python library for conducting use-case level LLM bias and fairness... 2K
Uncertainty Estimation Toolkit for Transformer Language Models 2K
TruthTorchLM is an open-source library designed to assess truthfulness in langua... 2K
Contain functions and classes to efficiently train a sequence to sequence to tra... 2K
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensibl... 1K
Automate your RAG research. 1K
1K
convert embedding vectors back to text 1K
RadEval: A framework for radiology text evaluation 1K
An evaluation framework for Serbian Whisper models. 1K
Korean AI Project 1K