197 dependents
| Package | Description | Downloads/month |
|---|---|---|
| | A framework for few-shot evaluation of language models. | 1.4M |
| | Interact with the Databricks Generative AI APIs in python | 71K |
| | Model analysis tools for TensorFlow | 64K |
| | A streamlined and customizable framework for efficient large model (LLM, VLM, AI... | 45K |
| | 🤗 AutoTrain Advanced | 34K |
| | mellea is a library for writing generative programs | 33K |
| | String-to-String Algorithms for Natural Language Processing | 30K |
| | Code for the paper "Exploring the Limits of Transfer Learning with a Unified Tex... | 28K |
| | Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backend... | 22K |
| | The Learning Interpretability Tool: Interactively analyze ML models to understan... | 21K |
| | DataRobot Monitoring and Moderation framework | 16K |
| | A framework for evaluating language models - packaged by NVIDIA | 14K |
| | The robust European language model benchmark. | 13K |
| | Runsight Agent OS Core Engine | 12K |
| | OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards | 11K |
| | OpenCompass is an LLM evaluation platform, supporting a wide range of models (LL... | 10K |
| | | 9K |
| | AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evalu... | 9K |
| | A high level scripting API for bot builders, developers, and maintainers. | 8K |
| | A full-stack, agentic workflow programming platform. Built for vibe-coding and w... | 7K |
| | InstructLab Core package. Use this to chat with a model and execute the Instruc... | 7K |
| | Python framework which enables you to transform how a user calls or infers an IB... | 6K |
| | LLM Evaluation Framework | 6K |
| | Convert scientific posters (PDF/images) to structured JSON metadata using Large ... | 6K |
| | OpenCompass is an LLM evaluation platform, supporting a wide range of models (Ll... | 5K |
| | The robust European language model benchmark. | 5K |
| | Simple, Pythonic building blocks to evaluate LLM applications. | 5K |
| | Dynamiq is an orchestration framework for agentic AI and LLM applications | 5K |
| | Holistic Evaluation of Language Models (HELM) is an open source Python framework... | 4K |
| | Build high-quality LLM apps - from prototyping, testing to production deployment... | 4K |
| | Advanced Machine Learning Training Platform - IN DEVELOPMENT | 3K |
| | Backend library for conversational AI in biomedicine | 3K |
| | A graph-based toolkit for evaluating LLM and RAG outputs with repeatable quality... | 3K |
| | HoloDeck - Experimentation-driven agent experimentation and deployment | 3K |
| | Automated Hyperparameter Optimization Platform for Efficient LLM Fine-Tuning | 3K |
| | We help GenAI teams maintain high-accuracy for their Models in production. | 3K |
| | ADyFT(Auto Dynamic Fine Tuning) automates parameter-efficient fine-tuning of Lar... | 3K |
| | We help GenAI teams maintain high-accuracy for their Models in production. | 2K |
| | Comprehensive NLP Evaluation System | 2K |
| | LangFair is a Python library for conducting use-case level LLM bias and fairness... | 2K |
| | Uncertainty Estimation Toolkit for Transformer Language Models | 2K |
| | TruthTorchLM is an open-source library designed to assess truthfulness in langua... | 2K |
| | Contain functions and classes to efficiently train a sequence to sequence to tra... | 2K |
| | FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensibl... | 1K |
| | Automate your RAG research. | 1K |
| | | 1K |
| | convert embedding vectors back to text | 1K |
| | RadEval: A framework for radiology text evaluation | 1K |
| | An evaluation framework for Serbian Whisper models. | 1K |
| | Korean AI Project | 1K |