280 dependents
| Description | Downloads/month |
|---|---|
| A framework for few-shot evaluation of language models. | 1.4M |
| Fast and Accurate ML in 3 Lines of Code | 859K |
| Efficient few-shot learning with Sentence Transformers | 214K |
| Therapeutics Commons (TDC): Multimodal Foundation for Therapeutic Science | 127K |
| Interact with the Databricks Generative AI APIs in python | 71K |
| AMD Quark is a comprehensive cross-platform toolkit designed to simplify and enh... | 65K |
| | 36K |
| This is an unofficial, use-at-your-own risks port of the webarena benchmark, for... | 34K |
| 🤗 AutoTrain Advanced | 34K |
| 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, ... | 33K |
| This is an unofficial, use-at-your-own risks port of the visualwebarena benchmar... | 33K |
| Go ahead and axolotl questions | 20K |
| One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio T... | 16K |
| TensorRT LLM provides users with an easy-to-use Python API to define Large Langu... | 16K |
| SpanMarker for Named Entity Recognition | 15K |
| A framework for evaluating language models - packaged by NVIDIA | 14K |
| The robust European language model benchmark. | 13K |
| Python SDK for Galileo's NLP and CV Studio. | 11K |
| OpenCompass is an LLM evaluation platform, supporting a wide range of models (LL... | 10K |
| | 9K |
| AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evalu... | 9K |
| Benchmark evaluators split from wisent | 8K |
| Build on large language models faster | 8K |
| OpenSportsLib is the professional library, designed for advanced video understan... | 7K |
| My digital memory | 6K |
| An open-source evaluation framework for voice agents | 6K |
| LLM Evaluation Framework | 6K |
| | 6K |
| OpenCompass is an LLM evaluation platform, supporting a wide range of models (Ll... | 5K |
| DAG-guided discrete diffusion language models for reasoning | 5K |
| The robust European language model benchmark. | 5K |
| Text metrics | 4K |
| ConvoKit is a toolkit for extracting conversational features and analyzing socia... | 4K |
| Brando's Ultimate Utils for Science, Machine Learning, and AI | 4K |
| Genomic Language Model Mitigates Chimera Artifacts in Nanopore Direct RNA Sequen... | 4K |
| UniMERNet: A Universal Network for Real-World Mathematical Expression Recognitio... | 3K |
| Advanced Machine Learning Training Platform - IN DEVELOPMENT | 3K |
| DashAI: an interactive platform for training, evaluating and deploying AI models | 3K |
| Embeddings: State-of-the-art Text Representations for Natural Language Processin... | 3K |
| Evals is a framework for evaluating LLMs and LLM systems, and an open-source reg... | 3K |
| Backend library for conversational AI in biomedicine | 3K |
| | 3K |
| HoloDeck - Experimentation-driven agent experimentation and deployment | 3K |
| Modern Data Centric AI system for Large Language Models | 3K |
| An open-source NLP framework that offers high-level wrappers designed for effort... | 2K |
| Comprehensive NLP Evaluation System | 2K |
| A Python package for audio analysis and machine learning-based audio classificat... | 2K |
| A single model for all your molecular design tasks | 2K |
| a iflytek ailab library ... | 2K |
| Automated Evaluation of RAG Systems | 2K |