Dependents of datasets

2,210 dependents

Package	Description	Downloads/month
sglang	SGLang is a high-performance serving framework for large language models and mul...	287.7M
swebench	SWE-bench: Can Language Models Resolve Real-world Github Issues?	38.3M
trl	Train transformer language models with reinforcement learning.	3.8M
mteb	MTEB: Massive Text Embedding Benchmark	2.7M
verifiers	Our library for RL environments + evals	2.3M
mini-swe-agent	The 100 line AI agent that solves GitHub issues or helps you in your command lin...	2M
unsloth	Web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt...	1.9M
unsloth-zoo	Web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt...	1.4M
lm-eval	A framework for few-shot evaluation of language models.	1.4M
ragas	Supercharge Your LLM Application Evaluations 🚀	1.3M
pyspark-huggingface	A DataSource for reading and writing HuggingFace Datasets in Spark	1M
harbor	A framework for evaluating and optimizing agents and models using sandboxed envi...	775K
pyiqa	🔎 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including PSNR, SSIM, LPIPS,...	487K
flagembedding	Retrieval and Retrieval-augmented LLMs	425K
torchtune	PyTorch native post-training library	405K
ossdata	Scalable SWE datasets	356K
mlx-vlm	MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VL...	349K
aiperf	AIPerf is a package for performance testing of AI models	285K
llmcompressor	Transformers-compatible library for applying various compression algorithms to L...	285K
argilla	The Argilla python server SDK	277K
colbert-ai	ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22...	233K
setfit	Efficient few-shot learning with Sentence Transformers	214K
lerobot	🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning	204K
ms-swift	Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, ...	171K
wisent	This is an open-source version of the representation engineering framework for s...	167K
reward-kit	A Python library for defining, testing, and using reward functions	144K
pytdc	Therapeutics Commons (TDC): Multimodal Foundation for Therapeutic Science	127K
voxcpm	VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice D...	122K
autoawq	AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup du...	122K
verl	verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework	122K
f5-tts	Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech wi...	104K
ghostos	A framework that offers an OS simulator within a Python code interface for AI agents	96K
inspect-evals	Collection of evals for Inspect AI	96K
sae-lens	Training Sparse Autoencoders on Language Models	92K
transformer-lens	An implementation of transformers tailored for mechanistic interpretability.	89K
pyrit	The Python Risk Identification Tool for LLMs (PyRIT) is a library used to assess...	77K
auto-gptq	An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ ...	75K
garak	the LLM vulnerability scanner	73K
tinker-cookbook	Post-training with Tinker	73K
simpletransformers	Transformers for Information Retrieval, Text Classification, NER, QA, Language M...	73K
pylate	Late Interaction Models Training & Retrieval	71K
databricks-genai	Interact with the Databricks Generative AI APIs in python	71K
auto-round	A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessl...	71K
google-tunix	A Lightweight LLM Post-Training Library	68K
tmnt	Topic modeling neural toolkit	63K
opik-optimizer	Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic wor...	60K
dreadnode	Dreadnode Strikes SDK	59K
torchtitan	A PyTorch native platform for training generative AI models	54K
pyserini	Pyserini is a Python toolkit for reproducible information retrieval research wit...	50K
axolotl-contribs-lgpl	LGPL contributions to the axolotl framework	49K