PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
JudgmentLabs
judgeval

The open source post-building layer for agents. Our environment data and evals power agent post-training (RL, SFT) and monitoring.

330K 1K 91
modelscope
ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).

171K 14K 1K
hud-evals
hud-python

OSS RL environment + evals toolkit

100K 248 57
AjayBandiwaddar
learnlens-rl

Universal evaluation layer for OpenEnv agentic RL environments. Measures what an agent learned - not just how much reward it accumulated.

2K 1 0
AjayBandiwaddar
learnlens

Universal evaluation layer for OpenEnv agentic RL environments. Measures what an agent learned - not just how much reward it accumulated.

1K 1 0
sail-sg
oat-llm

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

788 652 64
teilomillet
textpolicy

Reinforcement learning for text generation on MLX (Apple Silicon)

469 14 3
warlockee
oxrl

A lightweight post-training framework for LLMs and VLMs

306 16 1
kyegomez
r-torch

An open source implementation of R1

279 31 5
toolbrain
toolbrain

A framework for agentic tool use training with reinforcement learning

212 167 15
opendilab
lightrft

Light, Efficient, Omni-modal & Reward-model Driven Reinforcement Fine-Tuning Framework

120 297 10
adeelahmad
mlx-guided-grpo

Train reasoning models on your Mac. GRPO training framework for Apple Silicon with curriculum learning.

86 1 0
hud-evals
genteki-hdp

OSS RL environment + evals toolkit

70 248 57
The-Swarm-Corporation
open-parl

PARL (Parallel-Agent Reinforcement Learning) - A training paradigm for coordinating multiple agents in parallel workflows

66 38 3
Data from PyPI, GitHub, ClickHouse, and BigQuery