PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
MakazhanAlpamys
soup-cli

Soup turns the pain of LLM fine-tuning into a simple workflow. One config, one command, done.

11K 53 7
altaidevorg
afterimage

Generate conversational, tool-calling, structured-output, and preference datasets — easily and at scale

3K 36 1
Goekdeniz-Guelmez
mlx-lm-lora

Train LLMs on Apple silicon with MLX and the Hugging Face Hub

3K 335 42
oumi-ai
oumi

Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!

2K 9K 760
dannylee1020
openpo

Build high quality synthetic datasets with AI feedback from 200+ LLMs

1K 27 0
sail-sg
oat-llm

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

788 652 64
armbues
sillm-mlx

Running and training LLMs on Apple Silicon via MLX

334 286 26
warlockee
oxrl

A lightweight post-training framework for LLMs and VLMs

306 16 1
liuxiaotong
knowlyr-sandbox

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

252 3 0
liuxiaotong
knowlyr-hub

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

246 3 0
liuxiaotong
knowlyr-core

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

244 3 0
liuxiaotong
knowlyr-recorder

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

240 3 0
TUDB-Labs
mlora-cli

The CLI tools for the mLoRA system.

228 376 66
liuxiaotong
knowlyr-reward

Gymnasium-style RL framework for LLM agent training — MDP environments, three-layer process reward & SFT/DPO/GRPO policy optimization. CLI + MCP ready.

222 3 0
toolbrain
toolbrain

A framework for agentic tool use training with reinforcement learning

212 167 15
ServiceNow
sygra

SyGra - Graph-oriented Synthetic data generation Pipeline

158 81 15
liuxiaotong
knowlyr-trainer

PyTorch-based trainer for Agent trajectory datasets — SFT, DPO, GRPO

114 3 0
li-plus
flash-pref

Accelerate LLM preference tuning via prefix sharing with a single line of code

108 52 0
Data from PyPI, GitHub, ClickHouse, and BigQuery