PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
Blaizzy
mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

349K 5K 506
Blaizzy
mlx-audio

A text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech processing on Apple Silicon.

78K 7K 578
filipstrand
mflux

MLX native implementations of state-of-the-art generative image models

40K 2K 141
cubist38
mlx-openai-server

A high-performance API server exposing OpenAI-compatible endpoints for MLX models. Built in Python on FastAPI, it offers an efficient, scalable, and user-friendly way to run MLX-based vision and language models locally behind an OpenAI-compatible interface.

22K 325 58
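"OpenAI-compatible endpoints" means standard chat-completions clients work against the local server unchanged. A minimal sketch of what such a request looks like; the port, path, and model name below are assumptions for illustration, not taken from mlx-openai-server's docs:

```python
import json

def chat_request(model, prompt, base_url="http://localhost:8000/v1"):
    """Build the URL and JSON body for an OpenAI-style chat completion.
    The default port and the /v1 path here are assumptions -- check the
    server's README for the real values."""
    url = f"{base_url}/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(body)

url, payload = chat_request("mlx-community/Llama-3.2-3B-Instruct-4bit",
                            "Describe this repo in one line.")
```

Because the wire format matches OpenAI's, existing SDKs only need their base URL pointed at the local server.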
raullenchai
rapid-mlx

The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool parsers, prompt cache, reasoning separation, cloud routing. Drop-in OpenAI replacement. Works with Claude Code, Cursor, Aider.

18K 635 74
ARahim3
mlx-tune

Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT, Embedding, and OCR fine-tuning — natively on MLX. Unsloth-compatible API.

14K 1K 79
jjang-ai
jang

JANG — GGUF for MLX (requires the JANG_Q runtime). Adaptive mixed-precision quantization and runtime for Apple Silicon

9K 142 20
tlkh
asitop

Perf monitoring CLI tool for Apple Silicon

8K 5K 205
wst24365888
libstreamvbyte

A C++ implementation of StreamVByte, with Python bindings.

7K 10 1
tanavc1
llm-autotune

Zero-config local LLM optimization for Ollama, LM Studio, and Apple Silicon MLX. Reduces TTFT by 40%, cuts wall time for local agents by 46%, and cuts RAM usage to roughly a third.

7K 24 1
wnsgus00114-droid
lightning-core

Lightning Core: macOS-first CUDA-style runtime with Metal backend

7K 1 0
hellobertrand
zxc-compress

High-performance asymmetric lossless compression. 40%+ faster decompression than LZ4 on ARM64 with better compression ratios. Optimized for game assets, firmware, and app bundles.

6K 333 7
arizawan
vidlizer

Point it at a video, image, or PDF — get structured JSON. uvx vidlizer[mcp]. Runs local (Ollama/gemma4, LM Studio, oMLX) or cloud (OpenRouter). CLI + MCP server for Claude Code, Cursor, and Claude Desktop.

5K 1 0
tillahoffmann
jax-mps

A JAX backend for Apple Metal Performance Shaders (MPS), enabling GPU-accelerated JAX computations on Apple Silicon.

5K 122 13
simonsysun
seeklink

SeekLink — hybrid semantic search for markdown vaults. Four-channel RRF fusion, MLX reranker, native CJK support. Fully local.

3K 6 0
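The "RRF fusion" in seeklink's blurb refers to Reciprocal Rank Fusion, a standard way to merge ranked result lists from multiple retrieval channels. A generic sketch of the algorithm (not seeklink's code):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked result lists.
    Each document scores sum(1 / (k + rank)) across the lists it
    appears in; k=60 is the commonly used default constant."""
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "c" appears in every channel, so it rises to the top even though
# no single channel ranked it first.
fused = rrf_fuse([["a", "b", "c"], ["b", "c", "d"], ["a", "c"]])
```

RRF needs only ranks, not comparable scores, which is why it works well for fusing heterogeneous channels such as lexical and semantic search.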
manjunathshiva
turboquant-mlx-full

Extreme weight and KV cache compression for LLMs on Apple Silicon (MLX implementation of Google's TurboQuant)

3K 8 1
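Setting TurboQuant's specifics aside, the core idea behind weight and KV-cache compression is mapping floats to low-bit integers plus a shared scale. A toy symmetric 4-bit sketch of that idea (explicitly not TurboQuant's actual scheme):

```python
def quantize_int4(values):
    """Symmetric per-tensor 4-bit quantization: map each float to an
    integer in [-8, 7] using one shared scale. A toy illustration of
    the compression idea, not TurboQuant's algorithm."""
    scale = max(abs(v) for v in values) / 7 or 1.0  # avoid div-by-zero
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate floats; error is bounded by about scale / 2."""
    return [x * scale for x in q]

q, scale = quantize_int4([0.9, -0.3, 0.05, -1.4])
approx = dequantize_int4(q, scale)
```

Storing 4-bit codes plus one scale in place of 32-bit floats is what shrinks weights and KV caches enough to fit larger models in unified memory.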
DarshanFofadiya
sparsecore

Actually-sparse dynamic training for PyTorch. CPU-native, Apple Silicon first. Pluggable routers, drop-in SparseLinear.

3K 8 2
geeks-accelerator
ollama-herd

Local AI load balancer for Ollama fleets — auto-discovery, smart routing, OpenAI-compatible API, zero config. Perfect for Mac Minis & Studios.

3K 7 0
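The load-balancing idea behind a tool like this can be reduced to routing each request to the next backend in a fleet. A minimal round-robin sketch (ollama-herd itself adds auto-discovery and smarter routing; the hostnames below are made up):

```python
import itertools

class RoundRobinBalancer:
    """Cycle requests across a fleet of backend URLs. A minimal sketch
    of the load-balancing idea, not ollama-herd's implementation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        # Each call returns the next URL in the fleet, wrapping around.
        return next(self._cycle)

lb = RoundRobinBalancer(["http://mac-mini-1:11434", "http://mac-mini-2:11434"])
targets = [lb.next_backend() for _ in range(4)]
```

A real balancer would layer health checks and per-backend load on top, but the OpenAI-compatible front end means clients never see which machine served them.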
vikranthreddimasu
macfleet

Pool Apple Silicon Macs for distributed compute and ML training

2K 0 0
mordechaipotash
brain-mcp

Your AI has amnesia. Persistent memory and cognitive context for AI. 25 MCP tools. 12ms recall.

2K 44 13
dualform-labs
m5-infer

Extraordinary speed, extraordinary quality — an MLX-based inference engine for Apple Silicon.

2K 0 1
weklund
mlx-stack

CLI control plane for local LLM infrastructure on Apple Silicon

2K 4 0
AlphaWaveSystems
tqai

TurboQuant KV cache compression for local LLM inference

2K 1 0
szibis
mlx-flash

Run AI models too large for your Mac's memory — expert caching, speculative execution, and 15+ research techniques for MoE inference on Apple Silicon

1K 2 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery