135 dependents
| Package description | Downloads/month |
|---|---|
| MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VL... | 349K |
| A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library ... | 78K |
| An open experiment: does developer sentiment with Claude Code vary by time of da... | 72K |
| vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers M... | 36K |
| A high-performance API server that provides OpenAI-compatible endpoints for MLX ... | 22K |
| The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s ca... | 18K |
| Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT,... | 14K |
| Optimizing inference proxy for LLMs | 12K |
| vLLM-like inference for Apple Silicon - GPU-accelerated Text, Image, Video & Aud... | 10K |
| Core API and plugin system for LEANN | 7K |
| Mixed-precision quantization optimizer for MLX models on Apple Silicon | 6K |
| Persistent conversational memory for AI coding assistants | 5K |
| AlienSky optimized inference engine for Qwen on Apple Silicon | 4K |
| Asynchronous Self-Healing KV Cache for Silicon-Native LLMs by GDI Nexus | 3K |
| MLX Omni Server is a local inference server powered by Apple's MLX framework, sp... | 3K |
| This package implements all the logic of Brief My Press.AI | 3K |
| Lossless DFlash speculative decoding for MLX on Apple Silicon | 3K |
| Extreme weight and KV cache compression for LLMs on Apple Silicon (MLX implement... | 3K |
| Train LLMs on Apple silicon with MLX and the Hugging Face Hub | 3K |
| Coding Agent for Mac | 3K |
| Z-Vision Generator — cross-platform AI image and video generator | 2K |
| Qwen3 CustomVoice command-line tool | 2K |
| An LLM interface library. | 2K |
| Unified MLX server & CLI (language and vision) with OpenAI-compatible endpoints | 2K |
| Extraordinary speed, extraordinary quality — an MLX-based inference engine for A... | 2K |
| HuggingFace model management for MLX on Apple Silicon | 2K |
| Run AI models too large for your Mac's memory — expert caching, speculative exec... | 1K |
| Standalone MLX-based LLM inference service with OpenAI-compatible API | 1K |
| Python tools for text to speech (TTS), speech to text (STT), and speech to speec... | 1K |
| Run local LLMs from Python. LangChain-compatible. llama.cpp + MLX backends. | 1K |
| 3 AI models. 161B parameters. One Mac. 5.5GB. Full agentic pipeline on Apple Sil... | 1K |
| An optimized MLX (Apple Silicon Metal) server for running local LLMs with higher... | 1K |
| Offline meeting recorder & summarizer for macOS | 1K |
| Support for MLX models in LLM | 954 |
| dora-qwen | 902 |
| GPU-Accelerated LLM Terminal for Apple Silicon | 886 |
| Mechanistic interpretability on Apple Silicon: steering vectors, residual captur... | 862 |
| Pure MLX port of Baidu ERNIE-Image (8B text-to-image DiT) for Apple Silicon infe... | 841 |
| GenAI & agent toolkit for Apple Silicon Mac, implementing JSON schema-steered st... | 830 |
| ASIMOV MLX Module | 787 |
| VAD-driven streaming voice dictation for macOS — local Whisper ASR + Silero VAD ... | 683 |
| Train embedding models on MLX. | 678 |
| vLLM hardware plugin for Apple Silicon - unifies MLX and PyTorch under a single ... | 638 |
| A Retrieval-Augmented Generation (RAG) chat interface with support for multiple ... | 624 |
| Hanzo Network - Distributed AI compute network for running models locally and re... | 621 |
| MLX integration for the Agent Framework | 565 |
| LittleHive local-first multi-agent assistant foundation | 552 |
| For inferring and serving local LLMs using the MLX framework | 543 |
| A comprehensive toolkit for end-to-end continued pre-training, fine-tuning, moni... | 530 |
| llama-index llms mlx integration | 515 |