135 dependents
| Description | Downloads/month |
|---|---|
| MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VL... | 349K |
| A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library ... | 78K |
| An open experiment: does developer sentiment with Claude Code vary by time of da... | 72K |
| vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers M... | 36K |
| A high-performance API server that provides OpenAI-compatible endpoints for MLX ... | 22K |
| The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s ca... | 18K |
| Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT,... | 14K |
| Optimizing inference proxy for LLMs | 12K |
| vLLM-like inference for Apple Silicon - GPU-accelerated Text, Image, Video & Aud... | 10K |
| Core API and plugin system for LEANN | 7K |
| Mixed-precision quantization optimizer for MLX models on Apple Silicon | 6K |
| Persistent conversational memory for AI coding assistants | 5K |
| AlienSky optimized inference engine for Qwen on Apple Silicon | 4K |
| Asynchronous Self-Healing KV Cache for Silicon-Native LLMs by GDI Nexus | 3K |
| MLX Omni Server is a local inference server powered by Apple's MLX framework, sp... | 3K |
| This package implements all the logic of Brief My Press.AI | 3K |
| Lossless DFlash speculative decoding for MLX on Apple Silicon | 3K |
| Extreme weight and KV cache compression for LLMs on Apple Silicon (MLX implement... | 3K |
| Train LLMs on Apple silicon with MLX and the Hugging Face Hub | 3K |
| Coding Agent for Mac | 3K |
| Z-Vision Generator — cross-platform AI image and video generator | 2K |
| Qwen3 CustomVoice command-line tool | 2K |
| An LLM interface library | 2K |
| Unified MLX server & CLI (language and vision) with OpenAI-compatible endpoints | 2K |
| Extreme speed, extreme quality — an MLX-based inference engine for A... | 2K |
| HuggingFace model management for MLX on Apple Silicon | 2K |
| Run AI models too large for your Mac's memory — expert caching, speculative exec... | 1K |
| Standalone MLX-based LLM inference service with OpenAI-compatible API | 1K |
| Python tools for text to speech (TTS), speech to text (STT), and speech to speec... | 1K |
| Run local LLMs from Python. LangChain-compatible. llama.cpp + MLX backends. | 1K |
| 3 AI models. 161B parameters. One Mac. 5.5GB. Full agentic pipeline on Apple Sil... | 1K |
| An optimized MLX (Apple Silicon Metal) server for running local LLMs with higher... | 1K |
| Offline meeting recorder & summarizer for macOS | 1K |
| Support for MLX models in LLM | 954 |
| dora-qwen | 902 |
| GPU-Accelerated LLM Terminal for Apple Silicon | 886 |
| Mechanistic interpretability on Apple Silicon: steering vectors, residual captur... | 862 |
| Pure MLX port of Baidu ERNIE-Image (8B text-to-image DiT) for Apple Silicon infe... | 841 |
| GenAI & agent toolkit for Apple Silicon Mac, implementing JSON schema-steered st... | 830 |
| ASIMOV MLX Module | 787 |
| VAD-driven streaming voice dictation for macOS — local Whisper ASR + Silero VAD ... | 683 |
| Train embedding models on MLX | 678 |
| vLLM hardware plugin for Apple Silicon - unifies MLX and PyTorch under a single ... | 638 |
| A Retrieval-Augmented Generation (RAG) chat interface with support for multiple ... | 624 |
| Hanzo Network - Distributed AI compute network for running models locally and re... | 621 |
| MLX integration for the Agent Framework | 565 |
| LittleHive local-first multi-agent assistant foundation | 552 |
| For inferring and serving local LLMs using the MLX framework | 543 |
| A comprehensive toolkit for end-to-end continued pre-training, fine-tuning, moni... | 530 |
| llama-index LLMs MLX integration | 515 |