PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
open-mmlab
mmdet

OpenMMLab Detection Toolbox and Benchmark

430K 33K 10K
Blaizzy
mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

349K 5K 506
open-mmlab
mmcls

OpenMMLab Pre-training Toolbox and Benchmark

54K 4K 1K
open-mmlab
mmpretrain

OpenMMLab Pre-training Toolbox and Benchmark

22K 4K 1K
lukas-blecher
pix2tex

pix2tex: Using a ViT to convert images of equations into LaTeX code.

11K 16K 1K
emcf
thepipe-api

Get clean data from tricky documents, powered by vision-language models ⚡

3K 2K 99
towhee-io
towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

2K 3K 261
NVlabs
mambavision

[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone

2K 2K 139
NVlabs
fastervit

FasterViT: Fast Vision Transformers with Hierarchical Attention

2K 914 69
lygitdata
garmentiq

Free & Open Source. Precise and flexible garment measurements from images - no tape measures, no delays, just fashion-forward automation.

1K 20 4
alibaba
pai-easycv

An all-in-one toolkit for computer vision

1K 2K 225
veb-101
attention-and-transformers

Transformers goes brrr... Attention and Transformers from scratch in TensorFlow. Currently contains Vision transformers, MobileViT-v1, MobileViT-v2, MobileViT-v3

1K 14 2
kyegomez
clipq

A simple implementation of CLIP that splits an image into quadrants and then gets the embeddings for each quadrant

1K 7 1
towhee-io
towhee-models

Towhee is a framework that helps you encode your unstructured data into embeddings.

975 3K 261
mit-han-lab
efficientvit-gml

open-set object detector

974 3K 240
sovit-123
vision-transformers

Vision Transformers for image classification, image segmentation, and object detection.

825 67 9
martinsbruveris
tfimm

TensorFlow port of PyTorch Image Models (timm) - image models with pretrained weights.

766 291 25
fmegahed
conformal-clip

Few-shot CLIP classification with conformal prediction, probability calibration, and reliability metrics.

736 0 0
basaanithanaveenkumar
haloblocks

Python library designed to make model experimentation seamless and fast. The goal was simple: treat every component (attention heads, MLPs, MoE layers) as a plug-and-play block so you can focus on the architecture, not the boilerplate.

572 5 0
evanatyourservice
image-classification-jax

Image classification in JAX with ViT, resnet, cifar10, cifar100, imagenette, and imagenet

555 3 0
DavidLandup0
deepvision-toolkit

PyTorch and TensorFlow/Keras image models with automatic weight conversions and equal API/implementations - Vision Transformer (ViT), ResNetV2, EfficientNetV2, NeRF, SegFormer, MixTransformer, (planned...) DeepLabV3+, ConvNeXtV2, YOLO, etc.

523 42 7
zhongkaifu
seq2seqsharp

Seq2SeqSharp is a fast, flexible tensor-based deep neural network framework written in .NET (C#). Its features include automatic differentiation, multiple network types (Transformer, LSTM, BiLSTM, and so on), multi-GPU support, cross-platform operation (Windows, Linux, macOS), and multimodal models for text and images.

488 211 43
open-mmlab
mmdet-taeuk4958

OpenMMLab Detection Toolbox and Benchmark

455 33K 10K
TheoCoombes
clipcap

Using pretrained encoder and language models to generate captions from multimedia inputs.

447 100 14
    • Data from PyPI, GitHub, ClickHouse, and BigQuery