Multi-Modal Python Packages

modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

4M 9K 934

agentscope

Build and run agents you can see, understand and trust.

202K 25K 3K

docarray

Represent, send, store and search multimodal data

144K 3K 241

pyrhubarb

A Python framework for multi-modal document understanding with Amazon Bedrock

44K 103 14

medmnist

[pip install medmnist] 18x Standardized Datasets for 2D and 3D Biomedical Image Classification

39K 1K 207

pyvalhalla

Open Source Routing Engine for OpenStreetMap

6K 6K 888

dalle-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch

6K 6K 643

byaldi

Use late-interaction multi-modal models such as ColPali in just a few lines of code.

5K 847 93

cn-clip

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

4K 6K 552

pyvalhalla-weekly

Open Source Routing Engine for OpenStreetMap

4K 6K 888

qwen

My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't released model code yet sooo...

4K 12 2

py-data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

4K 6K 368

transfusion-pytorch

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

4K 1K 71

deepke

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction

3K 4K 742

brainles-preprocessing

Preprocessing tools for multi-modal 3D brain imaging

2K 31 8

avitai-artifex

A research-focused modular generative modeling library built on JAX/Flax NNX

1K 1 0

vision-llama

Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta

889 16 0

switch-transformers

Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"

755 139 17

move-dl

Multi-omics variational autoencoder

655 94 33

mm-poe

Multiple Choice Reasoning via Process of Elimination using Multi-Modal Models

512 1 1

hsss

Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling"

385 15 2

kosmosx

Transformers at zeta scales

369 70 11

rt2

Democratization of RT-2 "RT-2: New model translates vision and language into action"

333 566 68

mm1-torch

MM1 - PyTorch

325 27 1
