PyPI Stats

Data Augmentation Python Packages

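Python packages tagged with the GitHub topic data-augmentation, sorted by relevance. Each entry lists the GitHub owner, the package name, a short description, and a stats line that includes monthly downloads and GitHub stars.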
webdataset
webdataset

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

2M 3K 232
asteroid-team
torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

1.7M 1K 100
akiomik
pilgram

A Python library for Instagram filters

224K 131 17
iver56
audiomentations

A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.

172K 2K 219
TorchIO-project
torchio

Medical imaging processing for AI applications.

90K 2K 262
NVIDIA
nvidia-dali-cuda120

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

82K 6K 663
snorkel-team
snorkel

A system for quickly generating training data with weak supervision

78K 6K 854
bethgelab
imagecorruptions

Python package to corrupt arbitrary images.

25K 470 75
iver56
numpy-audio-limiter

Simple audio limiter. Made for use in audiomentations.

23K 8 0
iver56
fast-mp3-augment

Fast Python library for MP3 audio data augmentation (encode + decode for intentional audio quality degradation). Made for use in audiomentations.

21K 6 0
QData
textattack

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/

17K 3K 445
albumentations-team
albumentationsx

Next-generation Albumentations: dual-licensed for open-source and commercial use

16K 315 28
zoj613
polyagamma

An efficient and flexible sampler of the Pólya-Gamma distribution with a NumPy/SciPy compatible interface.

14K 27 5
visualdatabase
fastdup

fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.

10K 2K 87
sparkfish
augraphy

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

6K 527 60
webdataset
wids

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

5K 3K 232
NVIDIA
nvidia-dali-nightly-cuda120

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

5K 6K 663
DeepTrackAI
deeptrack

DeepTrack2 is a modular Python library for generating, manipulating, and analyzing image data pipelines for machine learning and experimental imaging.

5K 236 61
gvtulder
elasticdeform

Differentiable elastic deformations for N-dimensional images (Python, SciPy, NumPy, TensorFlow, PyTorch).

4K 195 26
vkit-x
vkit-nightly

Boosting Document Intelligence

4K 23 1
NVIDIA
nvidia-dali-nightly-cuda110

NVIDIA DALI nightly for CUDA 11.0. Git SHA: 2be08c56f2be9ec8055256256039eb534ab7a080

4K 6K 663
NVIDIA
nvidia-dali-cuda110

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

3K 6K 663
NVIDIA
nvidia-dali-tf-plugin-nightly-cuda120

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

3K 6K 663
NVIDIA
nvidia-dali-cuda130

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

3K 6K 663
Data from PyPI, GitHub, ClickHouse, and BigQuery