PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
WenjieDu
tsdb

A Python toolbox that loads 173 public time series datasets for machine/deep learning with a single line of code. Datasets span multiple domains, including healthcare, finance, power, traffic, and weather.

127K 234 23
WenjieDu
pypots

A Python toolkit/library for reality-centric machine/deep learning & data mining on partially-observed time series, with 50+ SOTA neural network models for scientific analysis tasks (imputation, classification, clustering, forecasting, anomaly detection, cleaning) on incomplete industrial irregularly-sampled multivariate TS with NaN missing values

122K 2K 184
WenjieDu
pygrinder

PyGrinder: a Python toolkit for grinding data beans into the incomplete for real-world data simulation by introducing missing values with different missingness patterns, including MCAR (missing completely at random), MAR (missing at random), MNAR (missing not at random), subsequence missing, and block missing

122K 65 6
hammerlab
knnimpute

Python implementations of kNN imputation

88K 31 14
dvgodoy
handyspark

HandySpark - bringing pandas-like capabilities to Spark dataframes

12K 200 27
moment-timeseries-foundation-model
momentfm

MOMENT: A Family of Open Time-series Foundation Models, ICML'24

7K 756 107
UdayLab
geoanalytics

This software is being developed at the University of Aizu, Aizu-Wakamatsu, Fukushima, Japan

5K 5 35
david-cortes
isotree

(Python, R, C/C++) Isolation Forest and variations such as SCiForest and EIF, with some additions (outlier detection + similarity + NA imputation)

5K 230 40
eltonlaw
impyute

Cross-sectional and time-series data imputation algorithms

3K 361 49
eXascaleInfolab
imputegap

A Library of Imputation Techniques for Time Series Data

3K 64 14
Ashford-A
univi

UniVI is a scalable multi-modal VAE toolkit for aligning heterogeneous single-cell datasets into a shared latent space—supporting unimodal, dual-modal, and tri-modal (and beyond) integration. It can additionally be used for cross-modal imputation, data generation of biologically-relevant synthetic samples, data denoising, and structured evaluation.

1K 5 0
awslabs
datawig

Imputation of missing values in tables.

733 492 70
iyhaoo
disc

A highly scalable and accurate inference of gene expression and structure for single-cell transcriptomes using semi-supervised deep learning.

669 10 5
CyrilJl
datafiller

Data imputation

601 0 0
mcuntz
hesseflux

hesseflux: a Python library to process and post-process Eddy covariance data

417 11 4
DavideAltomare
rego

Automatic Time Series Forecasting and Missing Values Imputation

400 19 3
stonegor
ae-imputer

A Python package for missing data imputation via autoencoders.

286 2 0
WenjieDu
pycorruptor

PyGrinder: a Python toolkit for grinding data beans into the incomplete for real-world data simulation by introducing missing values with different missingness patterns, including MCAR (missing completely at random), MAR (missing at random), MNAR (missing not at random), subsequence missing, and block missing

268 65 6
lemma-osu
sknnr

scikit-learn compatible estimators for various kNN imputation methods

243 1 1
macarro
imputena

Package that allows both automated and customized treatment of missing values in datasets using Python.

230 10 2
raamana
missingdata

Missing data handling: visualize and impute

215 18 1
JoshWeiner
ml-impute

A package for synthetic data generation for imputation using single and multiple imputation methods.

209 4 0
WenjieDu
jackpots

A Python toolkit/library for reality-centric machine/deep learning & data mining on partially-observed time series, with 50+ SOTA neural network models for scientific analysis tasks (imputation, classification, clustering, forecasting, anomaly detection, cleaning) on incomplete industrial irregularly-sampled multivariate TS with NaN missing values

200 2K 184
calvinmccarter
utrees

Tabular data imputation and generation, with flexible modeling of quantitative features via hierarchical binning (TMLR, 2025)

193 16 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery