PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
mahmoud
glom

☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️

15.8M 2K 72
daq-tools
commons-codec

Data decoding, encoding, conversion, and translation utilities.

9K 2 2
bruin-data
bruin-sdk

Bruin Python SDK — eliminate boilerplate in Bruin Python assets

6K 5 0
ironmussa
optimuspyspark

Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

6K 2K 232
panodata
tikray

A compact data transformation engine.

5K 1 0
kmatarese
glide

Easy ETL

5K 17 2
productml
blurr-dev

Data aggregation pipeline for running real-time predictive models

4K 4 0
azukds
tubular

Python package implementing ML feature engineering and pre-processing for polars or pandas dataframes.

3K 100 27
scottroberts140
dsr-feature-eng-ml

Machine learning model evaluation and feature engineering framework with hyperparameter tuning, data balancing, and feature importance analysis.

2K 1 0
Org-EthereaLogic
etherealogic-aetheriaforge

Databricks-native intelligent data transformation engine — coherence-scored Bronze/Silver/Gold with entity resolution and temporal reconciliation in a single deployable product.

2K 1 0
benzsevern
goldenflow

Data transformation toolkit — standardize, reshape, and normalize messy data. Python & TypeScript. 83 transforms, zero-config mode, MCP server, edge-safe. DQBench 100/100.

1K 1 0
globaldothealth
adtl

Another data transformation language

1K 2 1
jhd3197
tukuy

A flexible data transformation library with a plugin system

1K 3 0
hi-primus
pyoptimus

Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

1K 2K 232
Cydra-Tech
smelt-ai

LLM-powered structured data transformation. Batch process rows through any LLM, get back strictly typed Pydantic models.

965 2 0
MatheusGiacomo
dataforge-dfg

Data Forge is a high-performance, CLI-first data integration tool designed to streamline the lifecycle of data from ingestion to transformation. Built with Python, it provides a robust framework for handling both ETL and ELT workflows with a focus on automation, reliability, and developer experience.

753 1 0
ityutin
df-and-order

Using df-and-order, your interactions with dataframes become very clean and predictable.

398 3 2
mikeAdamss
tidychef

Python framework for transforming tabulated data with visual relationships into tidy data

375 1 1
artemlops
customer-segmentation-toolkit

Data transformations for the Engineering Lab2 Feature-Store-for-ML

359 1 0
productml
blurr

Data aggregation pipeline for running real-time predictive models

353 4 0
enram
vptstools

Python library to transfer and convert vertical profile time series data

314 4 1
brotherzhafif
pythistic

Frequency Table Conversion, Descriptive Statistics and Data Transformation Calculation Tool in Python

194 3 0
amadou-6e
pymdt2json

pymdt2json is a Python CLI and library for converting markdown tables into structured JSON: ideal for data pipelines, LLM preprocessing, and web/API integration.

183 1 0
bagher
fast-resource

fast-resource is a data transformation layer that sits between the database and the application's users, enabling quick data retrieval. It further enhances performance by caching data using Redis and Memcached.

119 10 1
Data from PyPI, GitHub, ClickHouse, and BigQuery