PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
great-expectations
great-expectations

Always know what to expect from your data.

31.4M 11K 2K
datafold
collate-data-diff

Compare tables within or across databases

934K 3K 305
great-expectations
great-expectations-experimental

Always know what to expect from your data.

573K 11K 2K
great-expectations
acryl-great-expectations

Always know what to expect from your data.

410K 11K 2K
open-metadata
openmetadata-ingestion

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

394K 14K 2K
AltimateAI
altimate-datapilot-cli

Datapilot-cli is the command line interface for accessing the AI teammate for engineers to ensure best practices in their SQL and dbt projects.

116K 40 1
canimus
cuallee

Possibly the fastest DataFrame-agnostic quality check library in town.

106K 243 22
datafold
data-diff

Compare tables within or across databases

56K 3K 305
open-metadata
openmetadata-managed-apis

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

39K 14K 2K
AutoViML
pandas-dq

Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.

14K 137 15
IBM
lale

Library for Semi-Automated Data Science

10K 346 83
re-data
re-data

re_data - fix data issues before your users & CEO would discover them 😊

6K 2K 125
zinggAI
zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

4K 1K 165
jabardigitalservice
datasae

Data Quality Framework provided by Jabar Digital Service

4K 5 1
DataKitchen
dataops-testgen

DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution through data profiling, new dataset hygiene review, AI generation of data quality validation tests, ongoing testing of data refreshes, and continuous anomaly monitoring.

3K 73 6
open-metadata
openmetadata-airflow-managed-apis

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

3K 14K 2K
dima-ischenko
xoverrr

Data quality library

2K 3 2
MigoXLab
dingo-python

Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool

2K 691 71
open-metadata
openmetadata-ingestion-core

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

1K 14K 2K
datachecks
dcs-core

Open Source Data Quality Monitoring.

1K 170 23
andrjas
data-check

data and pipeline testing with and for SQL

879 5 0
nilotpaldhar2004
datadiagnose

Dataset Auto-Diagnosis Python Library — detect and fix data quality issues (leakage, skewness, outliers, imbalance) before model training.

810 1 0
Data-Culpa
dataculpa-client

Open source clients for working with Data Culpa Validator services from data pipelines

786 9 1
Delpha-Assistant
delpha-mcp

Delpha Data Quality MCP Server: Data quality assessment for MCP-compatible tools.

700 2 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery