PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
great-expectations
great-expectations

Always know what to expect from your data.

31.4M 11K 2K
databrickslabs
databricks-labs-dqx

Databricks framework to validate Data Quality of pySpark DataFrames and Tables

5.1M 405 111
ydataai
ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

1.9M 14K 2K
evidentlyai
evidently

Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.

1.2M 7K 836
treeverse
lakefs-sdk

lakeFS - Data version control for your data lake | Git for data

1.1M 5K 446
datafold
collate-data-diff

Compare tables within or across databases

934K 3K 305
treeverse
lakefs

lakeFS - Data version control for your data lake | Git for data

920K 5K 446
feast-dev
feast

The Open Source Feature Store for AI/ML

774K 7K 1K
ydataai
pandas-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

614K 14K 2K
great-expectations
great-expectations-experimental

Always know what to expect from your data.

573K 11K 2K
great-expectations
acryl-great-expectations

Always know what to expect from your data.

410K 11K 2K
open-metadata
openmetadata-ingestion

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

394K 14K 2K
dylan-profiler
tangled-up-in-unicode

Access to the Unicode Character Database (UCD)

369K 3 6
treeverse
lakefs-client

lakeFS - Data version control for your data lake | Git for data

211K 5K 446
great-expectations
airflow-provider-great-expectations

Great Expectations Airflow operator

191K 172 57
voxel51
fiftyone

Refine high-quality datasets and visual AI models

179K 11K 752
voxel51
fiftyone-db

Refine high-quality datasets and visual AI models

169K 11K 752
polyaxon
traceml

Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.

142K 530 47
polyaxon
datatile

Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.

113K 530 47
mouradmourafiq
pandas-summary

Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.

113K 530 47
canimus
cuallee

Possibly the fastest DataFrame-agnostic quality check library in town.

106K 243 22
cleanlab
cleanlab

Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

58K 11K 890
datafold
data-diff

Compare tables within or across databases

56K 3K 305
posit-dev
pointblank

Data validation toolkit for assessing and monitoring data quality.

52K 430 27
    • Data from PyPI, GitHub, ClickHouse, and BigQuery