PyPI Stats

Data Exploration Python Packages

Python packages with the GitHub topic data-exploration. Sorted by relevance, with stars and monthly downloads.
ydataai
ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

1.9M 14K 2K
ydataai
pandas-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

603K 14K 2K
Kanaries
pygwalker

PyGWalker: Turn your dataframe into an interactive UI for visual analysis

331K 16K 865
polyaxon
traceml

Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.

139K 530 47
fbdesignpro
sweetviz

Visualize and compare datasets, target values and associations, with one line of code.

123K 3K 288
mouradmourafiq
pandas-summary

Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.

111K 530 47
polyaxon
datatile

Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.

111K 530 47
InfuseAI
piperider-nightly

Code review for data in dbt

22K 494 23
panel-extensions
panel-graphic-walker

A project providing a Graphic Walker Pane for use with HoloViz Panel.

14K 352 14
sfu-db
dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.

11K 2K 221
cleanlab
cleanvision

Automatically find issues in image datasets and practice data-centric computer vision.

10K 1K 80
copyleftdev
x12-edi-tools

A comprehensive set of tools for working with X12 EDI files

6K 25 6
ironmussa
optimuspyspark

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

6K 2K 232
InfuseAI
piperider

Code review for data in dbt

5K 494 23
Data-Centric-AI-Community
fg-data-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

5K 14K 2K
desbordante
desbordante

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It can also run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

3K 477 99
debiai
debiai-gui

Bias detection and contextual evaluation tool for your AI projects

3K 30 5
abhayspawar
featexp

Feature exploration for supervised learning

3K 759 159
comet-ml
kangas

🦘 Explore multimedia datasets at scale

2K 1K 50
grafana-toolbox
grafana-wtf

Grep through all Grafana entities in the spirit of git-wtf.

2K 219 22
Renumics
sliceguard

A library for detecting problematic data segments in structured and unstructured data with a few lines of code.

1K 63 3
tvdboom
atom-ml

Automated Tool for Optimized Modelling

1K 164 15
hi-primus
pyoptimus

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

1K 2K 232
eikevons
pandas-paddles

Access the parent Pandas data frame in loc[], iloc[], assign(), and other Pandas helpers

1K 5 0
Data from PyPI, GitHub, ClickHouse, and BigQuery