PyPI Stats

Data-Centric AI Python Packages

Python packages with the GitHub topic data-centric-ai. Sorted by relevance, with stars and monthly downloads.
voxel51
fiftyone

Refine high-quality datasets and visual AI models

178K 11K 752
voxel51
fiftyone-db

Refine high-quality datasets and visual AI models

171K 11K 752
cleanlab
cleanlab

Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

58K 11K 890
cleanlab
cleanvision

Automatically find issues in image datasets and practice data-centric computer vision.

10K 1K 80
cleanlab
cleanlab-studio

Client interface to Cleanlab Studio

4K 31 10
voxel51
fiftyone-db-ubuntu2204

Refine high-quality datasets and visual AI models

3K 11K 752
Digital-Dermatology
selfclean

[NeurIPS 2024] 🧼🔎 A holistic self-supervised data cleaning strategy to detect irrelevant samples, near duplicates and label errors.

2K 37 2
voxel51
fiftyone-desktop

FiftyOne Desktop

1K 11K 752
aai-institute
pydvl

The Python Data Valuation Library

696 145 10
opendataval
opendataval

OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023)

533 100 11
voxel51
fiftyone-db-ubuntu2004

Refine high-quality datasets and visual AI models

512 11K 752
cleanlab
cleanlab-cli

Client interface to Cleanlab Studio

501 31 10
mdbloice
labeller

Quickly set up an image labelling web application for manually tagging images for machine learning tasks.

457 9 2
Hyper3Labs
hyperview

HyperView curates datasets and provides model introspection in hyperbolic and Euclidean geometries.

356 17 2
cleanlab
example-package-elisno

The standard package for data-centric AI, machine learning with label errors, and automatically finding and fixing dataset issues in Python.

277 11K 890
Docta-ai
docta-ai

Docta.ai

215 3K 256
ear-team
bambird

Unsupervised classification to improve the quality of a bird song recording dataset. https://doi.org/10.1016/j.ecoinf.2022.101952

212 31 7
code-kern-ai
kern-refinery

The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.

201 1K 73
voxel51
fiftyone-db-debian9

FiftyOne DB

166 11K 752
code-kern-ai
refinery-python-sdk

Official Python SDK for Kern AI refinery.

158 20 3
JieyuZ2
ws-benchmark

[NeurIPS 2021] WRENCH: Weak supeRvision bENCHmark

154 227 34
voxel51
fiftyone-db-ubuntu1604

Project FiftyOne database

131 11K 752
voxel51
fiftyone-db-rhel7

Refine high-quality datasets and visual AI models

121 11K 752
code-kern-ai
kern-python-client

Official Python SDK for Kern AI refinery.

1 20 3
Data from PyPI, GitHub, ClickHouse, and BigQuery