PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics.
lk-geimfari
mimesis

Mimesis is a fast Python library for generating fake data in multiple languages.

1.9M 5K 359
pgmpy
pgmpy

Python Toolkit for Causal and Probabilistic Reasoning

448K 3K 1K
databrickslabs
dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.

274K 460 93
sdv-dev
copulas

A library to model multivariate data using copulas.

203K 643 120
sdv-dev
sdmetrics

Metrics to evaluate quality and efficacy of synthetic datasets.

145K 258 52
sdv-dev
sdv

Synthetic data generation for tabular data

141K 3K 417
sdv-dev
ctgan

Conditional GAN for generating synthetic tabular data.

135K 2K 330
sdv-dev
deepecho

Synthetic Data Generation for mixed-type, multivariate time series.

118K 123 17
unrealcv
unrealcv

UnrealCV: Connecting Computer Vision to Unreal Engine

63K 2K 460
bespokelabsai
bespokelabs-curator

Synthetic data curation for post-training and structured data extraction

42K 2K 140
barseghyanartur
faker-file

Create files with fake data. In many formats. With no effort.

15K 104 10
tdspora
syngen

Open-source version of the TDspora synthetic data generation algorithm.

14K 18 12
nickkunz
smogn

Synthetic Minority Over-Sampling Technique for Regression

13K 348 84
privateai
privateai-client

A Python client for interacting with the Private AI API.

11K 23 3
gretelai
gretel-client

The Gretel Python Client allows you to interact with the Gretel REST API.

11K 63 20
ydataai
ydata-synthetic

Synthetic data generators for tabular and time-series data

9K 2K 260
opendsr-std
seedfaker

Deterministic synthetic data generator for realistic, correlated, and noisy test records across 68 locales. Rust CLI/Python/Node.js/Browser WASM/Go/PHP/Ruby/MCP

9K 23 0
mostly-ai
mostlyai

Synthetic Data SDK ✨

9K 769 64
avsolatorio
realtabformer

A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.

9K 243 30
gretelai
gretel-synthetics

Synthetic data generators for structured and unstructured text, featuring differentially private learning.

7K 677 100
mostly-ai
mostlyai-engine

Synthetic Data Engine 💎

7K 76 19
mostly-ai
mostlyai-qa

Synthetic Data Quality Assurance 🔎

7K 66 13
lightning-rod-labs
lightningrod-ai

Python SDK for dataset generation on LightningRod platform ⚡

6K 44 3
sparkfish
augraphy

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

6K 527 60
Data from PyPI, GitHub, ClickHouse, and BigQuery