PyPI Stats

Synthetic Dataset Generation Python Packages

Python packages with the GitHub topic synthetic-dataset-generation. Sorted by relevance, with stars and monthly downloads.
openlayer-ai
openlayer

The official Python library for Openlayer, the Continuous Model Improvement Platform for AI. 📈

182K 16 2
bespokelabsai
bespokelabs-curator

Synthetic data curation for post-training and structured data extraction

46K 2K 140
avsolatorio
realtabformer

A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.

9K 243 30
Red-Hat-AI-Innovation-Team
sdg-hub

Synthetic Data Generation Toolkit for LLMs

9K 132 53
sparkfish
augraphy

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

6K 527 60
inductiva
inductiva

Large scale simulations made simple.

6K 41 8
tabularis-ai
be-great

A novel approach for synthesizing tabular data using pretrained large language models

5K 355 59
firstbatchxyz
dria

Dria SDK is for building and executing synthetic data generation pipelines on Dria Knowledge Network.

4K 28 8
kontextox
datasety

CLI tool for dataset preparation: resize, align, caption, shuffle, synthetic, and mask generation.

3K 2 0
rasinmuhammed
misata

Python synthetic data generator for realistic multi-table test data, database seeding, and scenario simulation

2K 54 3
datadreamer-dev
datadreamer-dev

DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

2K 1K 59
AlejandroBeldaFernandez
calm-data-generator

CALM-Data-Generator is a comprehensive Python library for synthetic data generation with advanced features

2K 4 1
SidharthMacherla
unreal

A Synthetic data generator.

1K 1 1
RajeevAtla
supercongan

Training a GAN using superconductivity data

1K 5 0
Clearbox-AI
clearbox-synthetic-kit

Clearbox AI's all-in-one solution for generation and evaluation of synthetic tabular and time-series data.

1K 44 1
georgeoshardo
symbac

Accurate segmentation of bacterial microscope images using deep learning synthetically generated image data.

887 22 11
clugen
pyclugen

Multidimensional cluster generation in Python

724 10 0
alfurka
synloc

A Python package to create synthetic data from locally estimated distributions

715 3 0
diffix
syndiffix

Python implementation of the SynDiffix synthetic data generation mechanism.

440 12 2
AmanPriyanshu
dpsdv

Creating a Differential Privacy securing Synthetic Data Generation for tabular, relational and time series data.

420 9 2
laurawpaaby
educhateval

A pipeline and package to implement and evaluate LLM chat bot tutors in education.

413 1 0
iteal
wormpose

WormPose: Image synthesis and convolutional networks for pose estimation in C. elegans

406 55 19
Mohammed-Almekhlafi
historical2realtime

Transform static, tabular historical data into a live, real-time synthetic data stream.

399 1 0
unboxai
unboxapi

The official Python library for Openlayer, the Continuous Model Improvement Platform for AI. 📈

387 16 2
Data from PyPI, GitHub, ClickHouse, and BigQuery