PyPI Stats

Structured Data Python Packages

Python packages with the GitHub topic structured-data. Sorted by relevance, with stars and monthly downloads.
promplate
partial-json-parser

Parse partial JSON generated by LLM

6.7M 127 9
autogluon
autogluon-core

Fast and Accurate ML in 3 Lines of Code

1.3M 10K 1K
autogluon
autogluon-features

Fast and Accurate ML in 3 Lines of Code

1.2M 10K 1K
autogluon
autogluon-tabular

Fast and Accurate ML in 3 Lines of Code

1.2M 10K 1K
autogluon
autogluon-common

Fast and Accurate ML in 3 Lines of Code

1.1M 10K 1K
autogluon
autogluon

Fast and Accurate ML in 3 Lines of Code

994K 10K 1K
autogluon
autogluon-timeseries

Fast and Accurate ML in 3 Lines of Code

865K 10K 1K
autogluon
autogluon-multimodal

Fast and Accurate ML in 3 Lines of Code

806K 10K 1K
BoundaryML
baml-py

The AI framework that adds the engineering to prompt engineering (Python/TS/Ruby/Java/C#/Rust/Go compatible)

413K 8K 415
lfoppiano
streamlit-pdf-viewer

Streamlit PDF viewer

262K 195 21
google
langextract

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

218K 36K 3K
autogluon
autogluon-vision

Fast and Accurate ML in 3 Lines of Code

122K 10K 1K
autogluon
autogluon-text

Fast and Accurate ML in 3 Lines of Code

105K 10K 1K
awslabs
autogluon-extra

Fast and Accurate ML in 3 Lines of Code

35K 10K 1K
awslabs
autogluon-mxnet

Fast and Accurate ML in 3 Lines of Code

31K 10K 1K
auriti-labs
geo-optimizer-skill

GEO (Generative Engine Optimization) toolkit — audit, optimize, and make websites visible to AI search engines (ChatGPT, Perplexity, Claude, Gemini). Based on Princeton KDD 2024 research.

10K 369 40
AIMLPM
markcrawl

Fast Python web crawler for RAG and AI ingestion. Extracts clean Markdown from any site for LLMs and vector stores.

8K 2 0
autogluon
autogluon-eda

Fast and Accurate ML in 3 Lines of Code

7K 10K 1K
harumiWeb
exstruct

Conversion from Excel to structured JSON (tables, shapes, charts) for LLM/RAG pipelines, and autonomous Excel reading/writing by AI agents via CLI and MCP integration.

5K 141 22
cleanlab
cleanlab-studio

Client interface to Cleanlab Studio

4K 31 10
argrelay
argrelay

A data server to CLI tools with attribute search & Tab-completion in Bash shell

3K 22 1
amphi-ai
jupyterlab-amphi

Visual data prep powered by Python

3K 1K 106
emcf
thepipe-api

Get clean data from tricky documents, powered by vision-language models ⚡

3K 2K 99
nanonets
docstrange

Extract and convert data from any document, image, PDF, Word doc, PPT, or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.

2K 1K 129
Data from PyPI, GitHub, ClickHouse, and BigQuery