PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter them by metrics
deepdoctection
deepdoctection

A Repo For Document AI

8K 3K 191
harumiWeb
exstruct

Conversion from Excel to structured JSON (tables, shapes, charts) for LLM/RAG pipelines, and autonomous Excel reading/writing by AI agents via CLI and MCP integration.

6K 141 22
nanonets
nanoindex

Agentic RAG Harness for long documents, Tree and Graph based reasoning. Cited answers down to the pixel

5K 49 5
deepdoctection
dd-core

A Repo For Document AI

3K 3K 191
tiroq
mdify-cli

MDify is a document-to-Markdown conversion library for extracting structured content from complex PDFs and document images, including tables, charts, and scanned documents.

3K 0 0
deepdoctection
dd-datasets

A Repo For Document AI

2K 3K 191
clovaai
donut-python

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

1K 7K 560
OpenDCAI
flash-mineru

Fast Inference Architecture for MinerU

793 49 7
fahmiaziz98
docvision

Production-ready document parsing with Vision Language Models

489 1 0
gregorymulla
grepctl

BigQuery Semantic Search Orchestrator

464 4 0
Keyvanhardani
german-ocr

High-performance German document OCR - Local & Cloud with GPU/CPU support

439 94 6
ChrBoebel
optical-context-mcp

MCP server that compresses OCR-heavy PDFs into dense packed images so AI agents can handle long visual documents

248 1 0
GramosoftAI
gdoczai

GDocz by Gramosoft is an open-source Intelligent Document Processing platform that turns raw PDFs and images into clean, structured JSON — powered by multi-engine OCR and AI-driven schema extraction.

207 6 1
harumiWeb
iflow-mcp-harumiweb-exstruct

Excel to structured JSON (tables, shapes, charts) for LLM/RAG pipelines

86 141 22
fahmiaziz98
doc-vision-parser

Python library for intelligent document parsing using Vision Language Models. Extract structured text and markdown from PDFs and images with self-correcting AI workflows. Supports OpenAI-compatible APIs.

5 1 0
Data from PyPI, GitHub, ClickHouse, and BigQuery