PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
jshwi
docsig

Check Python signature params for proper documentation

380K 42 3
Byaidu
pdf2zh

[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero

17K 33K 3K
fcakyon
craft-text-detector

This repo is deprecated. Please refer to new up-to-date repo: https://github.com/fcakyon/craft-text-detector

13K 12 1
stencila
stencila

Python SDK for Stencila

7K 878 56
sunholo-data
ailang-parse

Universal document parsing and generation in AILANG. Deterministic Office (DOCX/PPTX/XLSX) extraction, AI-powered PDF/image parsing, 9-format document generation.

6K 0 0
farfarfun
funread

Document reading and parsing toolkit - supports reading and parsing multiple document formats

4K 1 0
moskize91
doc-page-extractor

Document page extraction tool powered by DeepSeek-OCR.

4K 13 7
oomol-lab
pdf-craft

PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.

4K 5K 365
henrihapponen
docxedit

Edit Word (.docx) documents effortlessly without changing the original formatting.

4K 23 3
emcf
thepipe-api

Get clean data from tricky documents, powered by vision-language models ⚡

3K 2K 99
retospect
precis-summary

Fast extractive summarization via RAKE keyword extraction

2K 0 0
Michael-A-Kuykendall
contextlite

Database Freedom Platform - Mathematical search optimization for whatever database you already have. 27,000x faster than vector databases with SMT-powered search across 8+ database types. One-time 9-2999 vs 00-500/month recurring.

2K 17 5
osllmai
indox

The Indox Ecosystem offers integrated AI tools for data workflows. Our four components (IndoxArcg, IndoxMiner, IndoxJudge, and IndoxGen) enhance AI applications with advanced retrieval, extraction, evaluation, and generation capabilities, supporting multiple document formats and LLM providers.

2K 19 2
DavidSchobl
edof

Python library for programmatic document creation, template filling and export (.edof format)

2K 0 0
mortensi
jsondocstore

A Python library to store, read, and search JSON documents on disk. Runs in the application process.

2K 2 0
pstwh
docuwarp

Unwarp documents

1K 7 0
Sinapsis-AI
sinapsis-langchain-readers

Package with sinapsis templates to support langchain functionality

1K 27 7
rossumai
docile-benchmark

DocILE: Document Information Localization and Extraction Benchmark

1K 146 12
Sinapsis-AI
sinapsis-langchain

Package with sinapsis templates to support langchain functionality

1K 27 7
kevin-leptons
fx-doc

A reStructuredText builder

1K 1 0
romanin-rf
rsv

A module for reading and writing an RSV document file.

1K 2 0
klich3
rocket-store

Using the filesystem as a searchable database.

882 1 0
osllmai
indoxjudge

Indox Judge

880 19 2
yushulx
document-scanner-sdk

Python document detection SDK built with Dynamsoft Document Normalizer for Windows and Linux

869 2 1
    • Data from PyPI, GitHub, ClickHouse, and BigQuery