PyPI Stats

Docx Python Packages

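Python packages tagged with the GitHub topic docx, sorted by relevance. Each entry lists the repository owner, the package name, a description, and a line of statistics that includes monthly downloads and GitHub stars.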
nolze
msoffcrypto-tool

Python tool and library for decrypting and encrypting MS Office files using passwords or other keys

9M 616 91
docling-project
docling

Get your documents ready for gen AI

6M 59K 4K
Unstructured-IO
unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

5.3M 15K 1K
pqzx
htmldocx

Convert html to docx

899K 87 59
docling-project
docling-slim

Get your documents ready for gen AI

294K 59K 4K
opendatalab
mineru

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

283K 62K 5K
dfop02
html-for-docx

Convert html to docx

279K 61 15
opendatalab
magic-pdf

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

76K 62K 5K
yobix-ai
extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

58K 2K 96
shloktech
md2docx-python

A simple and straightforward Python utility that converts a Microsoft Word document (`.docx`) to a Markdown file (`.md`) and vice versa. It supports multiple Markdown elements, including headings, bold and italic text, unordered and ordered lists, and more.

15K 47 5
Luizhcrs
template-engine-ia

Document normalization engine — learn a template from examples and convert any document automatically via LLM

10K 1 0
shcherbak-ai
contextgem

ContextGem: Effortless LLM extraction from documents

9K 2K 155
BramAlkema
openxml-audit

Validate Office files in pure Python with Open XML SDK parity, pytest fixtures, and CI hooks.

7K 1 0
u9401066
asset-aware-mcp

Asset-Aware MCP Server — AI Agent precisely accesses tables, figures, sections from PDFs + .docx round-trip editing (DFM) with 46 tools / 13 resources, segmentation export, layout overlay, OCR preprocessing, knowledge graph (LightRAG)

6K 0 0
PlateerLab
document-adapter

A unified adapter and MCP server that lets LLMs edit DOCX/PPTX/HWPX documents directly. Compatible with Claude Desktop, Claude Code, and Anthropic API Tool Use. pip install document-adapter

6K 0 0
ykarapazar
word-mcp-live

The only MCP server that edits Word documents while they're open — 114 tools, live editing, tracked changes, per-action undo

6K 64 17
badbye
docxpy

A pure-Python utility to extract text and images from docx files.

6K 5 4
explosion
spacy-layout

📚 Process PDFs, Word documents and more with spaCy

5K 894 64
sunholo-data
ailang-parse

Universal document parsing and generation in AILANG. Deterministic Office (DOCX/PPTX/XLSX) extraction, AI-powered PDF/image parsing, 9-format document generation.

5K 0 0
farfarfun
funread

Document reading and parsing toolkit: supports reading and parsing multiple document formats

4K 1 0
rocklambros
any2md

Convert PDF, DOCX, HTML, and TXT files — or web pages by URL — to clean, LLM-optimized Markdown with YAML frontmatter.

4K 15 2
opendatalab
mineru-selfhosted-mcp

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

4K 62K 5K
henrihapponen
docxedit

Edit Word (.docx) documents effortlessly without changing the original formatting.

3K 23 3
turulomio
unogenerator

Generate LibreOffice files programmatically with Python and LibreOffice server instances

3K 15 0
Data from PyPI, GitHub, ClickHouse, and BigQuery