18 dependents
| Description | Downloads/month |
|---|---|
| Structured text extraction framework for digital and scanned PDFs with inline fo... | 4K |
| AgentSociety 2 is a modern, LLM-native agent simulation platform designed for so... | 3K |
| OmniDocs📄 - One stop visual document processing framework | 537 |
| A tool for parsing PDF document layouts and chunking content | 504 |
| A Python library for extracting and analyzing content from any documents, suppor... | 491 |
| Docket Analyzer OCR Utility | 462 |
| A simple and efficient RAG (Retrieval-Augmented Generation) library with Knowled... | 448 |
| This package enables Retrieval-Augmented Generation (RAG) for PDF documents, enh... | 330 |
| A powerful tool to extract text, tables, charts, and formulas from documents and... | 329 |
| Crop image/table/code regions from PDF files and export metadata | 321 |
| A tool for parsing PDF document layouts and chunking content. | 264 |
| OCR tool for botanical documents using layout analysis and LLMs/OCR engines. | 196 |
| Using GPT to parse PDF files and generate LaTeX code. | 188 |
| A Comprehensive Toolkit for High-Quality PDF Content Extraction. | 176 |
| A practical tool for converting PDF to Markdown | 167 |
| DocRag: An advanced document search and retrieval system leveraging Retrieval-Au... | 153 |
| An AI companion for reading papers. | 132 |
| | 116 |