97 dependents
| Package | Description | Downloads/month |
|---|---|---|
| | Academia MCP server: Tools for automatic scientific research | 12K |
| | | 8K |
| | A Python library for processing receipts, extracting key information, and assist... | 8K |
| | | 7K |
| | | 4K |
| | Structured text extraction framework for digital and scanned PDFs with inline fo... | 4K |
| | Modular media quality metrics toolkit. | 3K |
| | Agent S: an open agentic framework that uses computers like a human | 3K |
| | | 3K |
| | Query public data sources worldwide through a unified CLI and REST API | 3K |
| | DocumentAI-std is a Python library designed to facilitate and standardize docume... | 2K |
| | | 2K |
| | Fast PaddleOCR MCP server - Extract text from images using PaddleOCR with optimi... | 2K |
| | Ingest sources with proper citation: PDF, URL, media, Office, DJVU | 1K |
| | Natural-language-based, cross-platform, cross-framework BDD UI automation testing solution. BDD testing, Python style, Present by Trip Flight | 1K |
| | | 1K |
| | A library for electronic Know Your Customer (eKYC) verification | 1K |
| | Video Archive AI analysis tool | 1K |
| | Huggingface bolts for geniusrise | 1K |
| | Collection of Taiwan Rental House Data from Public Website | 1K |
| | Parse, extract, and analyze documents with ease | 1K |
| | Blocks group members from sending advertising content; the filtered content is configurable | 1K |
| | | 963 |
| | Extract text and information from PDF files | 929 |
| | An AI assistant powered by Llama models | 913 |
| | A quote library plugin for QQ group chats | 900 |
| | Convert the model in PaddleOCR to ONNX format | 819 |
| | FlexiData is an open-source Python package designed for processing unstructured ... | 720 |
| | | 718 |
| | Plugins to enable usage of PaddleOCR in ocr_translate | 716 |
| | Exploit computer vision technology with Orange Data Mining! | 642 |
| | 🔥 Address parsing and recognition, Python version | 607 |
| | Actscene OCR: a comprehensive OCR pipeline for Japanese documents (PaddleOCR-based) | 601 |
| | Python package related to airclick | 569 |
| | A fast automatic number-plate recognition (ANPR) library | 562 |
| | Cross-platform UI automation framework, suitable for hybrid apps | 554 |
| | OCR, Archive, Index and Search: Implementation agnostic OCR framework. | 509 |
| | Privision is a powerful video content redaction tool that uses advanced OCR to automatically detect and mask sensitive information in videos. It supports multiple detection modes, including phone numbers, ID card numbers, and custom keywords, and provides... | 469 |
| | Comic-Focused Hybrid OCR Library, made in Python | 436 |
| | PaddleOCR engine plugin for OCRmyPDF | 432 |
| | Using LLM to parse PDF and get better chunk for retrieval | 428 |
| | A tool to classify images | 414 |
| | A modular QSR Order Verification Python Package | 381 |
| | Extracts citations from PDF, URLs and local media files in CSL-JSON. | 369 |
| | Use json5 for view-based workflows configuration | 340 |
| | A powerful tool to extract text, tables, charts, and formulas from documents and... | 329 |
| | A robust MRZ extraction and validation engine library designed for real-world K... | 328 |
| | rasa_contrib is an addon package for rasa. It provides some useful/powerful additi... | 317 |
| | Deterministic generator of document identifiers with OCR | 313 |
| | Local-first Python RAG pipeline with sentence-transformer embeddings, FAISS/BM25... | 307 |