53 dependents
Package Description Downloads/month
Open WebUI 1.3M
Yet Another Document Translator 68K
Your AI second brain. Self-hostable. Get answers from the web or your docs. Buil... 61K
MCP server for Google search and page fetching using headless Chromium 4K
Structured text extraction framework for digital and scanned PDFs with inline fo... 4K
MCP server that provides computer control capabilities, like mouse, keyboard, OC... 4K
Computer vision, OCR, and input automation toolkit 4K
Kotones Auto Assistant (kaa) is a script for the game 'Gakuen Idol M@ster' that autom... 3K
Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like... 2K
Kingsoft Cloud Agent Development Kit - supports LangChain/LangGraph/DeepAgents/ADK/Ope... 2K
Kotonebot is a game automation library based on computer vision technology, works ... 2K
AI Translation Framework - CLI tool for batch translation. 1K
End-to-end Optical Music Recognition (OMR) system built on top of vision transfo... 1K
A professional spider tools package 1K
Video translation tool, specializing in translating non-Chinese videos into Chinese videos 1K
DISK (Domain Incremental conStruction of Knowledge graph) - A tool for distillin... 944
Question answering for local knowledge bases with exact source citations. 933
AI-native programming language for deterministic, inspectable applications. 884
FlexiData is an open-source Python package designed for processing unstructured ... 720
High-performance local OCR skill based on RapidOCR 702
Enhanced MCP server for computer control: mouse, keyboard, screenshots, OCR, UI ... 687
625
Automatically redacts sensitive data in screenshots before sending to AI agents 594
The web version of RapidOCR 582
DocTranslater — PDF translation with layout preservation and multi-provider LLM ... 567
Production-ready document parsing with Vision Language Models 489
Auto-compress PDFs and screenshots to Markdown before they hit Claude Code's con... 432
HUI2 is an advanced Android automation library that combines the power of uiauto... 288
286
Add your description here 240
AI browser agent that costs 50x less. DOM + OCR + Memory. Works with any LLM. 206
RAGFast is an LLM framework for building RAG applications with ease. 200
A comprehensive text extraction tool supporting multiple file formats 150
A CLI-based hard fork of AiNiee for batch translation. 146
A versatile OCR and document processing command-line tool. 142
BabelDOC fork with enhanced UI integration support (adds --log-progress) 139
A simple OCR SDK based on RapidOCR and Flask 139
OCR-powered Python library to auto-generate personalized certificates from templ... 123
Tools of extracting PDF content based on RapidOCR 114
This patch includes several improvements to RAGs, summarizers, downloaders and ret... 114
LOOP Chat 105
MCP server that provides computer control capabilities, like mouse, keyboard, OC... 97
Smart note-taking assistant CLI 94
MCP server that provides computer control capabilities, like mouse, keyboard, OC... 84
MCP server that provides computer control capabilities, like mouse, keyboard, OC... 76
Open WebUI steve 73
A screenshot tool that automatically copies extracted text to clipboard 71
Convert GIS WMTS tile images to polygons and points using computer vision 68
Youzan document processing SDK - supports multi-format document loading and splitting 61
A package for Pensieve (previously known as memos) 57