13 dependents
Package Description Downloads/month
Extract embedded metadata from HTML markup 341K
Remove DIVs, style stuff and normalize HTML preserving structure information 19K
Parsing of data from web pages. 19K
6K
mini-framework 2K
headless implementation for the Reasonify Agent 1K
Sync transactions into your YNAB Unlinked account from a bank generated supporte... 1K
Document parsing tool for LLM training and RAG 327
audit which email addresses can be collected by bots from your sites. 185
TrustRAG: The RAG Framework within Reliable input, Trusted output 121
A Python library dedicated to PDF parsing, extracted from the DeepDoc module of the RagFlow project. It provides powerful PDF document parsing capabilities, supporting OC... 114
Python boilerplate 78
77