178 dependents
Package Description Downloads/month
Docling LangChain integration 148K
Running a distributed job processing documents with Docling. 54K
Making Docling agentic through MCP 41K
Self-hosted semantic search and knowledge management for LLM-driven development 40K
Running Docling as an API service 29K
OnnxTR OCR plugin for Docling 26K
Tool to work with arXiv, provide LLM with ability to search and read papers fro... 23K
llama-index readers docling integration 22K
This package enables inference of header hierarchy in the docling PDF parsing pi... 13K
Personal knowledge base CLI - aggregate content from multiple sources 11K
Building blocks for rapid development of GenAI applications 10K
OpenDsStar is an open-source implementation of the DS-Star agent that replaces f... 9K
Docling plugin for Surya OCR 8K
Python library for Synthetic Data Generation 7K
InstructLab Core package. Use this to chat with a model and execute the Instruc... 7K
📚 Process PDFs, Word documents and more with spaCy 5K
Additional packages (components, document stores and the likes) to extend the ca... 5K
Composable document data extraction: load, preprocess, OCR, LLM parse, store wit... 5K
Agentic web research tool. Smarter than search, faster than deep research. Searc... 4K
AI-assisted table configuration generation for Tablassert — entity resolution, Y... 4K
Self-hosted personal AI assistant with Feishu integration, a built-in web consol... 4K
A standalone, extensible RAG/MCP library for building AI-powered documentation s... 4K
Lexia AI tools library by Lexfluent 3K
A modular text-based database manager for retrieval-augmented generation (RAG), ... 3K
Transform unstructured documents into validated, rich and queryable knowledge gr... 3K
Multimodal RAG with knowledge graph and contextual intelligence. Understands wha... 2K
Academic research MCP server — search, extract, and manage papers 2K
Privacy-first document intelligence engine — converts PDFs, DOCX, PPTX, XLSX, an... 2K
Evaluation of Docling 2K
Implements LangGraph based agents following the CoALA framework. 2K
Convert files from various sources (SharePoint, S3, Azure Blob, etc.) to Markdow... 2K
Document loading helpers for Donkit RagOps 2K
MCP server for IBM watsonx.data Integration 1K
Extract structured Markdown, tables, figures, and equations from scientific PDFs... 1K
Search Inference Engine - GPU inference server for search workloads 1K
MCP server for reading and searching EPUB/PDF documents 1K
Docs2KG: A Human-LLM Collaborative Approach to Unified Knowledge Graph Construct... 1K
A minimalistic RAG system that prevents hallucination by ensuring all generated ... 1K
"DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)" 1K
A real-time interactive Omni Avatar built on LiveKit, which allows you to seamle... 1K
Core library and CLI for DocRunr document processing. 1K
Turn your documents into an analytics-ready wide table. 995
A comprehensive PDF processing toolkit that converts PDFs to markdown with advan... 974
The Perception Engine for AI Agents. Distilling web pages and PDFs into clean Ma... 960
Docling → Chroma → Ollama: Simple RAG pipeline 942
AI-powered CLI for analyzing hardware engineering documents 937
Question answering for local knowledge bases with exact source citations. 933
Flexible GraphRAG system supporting multiple LLM providers, graph databases, vec... 906
WISTX MCP Server - DevOps compliance and pricing context for coding agents 905
Agent that reads, writes, and edits documents. 876