PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
adbar
trafilatura

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

7.2M 6K 363
firecrawl
firecrawl-py

🔥 The API to search, scrape, and interact with the web for AI

6.8M 114K 7K
firecrawl
firecrawl

🔥 The API to search, scrape, and interact with the web for AI

677K 114K 7K
spider-rs
spider-client

Python, Javascript, and Rust libraries for the Spider Cloud API.

413K 25 9
Spenhouet
confluence-markdown-exporter

Export Atlassian Confluence pages as markdown files.

36K 396 104
tim-gromeyer
pyhtml2md

Transform your HTML into clean, easy-to-read markdown with html2md.

25K 81 11
us
crw

Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search API with MCP server for AI agents. Drop-in Firecrawl-compatible API (/v1/scrape, /v1/crawl, /v1/search). 2.3x faster than Tavily, 1.5x faster than Firecrawl in 1K-URL benchmarks. 6 MB RAM, single binary. Self-host or use managed cloud.

3K 71 5
pankaj28843
article-extractor

Pure-Python article extraction library and HTTP API - Extract clean content from web pages as Markdown or HTML

2K 0 0
muchdogesec
file2txt

file2txt is a Python library that takes common file formats and turns them into plain text (a txt file) with Markdown styling.

1K 12 2
paulpierre
markdown-crawler

A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG

1K 441 53
nanonets
llm-data-converter

Best open-source document to markdown converter for LLM training data. Convert PDF, Word, PowerPoint, Excel, images, URLs to clean markdown, JSON, HTML locally. Alternative to Unstructured, Docling, Marker, MarkItDown, MinerU, PaddleOCR, Tesseract

900 7 1
QuartzUnit
markgrab

Universal web content extraction — URL to LLM-ready markdown

762 0 0
renesugar
html2txt

Convert HTML to markdown

349 1 2
nanonets
document-data-extractor

Convert any document format into LLM-ready data format (markdown) with advanced intelligent document processing capabilities powered by pre-trained models.

254 7 1
mazzasaverio
url2md4ai

Lean Python tool for extracting clean, LLM-optimized markdown from web pages. Handles dynamic content with Playwright + Trafilatura for maximum information extraction efficiency.

214 4 0
spider-rs
spiderwebai-py

Python, Javascript, and Rust libraries for the Spider Cloud API.

195 25 9
yannickperrenet
bookmarkdown

✅ Parse your browser's exported HTML bookmark file to Markdown.

143 18 0
spider-rs
spiderclient-py

Python, Javascript, and Rust libraries for the Spider Cloud API.

32 25 9
spider-rs
spidercloud-py

Python, Javascript, and Rust libraries for the Spider Cloud API.

32 25 9
trubitsyn
bookmarks2markdown

Convert bookmarks to Markdown

3 5 1
Data from PyPI, GitHub, ClickHouse, and BigQuery