Web Crawler Python Packages

firecrawl-py

🔥 The API to search, scrape, and interact with the web for AI

6.8M 114K 7K

firecrawl

🔥 The API to search, scrape, and interact with the web for AI

677K 114K 7K

crawlee

Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

538K 9K 712

scrapfly-sdk

Official Python SDK for the Scrapfly platform: web scraping, screenshots, AI extraction, crawling, and a remote anti-bot browser. Integrates with Scrapy, LlamaIndex, and LangChain.

313K 55 15

scrapegraph-py

Official Python SDK for the ScrapeGraph AI API. Smart scraping, search, crawling, markdownify, agentic browser automation, scheduled jobs, and structured data extraction.

296K 71 14

stealth-requests

Undetected web-scraping & seamless HTML parsing in Python!

58K 467 48

lilbee

Terminal-first local search and AI chat over your documents, code, and crawled websites. Semantic + hybrid search, vision OCR, auto-built wiki, browsable GGUF model catalog. Works as CLI, TUI, MCP server, REST API, or Python library. Offline by default, no sidecar services.

20K 16 3

kreuzcrawl

High-performance web crawling engine with bindings for 11 languages

14K 84 10

spider-rs

Spider ported to Python

7K 106 17

crw

Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search API with MCP server for AI agents. Drop-in Firecrawl-compatible API (/v1/scrape, /v1/crawl, /v1/search). 2.3x faster than Tavily, 1.5x faster than Firecrawl in 1K-URL benchmarks. 6 MB RAM, single binary. Self-host or use managed cloud.

3K 71 5
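
The crw entry above advertises a Firecrawl-compatible `/v1/scrape` endpoint. As a rough illustration of what a request to such an endpoint might look like, here is a minimal sketch using only the Python standard library. The base URL `http://localhost:3000`, the helper name `build_scrape_request`, and the optional bearer key are assumptions for illustration and do not come from the listing. The payload shape (`url` plus `formats`) follows the Firecrawl v1 scrape convention and should be checked against the documentation of whichever server is actually used. The request is only constructed here, not sent.

```python
import json
from typing import Optional
from urllib.request import Request


def build_scrape_request(
    base_url: str, target_url: str, api_key: Optional[str] = None
) -> Request:
    """Build (but do not send) a POST request for a Firecrawl-style /v1/scrape endpoint.

    base_url and api_key are hypothetical placeholders for this sketch.
    """
    # Assumed payload shape: the page to scrape and the desired output formats.
    payload = {"url": target_url, "formats": ["markdown"]}

    headers = {"Content-Type": "application/json"}
    if api_key:
        # Hosted services typically expect a bearer token; a self-hosted
        # instance may not need one.
        headers["Authorization"] = f"Bearer {api_key}"

    return Request(
        url=base_url.rstrip("/") + "/v1/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Hypothetical local instance; replace with the real host and port.
req = build_scrape_request("http://localhost:3000", "https://example.com")
print(req.get_method(), req.full_url)
```

To actually send the request, pass `req` to `urllib.request.urlopen` and decode the JSON response. Keeping request construction separate makes it easy to point the same code at a different Firecrawl-compatible host by changing only the base URL.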