PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
firecrawl
firecrawl-py

🔥 The API to search, scrape, and interact with the web for AI

6.8M 114K 7K
firecrawl
firecrawl

🔥 The API to search, scrape, and interact with the web for AI

677K 114K 7K
apify
crawlee

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

538K 9K 712
scrapfly
scrapfly-sdk

Official Python SDK for the Scrapfly platform: web scraping, screenshots, AI extraction, crawling, and a remote anti-bot browser. Integrates with Scrapy, LlamaIndex, and LangChain.

313K 55 15
ScrapeGraphAI
scrapegraph-py

Official Python SDK for the ScrapeGraph AI API. Smart scraping, search, crawling, markdownify, agentic browser automation, scheduled jobs, and structured data extraction

296K 71 14
jpjacobpadilla
stealth-requests

Undetected web-scraping & seamless HTML parsing in Python!

58K 467 48
tobocop2
lilbee

Terminal-first local search and AI chat over your documents, code, and crawled websites. Semantic + hybrid search, vision OCR, auto-built wiki, browsable GGUF model catalog. Works as CLI, TUI, MCP server, REST API, or Python library. Offline by default, no sidecar services.

20K 16 3
kreuzberg-dev
kreuzcrawl

High-performance web crawling engine with bindings for 11 languages

14K 84 10
spider-rs
spider-rs

Spider ported to Python

7K 106 17
us
crw

Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search API with MCP server for AI agents. Drop-in Firecrawl-compatible API (/v1/scrape, /v1/crawl, /v1/search). 2.3x faster than Tavily, 1.5x faster than Firecrawl in 1K-URL benchmarks. 6 MB RAM, single binary. Self-host or use managed cloud.

3K 71 5
abo123456789
redis-queue-tool

Distributed task redisqueue (the simplest Python distributed function scheduling framework)

3K 65 19
jpjacobpadilla
search-ai-core

Search the web with advanced filters and LLM-friendly output formats!

3K 56 2
abo123456789
leek

Distributed task redisqueue (the simplest Python distributed function scheduling framework)

2K 65 19
GeiserX
wayback-archive

A comprehensive tool for downloading and archiving websites from the Wayback Machine

2K 8 3
BHM-Bob
mbapy

BA_PY: Optimize Your Workflow with Python!

1K 3 1
MehmetYukselSekeroglu
hivewebcrawler

Simple Python 3.x Web Crawler, Images, Urls, Emails, Phone numbers

1K 3 1
dotnetpower
infomesh

Fully decentralized P2P search engine for LLMs via MCP

832 3 0
MoonyFringers
ladon-crawl

A Python framework for building structured, resumable web crawlers — designed for domains where data quality matters.

758 1 1
Algebra-FUN
wereadscan

A crawler that scans your purchased books on WeRead (微信读书) and downloads them as local PDFs

748 991 171
imyourboyroy
web-scraper-toolkit

A powerful, standalone web scraping toolkit using Playwright and various parsers.

740 5 2
rivermont
spidy-web-crawler

Spidy is the simple, easy to use command line web crawler.

734 352 69
MoonyFringers
ladon-hackernews

Hacker News adapter for the Ladon crawler framework

711 0 1
Kochat-framework
kochat

Korean opensource chatbot framework

543 462 186
aneesh-aparajit
reddit-multimodal-crawler

A scraper that extracts multimedia data from Reddit.

351 11 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery