PyPI Stats

Web Crawling Python Packages

Python packages with the GitHub topic web-crawling. Sorted by relevance, with stars and monthly downloads.
apify
crawlee

Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

541K 9K 712
omkarcloud
botasaurus

The All in One Framework to Build Undefeatable Scrapers

35K 4K 388
omkarcloud
bota

The All in One Framework to Build Undefeatable Scrapers

28K 4K 388
omkarcloud
javascript-fixes

The All in One Framework to Build Undefeatable Scrapers

27K 4K 388
omkarcloud
botasaurus-humancursor

The All in One Framework to Build Undefeatable Scrapers

27K 4K 388
omkarcloud
botasaurus-server

The All in One Framework to Build Undefeatable Scrapers

1K 4K 388
Changwanseo
genmine

GenBank Record downloader for taxonomists

982 8 0
INNOVINATI
microwler

A micro-framework for asynchronous deep crawls and web scraping with Python

584 13 1
omkarcloud
bose

The All in One Framework to Build Undefeatable Scrapers

523 4K 385
thesp0nge
nightcrawler-mitm

A Python program that crawls a website and tries to stress it, polluting forms with bogus data

347 26 1
mike-gee
webtranspose

Web scraping API for building AI applications.

255 40 2
r0botsorg
agentsearchcli

Give any AI agent the ability to search, crawl, and extract the web.

240 2 0
alyakhtar
katastrophe

Download torrents from kat.ph directly through terminal

182 82 12
omkarcloud
pg-cache-storage

The All in One Framework to Build Undefeatable Scrapers

151 4K 388
heleusbrands
insite

A lightning fast tool for crawling websites and compiling PDFs of their pages

146 1 0
Thordata
thordata-firecrawl

Thordata Firecrawl – Firecrawl-compatible web crawling & scraping API built on Thordata, turning any website into AI-ready Markdown/JSON/HTML/screenshots.

135 2 0
michellepellon
pricetag

Extract price and currency information from unstructured text. Pure Python library for e-commerce data extraction and web scraping.

130 0 0
sadiuysal
iflow-mcp-sadiuysal-crawl4ai-mcp-server

A lightweight Model Context Protocol (MCP) server that exposes Crawl4AI web scraping and crawling capabilities as tools for AI agents.

126 72 10
innovinati
scrapy-googlechat

Send crawl reports from Scrapy spiders to Google Chat

118 1 1
HuberTRoy
seen

A web crawling framework for everyone, with JavaScript support.

106 13 3
William-Fernandes252
astel

An asynchronous web crawling library for Python.

98 0 0
omkarcloud
sqlite-cache-storage

The All in One Framework to Build Undefeatable Scrapers

71 4K 388
ZeroCool940711
new-frontera

A scalable frontier for web crawlers

63 0 0
Data from PyPI, GitHub, ClickHouse, and BigQuery