Web Crawling Python Packages

crawlee

Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

541K 9K 712

botasaurus

The All in One Framework to Build Undefeatable Scrapers

35K 4K 388

bota

The All in One Framework to Build Undefeatable Scrapers

28K 4K 388

javascript-fixes

The All in One Framework to Build Undefeatable Scrapers

27K 4K 388

botasaurus-humancursor

The All in One Framework to Build Undefeatable Scrapers

27K 4K 388

botasaurus-server

The All in One Framework to Build Undefeatable Scrapers

1K 4K 388

genmine

GenBank Record downloader for taxonomists

982 8 0

microwler

A micro-framework for asynchronous deep crawls and web scraping with Python

584 13 1

bose

The All in One Framework to Build Undefeatable Scrapers

523 4K 385

nightcrawler-mitm

A Python program that crawls a website and tries to stress it, polluting forms with bogus data

347 26 1

webtranspose

Web scraping API for building AI applications.

255 40 2

agentsearchcli

Give any AI agent the ability to search, crawl, and extract the web.

240 2 0

katastrophe

Download torrents from kat.ph directly through the terminal

182 82 12

pg-cache-storage

The All in One Framework to Build Undefeatable Scrapers

151 4K 388

insite

A lightning fast tool for crawling websites and compiling PDFs of their pages

146 1 0

thordata-firecrawl

Thordata Firecrawl – Firecrawl-compatible web crawling & scraping API built on Thordata, turning any website into AI-ready Markdown/JSON/HTML/screenshots.

135 2 0