386 dependents
Package Description Downloads/month
🎭 Playwright integration for Scrapy 922K
advertools - online marketing productivity and analysis tools 147K
news-please - an integrated web crawler and information extractor for news that ... 118K
Zyte API integration for Scrapy 109K
llama-index readers web integration 86K
Scrapy+Splash for JavaScript integration 70K
A service daemon to run Scrapy spiders 47K
Redis-based components for Scrapy. 35K
Command line client for Scrapyd server 34K
Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fing... 34K
Scrapy middleware to handle javascript pages using selenium 26K
Scrapy entrypoint for Scrapinghub job runner 23K
Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy 22K
Workflow manager for Zyte ScrapyCloud tasks. 21K
Inspire JSON schemas and utilities to use them. 20K
Page Object pattern for Scrapy 15K
Anti-detection Scrapy middleware — proxy routing and browser rendering for web s... 13K
Python utilities for spiders 9K
AI Execute Services - A middleware framework for AI-powered task execution and t... 9K
Utilities to extend Scrapy spiders with usable metadata. 9K
Run a Scrapy spider programmatically from a script or a Celery task - no project... 9K
Tools to help build extraction models with Scrapy spiders. 8K
Lets scrapy developers skip writing modules for common scenarios such as item, pipeline, and middleware, freeing up developers' hands. 8K
Core functionality for City Scrapers projects 8K
A random user-agent for all your needs 7K
6K
A decorator to write coroutine-like spider callbacks. 5K
A downloader middleware to change user-agent of scrapy 5K
An extension that allows a user to display all or some of their scrapy spider se... 5K
A short description of the package. 5K
Clappform Python scraper 5K
Spider templates for automatic crawlers. 4K
A Scrapy middleware for accessing ZenRows Scraper API with minimal setup. 4K
Discarding duplicate URLs based on rules. 4K
Supports 拷贝漫画, Māngabz, 禁漫天堂, wnacg, exhentai, hitomi.la, h-comic, kemono, danbooru | ... 4K
Scrapy spider middleware to ignore requests to pages containing items seen in pr... 4K
Scrapy utils for Modis crawlers projects. 3K
A Scrapy middleware to bypass Cloudflare's anti-bot protection 3K
Tools to easily generate an RSS feed that contains each scraped item using Scrapy fra... 3K
HTTP API for Scrapy spiders 3K
Standardized project configuration for assetutilities 2K
More flexible and featured Frontera scheduler for Scrapy 2K
Board games data scraping and processing from BoardGameGeek and more! 2K
Do automated crawling of pages using scrapy 2K
Scrapy middleware for submitting URLs to the Internet Archive Wayback Machine 2K
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vu... 2K
Gathering the stock data 2K
Library which uses Canvasapi (see https://canvasapi.readthedocs.io) to provide a... 2K
Simple scrapy proxy pool 2K
xx 2K