386 dependents
| Description | Downloads/month |
|---|---|
| 🎭 Playwright integration for Scrapy | 922K |
| advertools - online marketing productivity and analysis tools | 147K |
| news-please - an integrated web crawler and information extractor for news that ... | 118K |
| Zyte API integration for Scrapy | 109K |
| llama-index readers web integration | 86K |
| Scrapy+Splash for JavaScript integration | 70K |
| A service daemon to run Scrapy spiders | 47K |
| Redis-based components for Scrapy. | 35K |
| Command line client for Scrapyd server | 34K |
| Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fing... | 34K |
| Scrapy middleware to handle JavaScript pages using Selenium | 26K |
| Scrapy entrypoint for Scrapinghub job runner | 23K |
| Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy | 22K |
| Workflow manager for Zyte ScrapyCloud tasks. | 21K |
| Inspire JSON schemas and utilities to use them. | 20K |
| Page Object pattern for Scrapy | 15K |
| Anti-detection Scrapy middleware - proxy routing and browser rendering for web s... | 13K |
| Python utilities for spiders | 9K |
| AI Execute Services - A middleware framework for AI-powered task execution and t... | 9K |
| Utilities to extend Scrapy spiders with usable metadata. | 9K |
| Run a Scrapy spider programmatically from a script or a Celery task - no project... | 9K |
| Tools to help build extraction models with Scrapy spiders. | 8K |
| Frees Scrapy developers from writing modules for common cases such as items, pipelines, and middleware. | 8K |
| Core functionality for City Scrapers projects | 8K |
| A random user-agent for all your needs | 7K |
| | 6K |
| A decorator to write coroutine-like spider callbacks. | 5K |
| A downloader middleware to change the user agent of Scrapy | 5K |
| An extension that allows a user to display all or some of their scrapy spider se... | 5K |
| A short description of the package. | 5K |
| Clappform Python scraper | 5K |
| Spider templates for automatic crawlers. | 4K |
| A Scrapy middleware for accessing ZenRows Scraper API with minimal setup. | 4K |
| Discarding duplicate URLs based on rules. | 4K |
| Supports 拷贝漫画, Māngabz, 禁漫天堂, wnacg, exhentai, hitomi.la, h-comic, kemono, danbooru \| ... | 4K |
| Scrapy spider middleware to ignore requests to pages containing items seen in pr... | 4K |
| Scrapy utils for Modis crawlers projects. | 3K |
| A Scrapy middleware to bypass Cloudflare's anti-bot protection | 3K |
| Tools to easily generate an RSS feed that contains each scraped item using Scrapy fra... | 3K |
| HTTP API for Scrapy spiders | 3K |
| Standardized project configuration for assetutilities | 2K |
| More flexible and featured Frontera scheduler for Scrapy | 2K |
| Board games data scraping and processing from BoardGameGeek and more! | 2K |
| Do automated crawling of pages using scrapy | 2K |
| Scrapy middleware for submitting URLs to the Internet Archive Wayback Machine | 2K |
| Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vu... | 2K |
| Gathering the stock data | 2K |
| Library which uses Canvasapi (see https://canvasapi.readthedocs.io) to provide a... | 2K |
| Simple scrapy proxy pool | 2K |
| | 2K |