PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
scrapy
protego

A pure-Python robots.txt parser with support for modern conventions.

5.5M 86 30
GateNLP
ultimate-sitemap-parser

Ultimate Website Sitemap Parser

180K 250 76
eliasdabbas
advertools

advertools - online marketing productivity and analysis tools

147K 1K 240
simonw
datasette-block-robots

Datasette plugin that blocks robots and crawlers using robots.txt

5K 7 0
nzrsky
fast-robotstxt

Fast and modern robots.txt parser and matcher (C++20). Fork of Google's library with zero-copy parsing (~30% faster) and RFC 9309 compliance fixes.

3K 5 0
jwmorley73
jwm-robotstxt

Provides Python access to Google's parser for robots.txt files, as used by their Googlebot web crawler.

2K 1 0
benwebber
texting-robots-py

Python binding for Texting Robots

2K 0 0
OwenOrcan
yirabot

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.

1K 17 0
jatsu
django-cs-robots

A Django app to change robots.txt from the admin panel without using the database.

476 0 0
beb7
greenflare

Open-Source Python Based SEO Web Crawler

385 196 21
KnuckleheadsClub
rbdt

rbdt is a Python library (written in Rust) for parsing robots.txt files for large-scale batch processing.

372 0 0
meysam81
sitemap-harvester

Crawl sitemap of a given website and export metadata of its pages recursively into CSV format.

324 5 0
alexjc
weboptout

Opt-Out tool to check Copyright reservations in a way that even machines can understand.

258 194 1
serpwings
pyrobotstxt

pyrobotstxt: Python Package for robots.txt Files

212 4 0
hanselhansel
context-linter

LLM readiness linter for websites. Audits robots.txt, llms.txt, Schema.org, and content density on a 0-100 scale. Includes MCP server. Published on PyPI: pip install context-cli.

175 3 1
simplecto
sitemap-grabber

Grab and recursively parse website sitemaps, robots.txt, and other related files.

141 1 0
William-Fernandes252
astel

An asynchronous web crawling library for Python.

92 0 0
hanselhansel
aeo-cli

Agentic Engine Optimization CLI — audit URLs for AI crawler readiness

24 3 1
    • Data from PyPI, GitHub, ClickHouse, and BigQuery