PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
akamhy
waybackpy

Wayback Machine API interface & a command-line tool

2.6M 575 41
ArchiveBox
archivebox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

6K 27K 2K
webis-de
archive-query-log

📜 The Archive Query Log.

3K 34 0
muchdogesec
history4feed

Creates a complete full text historical archive for an RSS or ATOM feed.

3K 131 5
pjsier
scrapy-wayback-middleware

Scrapy middleware for submitting URLs to the Internet Archive Wayback Machine

2K 11 3
bitdruid
pywaybackup

Query and download from archive.org as simply as possible.

2K 111 22
GeiserX
wayback-archive

A comprehensive tool for downloading and archiving websites from the Wayback Machine

2K 8 3
Barabazs
archivooor

Archivooor is a Python package for interacting with the archive.org API.

2K 3 1
alonebeast002
beastcrypt

Advanced JS Reconnaissance Tool | Wayback & Katana Integration | Auto-Source Map Discovery. Automated engine to hunt for exposed secrets, API keys, and sensitive endpoints by analyzing historical JS files and automatically locating hidden .map files.

1K 0 0
agude
wayback-machine-archiver

A Python script to submit web pages to the Wayback Machine for archiving.

1K 84 12
eggplants
wbsv

CLI for archiving pages and all their links to the Wayback Machine

979 14 5
melon-dog
wayback-utils

Wayback Machine utils (web.archive.org)

784 0 0
GeiserX
wayback-diff

Intelligent web page comparison tool with Wayback Machine support and visual regression testing

716 1 0
Own-Data-Privateer
hoardy-web

Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.

599 120 10
claromes
waybacktweets

Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing, and saves the data.

541 199 49
kenlhlui
pyarchiveit

A Python library to interact with the Internet Archive's Archive-It Account API (https://support.archive-it.org/hc/en-us/articles/360032747311-Access-your-account-with-the-Archive-It-Partner-API)

535 0 0
sbaack
archive-md-urls

Turn URLs in Markdown files into archive.org snapshots

474 7 2
sangaline
scrapy-wayback-machine

A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.

400 122 32
sangaline
wayback-machine-scraper

A command-line utility for scraping Wayback Machine snapshots from archive.org.

396 477 81
BGforgeNet
yawbdl

A tool to download pages from Internet Archive.

248 21 4
connor-marchand
gau-python

This library gets urls from AlienVault's Open Threat Exchange, the Wayback Machine, and Common Crawl. Inspired by Corbin Leo's gau

240 3 0
jfilter
get-wayback-machine

Fetch a URL via the latest Wayback Machine Snapshot

211 5 1
shuuji3
twilog-web-archiver

Save month list pages of twilog.org by using archive.org

209 3 0
Own-Data-Privateer
hoardy-web-sas

Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.

204 120 10
    • Data from PyPI, GitHub, ClickHouse, and BigQuery