PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
adbar
trafilatura

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

7.2M 6K 363
fhamborg
news-please

news-please - an integrated web crawler and information extractor for news that just works

118K 2K 452
flairNLP
fundus

A very simple news crawler with a funny name

5K 452 108
lumyjuwon
koreanewscrawler

A Korean news crawler built to ingest large amounts of news data.

412 225 105
thinh-vu
shutterstock-analysis

The internet's very first Python package for analyzing Shutterstock public data, helping creators optimize their creative portfolios and earn more income with less effort.

351 1 0
thinh-vu
ur-gadget

A Python package that helps capture news updates from top Vietnamese news sites

184 1 0
johnbumgarner
newshound

This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around the world in over 50 languages.

86 34 3
divkakwani
webcorpus

Generate large textual corpora for almost any language by crawling the web

82 9 11
thinh-vu
vnnews

A Python package that helps capture news updates from top Vietnamese news sites

1 1 0
Data from PyPI, GitHub, ClickHouse, and BigQuery