PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics.
opendatalab
mineru-html

MinerU-HTML: An SLM-powered HTML main content extractor that outputs clean HTML bodies. Perfect for Deep Research Agents, RAG applications, and training data generation.

626 239 25
mazzasaverio
url2md4ai

Lean Python tool for extracting clean, LLM-optimized markdown from web pages. Handles dynamic content with Playwright + Trafilatura for maximum information extraction efficiency.

214 4 0
Yasser03
pipescraper

A pipe-based news article scraping and metadata extraction library

98 2 0
Data from PyPI, GitHub, ClickHouse, and BigQuery