608 dependents
Package Description Downloads/month
Convert documents to structured data effortlessly. Unstructured is open-source E... 5.2M
A library for converting HTML into PDFs using ReportLab 3.5M
AKShare is an elegant and simple financial data interface library for Python, bu... 2.7M
A simple HTML content extractor in Python. Can be run as a wrapper for Mozilla's... 1.6M
Box of handy tools for Sphinx 🧰 📔 715K
RDFa 1.1 distiller/parser library: can extract RDFa 1.1 (and RDFa 1.0, if proper... 378K
Microformats parser 342K
The official source code for the python-mechanize project 299K
Functions and classes to access online data resources. Maintainers: @keflavich a... 290K
Automatic highly parallelized hyperparameter optimizer based on Ax/Botorch 274K
Spell checker automation tool 170K
OpenBB package with core functionality. 144K
Text Plugin for django CMS using CKEditor 4 110K
Abstra Lib 105K
Open Response Assessment Suite 96K
🕵️‍♂️ Collect a dossier on a person by username from 3000+ sites 81K
core xblocks 80K
🔍 ScanCode detects licenses, copyrights, dependencies by "scanning code" ...... 74K
An autonomous agent that conducts deep research on any data using any LLM provid... 74K
Framework for Task orchestration 73K
The official FlexGet repository 57K
Cleaning tool for web scraped text 55K
Microsoft Threat Intelligence Security Tools 53K
Package for generating headers for HTTP requests. 53K
⛏⚽ Scrape soccer data from Club Elo, ESPN, FBref, Football-Data.co.uk, Sofascore... 43K
An API to scrape American court websites for metadata. 41K
A python-based mailcatcher clone 41K
A CLI that utilizes Okta IdP via SAML to acquire temporary AWS credentials 41K
pytest plugin that checks URLs 35K
Investment research for everyone, anywhere. 27K
The SnoVault general purpose hybrid object-relational database 24K
A Model Context Protocol server for a collection of financial tools, https://git... 22K
Software for technical documentation and requirements management. 20K
⚽ High-performance football analytics: build data pipelines, scrape data, model ... 19K
18K
Command-line interface for downloading WARN Act notices of qualified plant closi... 18K
Bloomerp is an open source Business Management Software framework that lets you... 18K
Django+AngularJS+Bootstrap library for fast development of CMS, ERP, Business Ma... 17K
A flexible framework for visualizing data and automated creation of "good enough... 16K
Easy access to official spatial data sets of Brazil in R and Python 14K
library to compare HTML while ignoring non-functional differences 13K
Webscout is the all-in-one search and AI toolkit you need. Discover insights wit... 13K
FanFicFare is a tool for making eBooks from stories on fanfiction and other web ... 12K
🚲 A preprocessor for anyone writing specifications that converts source fil... 11K
OneGov Cloud Framework based on Morepath 11K
Open source tools for Estonian natural language processing 11K
CLI tool for stripping tags from HTML 11K
Create HTML with python 3 using a standard DOM API. Includes a python port of Ja... 10K
Generate and download e-books from online sources. 10K
A high level interface and object model for the Notion SDK. 9K