PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
dlt-hub
dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

7.5M 5K 498
tavily-ai
tavily-python

The Tavily Python SDK allows for easy interaction with the Tavily API, offering the full range of our search, extract, crawl, map, and research functionalities directly from your Python programs. Easily integrate smart search, content extraction, and research capabilities into your applications, harnessing Tavily's powerful features.

4.8M 1K 152
lipoja
urlextract

URLExtract is a Python class for collecting (extracting) URLs from a given text by locating TLDs.

801K 277 64
nexB
extractcode

A mostly universal file extraction library and CLI tool to extract almost any archive in a reasonably safe way on Linux, macOS and Windows.

73K 38 23
OmkarPathak
pyresparser

A simple resume parser used for extracting information from resumes

7K 957 448
MicheleCotrufo
pdf2doi

A Python library/command-line tool to extract the DOI or other identifiers of a scientific paper from a PDF file.

5K 135 28
Breaka84
spooq

Spooq is a PySpark-based helper library for ETL data ingestion pipelines in data lakes.

4K 10 2
fedecalendino
pysub-parser

Library for extracting text and timestamps from multiple subtitle files (.ass, .ssa, .srt, .sub, .txt).

4K 53 5
dlt-hub
dlt-core

dlt is an open-source, Python-first, scalable data loading library that does not require any backend to run.

3K 5K 498
jlu5
icoextract

Extract icons from Windows PE files (.exe/.dll)

2K 150 9
hrushikeshrv
docxlatex

A Python library for extracting equations, text, and images from .docx files

2K 20 3
JoshuaMKW
pyisotools

Python library for working with GameCube ISOs (GCM)

1K 45 9
Mellow-Artificial-Intelligence
openextract

Extract structured data from documents, images, audio, and video using LLMs.

1K 16 2
Junbo-Zheng
miwear

Python Miwear tools for extracting and handling archives/logs

1K 5 1
camelot-dev
excalibur-py

A web interface to extract tabular data from PDFs

1K 2K 237
MicheleCotrufo
pdf2bib

A Python library/command-line tool to quickly and automatically generate BibTeX data starting from the PDF file of a scientific publication.

1K 89 11
0xMassi
webclaw

Python SDK for the Webclaw web extraction API

987 1 0
myifeng
article-parser

Extract an article or news item from a URL or HTML, parse the title and content, and output it in Markdown format.

824 50 6
dopstar
ftransc

The Audio Converter

747 17 1
xiaohuohumax
auto-unpack

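An automatic archive extraction tool that supports many archive formats. By combining plugins and orchestrating workflows, it can cover everyday extraction needs.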

743 21 4
vishaltanwar96
aadhaar-py

Extract embedded information from Aadhaar Secure QR Code.

570 15 1
jlw4049
automaticdemuxer

Automatically demux tracks from media files

522 2 0
SermetPekin
pdfsp

Extracts data from PDF files and saves it to Excel files.

489 1 0
voidful
wikiext

Extract knowledge from a wiki dump file

404 6 3
    • Data from PyPI, GitHub, ClickHouse, and BigQuery