Pdftotext Python Packages

aiopytesseract

A Python asyncio wrapper for Tesseract-OCR.

2K 27 7

pyxpdf

Fast and memory-efficient Python PDF Parser based on xpdf sources

2K 44 17

pdf2dataset

Converts a whole subdirectory with a big (or small) volume of PDF documents to a dataset (pandas DataFrame) with error tracking and choice of features

596 19 5

pdftotext3

A simple pdftotext conversion tool for Windows 8.1/10/11 and Fedora/Ubuntu/Debian/Arch-based Linux distros, using poppler-utils and Google's tesseract-ocr.

342 22 2

imagetocsv

Converts an image to a CSV. This exists because Chorus 3.0 are bat-shit and only show images for vital metadata.

237 5 2
