GrobidArticleExtractor is a Python package designed to extract and organize content from scientific papers in PDF format.
Extract text and tables from PDF files.
CLI for merging PDF contents.
Recursively crawl a website and download every document of the wanted types (PDF, ODT…).
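
As a rough illustration of the site-crawling idea in the last entry, here is a minimal sketch using only the Python standard library. It is not the code of any package listed here. The names `find_documents`, `fetch_html`, `LinkParser`, and `DOC_EXTENSIONS` are hypothetical, and the extension list is an assumption covering only the two formats named above. The sketch walks the site with an iterative breadth-first queue rather than literal recursion, stays on the starting host, and only collects document URLs; the download step is left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen

# Assumption: only the two formats named in the description are wanted.
DOC_EXTENSIONS = (".pdf", ".odt")


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher: download a page and decode it as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def find_documents(start_url, fetch=fetch_html,
                   extensions=DOC_EXTENSIONS, max_pages=100):
    """Breadth-first crawl of one host; return the document URLs found."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    documents = []
    pages = 0
    while queue and pages < max_pages:
        page = queue.popleft()
        pages += 1
        try:
            html = fetch(page)
        except OSError:
            # Unreachable page: skip it and keep crawling.
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            url = urldefrag(urljoin(page, href)).url
            if url in seen:
                continue
            seen.add(url)
            parsed = urlparse(url)
            if parsed.path.lower().endswith(tuple(extensions)):
                documents.append(url)
            elif parsed.netloc == host:
                # Same-site page: visit it later.
                queue.append(url)
    return documents
```

Passing the fetcher as a parameter keeps the traversal testable without network access: a dictionary of canned pages can stand in for the real site. The `max_pages` cap is a simple guard against unbounded crawls.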