PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
natasha
corus

Links to Russian corpora + Python functions for loading and parsing

4K 312 21
fostroll
corpuscula

Toolkit that simplifies corpus processing

2K 3 1
NetherlandsForensicInstitute
demeuk

Demeuk is a simple tool to clean up corpora (like dictionaries) or any dataset containing plain text strings.

1K 22 4
yonkornilov
opus-api

OPUS (opus.nlpl.eu) Python3 API

852 18 5
kunansy
rnc

API for Russian National Corpus

819 9 1
nltk
nltkdata

NLTK Data

310 2K 1K
s-lilo
brat-peek

Framework for working with brat-annotated .ann files

239 10 2
acqdiv
acqdiv

Pipeline for the ACQDIV Corpus Database

238 1 3
kunansy
ruscorpora

Links to https://github.com/kunansy/rnc

234 9 1
miweru
vrt-generator

Python class for creating vrt-annotated corpora

209 0 0
edwardseley
lyricscorpora

An unofficial Python API that allows users to create a corpus of lyrical text from their favorite artists and billboard charts

186 18 1
jonathandunn
corpus-similarity

Measure the similarity of text corpora for 74 languages

148 14 3
CyberZHG
wiki-dump-reader

Extract corpora from Wikipedia dumps

120 26 8
miweru
vrt-spacy

Creating vrt corpora

79 0 0
Data from PyPI, GitHub, ClickHouse, and BigQuery