PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
MaartenGr
polyfuzz

Fuzzy string matching, grouping, and evaluation.

51K 794 72
artitw
text2text

Text2Text Language Modeling Toolkit

6K 304 41
AmenRa
retriv

A Python Search Engine for Humans 🥸

3K 250 33
Kensuke-Mitsuzawa
documentfeatureselection

A set of metrics for feature selection from text data

1K 45 12
rth
vtext

Simple NLP in Rust with Python bindings

726 153 9
anyks
anyks-sc

ANYKS Spell-Checker

628 19 4
daedalus
mcp-external-memory

An MCP server that gives LLMs persistent, searchable semantic memory

623 0 0
adobe
stringlifier

Python module for detecting passwords, API keys, hashes, and any other string that resembles a randomly generated character sequence.

562 169 27
jamalrahman
hybridtfidf

An implementation of the Hybrid TF-IDF microblog summarisation algorithm as proposed by David Inouye and Jugal K. Kalita.

522 4 2
ina-foss
twembeddings

Event detection in tweets

495 33 5
eea
eea-similarity

eea.similarity

422 1 2
dayyass
text-classification-baseline

Pipeline for quickly building TF-IDF + LogReg text classification baselines.

383 61 4
adobe
stringlifier39

Stringlifier is an open-source ML library for detecting random strings in raw text. It can be used for sanitising logs, for detecting accidentally exposed credentials, and as a pre-processing step in unsupervised ML-based analysis of application text data.

275 169 27
davidsbatista
snowball-extractor

Implementation with some extensions of the paper "Snowball: Extracting Relations from Large Plain-Text Collections" (Agichtein and Gravano, 2000)

226 178 39
klauscfhq
moviebox

Machine learning movie recommending system

220 529 54
Nikolay-Lysenko
readingbricks

A structured collection of notes (mostly, on machine learning) and a Flask app for reading and searching them.

198 95 11
aeturrell
occupationcoder

A tool that uses job text, such as a job description, to assign standard occupational classification codes.

180 75 29
r-m-n
sklearn-deltatfidf

DeltaTfidfVectorizer for scikit-learn

163 10 2
pelican-plugins
pelican-similar-posts

Pelican plugin to list similar posts to articles, based on a vector space model.

149 20 3
abdullahselek
koolsla

Food recommendation tool with Machine learning

135 21 4
juliuste
tfidfde

German tf-idf module.

123 5 2
textvec
textvec

Supervised text features extraction

120 197 26
AvishrantsSh
pyranker

Python-based package consisting of several rankers for information retrieval

90 0 0
ArnoldGaius
tf-idf-categoryweighting

Tf-Idf-CategoryWeighting

74 3 2
    • Data from PyPI, GitHub, ClickHouse, and BigQuery