PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
FlagOpen
flagembedding

Retrieval and Retrieval-augmented LLMs

425K 12K 870
MaartenGr
bertopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

381K 8K 893
neuml
txtai

💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

46K 12K 806
pdrm83
sent2vec

How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.

12K 135 12
shibing624
text2vec

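text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.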

4K 5K 427
mozilla-ai
encoderfile

Python bindings for encoderfile.

2K 93 15
goru001
inltk

Natural Language Toolkit for Indian Languages (iNLTK)

1K 840 160
jina-ai
vectordb

The Python VectorDB. Build your vector database from working as a library to scaling as a database in the cloud

1K 650 50
SeanLee97
xmnlp

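xmnlp: provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text-to-pinyin conversion, text summarization, character radical extraction, sentence representation, and text similarity computation.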

1K 1K 187
shangan23
similar-sentences

Similar-sentence prediction with more accurate results, using your dataset on top of a pretrained model. #BERT

982 8 2
HostServer001
jee-data-base

Tool to access and manage 14k+ jee main pyqs with semantic clustering

706 5 0
FlagOpen
lm-cocktail

Retrieval and Retrieval-augmented LLMs

666 12K 870
ndgigliotti
afterthoughts

Sentence-aware embeddings using late chunking with transformers.

371 1 0
goldpulpy
pysentence-similarity

PySentence-Similarity is a tool designed to identify and find similarities between sentences and a base sentence, expressed as a percentage 📊.

369 4 0
princeton-nlp
simcse

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

342 4K 533
robrua
easybert

A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)

299 178 45
luozhouyang
deepse

**DeepSE**: **Sentence Embeddings** based on Deep Neural Networks, designed for **PRODUCTION** environments!

277 9 1
FlagOpen
c-mteb

Retrieval and Retrieval-augmented LLMs

269 12K 870
TharinduDR
simplests

Unsupervised models for Semantic Textual Similarity

253 99 38
MoleculeTransformers
smiles-featurizers

Extract molecular SMILES embeddings from language models pre-trained with various objectives and architectures.

234 19 1
Susheel-1999
sentence-similarity

Package to calculate the similarity score between two sentences

163 11 1
cui-shaobo
opposite-score

A lightweight toolkit for measuring how “opposite” two texts are when they share the same context.

144 6 1
iarroyof
wisse-sentence

A sentence embedding method based on weighted series

132 9 1
wjddusrb03
commitmind

CommitMind: Semantic search for Git commit history powered by TurboQuant vector compression (ICLR 2026). Search commits by meaning, not just keywords.

115 0 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery