PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
explosion
spacy

💫 Industrial-strength Natural Language Processing (NLP) in Python

21.9M 34K 5K
Microsoft
presidio-analyzer

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

4.5M 8K 1K
Microsoft
presidio-anonymizer

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

3.3M 8K 1K
JohnSnowLabs
spark-nlp

State of the Art Natural Language Processing

1.1M 4K 743
stanfordnlp
stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

942K 8K 942
chakki-works
seqeval

A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)

835K 1K 131
urchade
gliner

Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts)

556K 3K 271
flairNLP
flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

171K 14K 2K
fastdatascience
drug-named-entity-recognition

Drug Named Entity Recognition library to find and resolve drug names in a string (drug named entity linking)

152K 32 14
ThilinaRajapakse
simpletransformers

Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI

73K 4K 717
modelscope
adaseq

AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models

53K 453 44
MantisAI
nervaluate

Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13

35K 213 27
Microsoft
presidio-image-redactor

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

30K 8K 1K
hankcs
hanlp

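Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing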

24K 36K 11K
CAMeL-Lab
camel-tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

24K 548 89
microsoft
presidio-structured

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

14K 8K 1K
s14t284
torchcrf

An Implementation of CRF (Conditional Random Fields) in PyTorch 1.0

13K 137 12
Georgetown-IR-Lab
medspacy-quickumls

System for Medical Concept Extraction and Linking

13K 442 102
explosion
spacy-llm

🦙 Integrating LLMs into structured NLP pipelines

10K 1K 105
hankcs
hanlp-common

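Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing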

9K 36K 11K
hankcs
hanlp-trie

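Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing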

9K 36K 11K
Microsoft
presidio

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

7K 8K 1K
sagorbrur
bnlp-toolkit

BNLP is a natural language processing toolkit for Bengali Language.

5K 307 67
arnebinder
pytorch-ie

PyTorch-IE: State-of-the-art Information Extraction in PyTorch

5K 77 6
    • Data from PyPI, GitHub, ClickHouse, and BigQuery