PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
mammothb
symspellpy

Python port of SymSpell: 1 million times faster spelling correction and fuzzy search via the Symmetric Delete spelling correction algorithm

420K downloads · 869 stars · 126 forks
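The Symmetric Delete idea behind symspellpy can be sketched in a few lines of plain Python. This is a simplified illustration, not symspellpy's actual API: precompute all delete-variants of each dictionary term, then find candidates for a query by intersecting the query's own delete-variants with that index (real SymSpell additionally filters candidates by true edit distance and ranks them by frequency).

```python
from itertools import combinations

def deletes(word, max_d):
    # All strings obtainable by deleting up to max_d characters from word.
    out = {word}
    for d in range(1, min(max_d, len(word)) + 1):
        for idxs in combinations(range(len(word)), d):
            out.add("".join(c for i, c in enumerate(word) if i not in idxs))
    return out

def build_index(vocab, max_d=1):
    # Map each delete-variant back to the dictionary terms that produce it.
    index = {}
    for term in vocab:
        for key in deletes(term, max_d):
            index.setdefault(key, set()).add(term)
    return index

def lookup(query, index, max_d=1):
    # Candidate corrections: any term sharing a delete-variant with the query.
    candidates = set()
    for key in deletes(query, max_d):
        candidates |= index.get(key, set())
    return candidates
```

Because both sides only ever *delete* characters, no insert/substitute/transpose candidates need to be enumerated at query time, which is where the speedup over naive edit-distance search comes from.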
lancopku
pkuseg

The pkuseg toolkit for multi-domain Chinese word segmentation

14K downloads · 7K stars · 985 forks
dongrixinyu
jiojio

A convenient Chinese word segmentation tool

8K downloads · 50 stars · 8 forks
hankcs
pyhanlp

Chinese word segmentation

4K downloads · 3K stars · 799 forks
howl-anderson
microtokenizer

MicroTokenizer: a lightweight Chinese tokenizer designed for educational and research purposes. Provides a practical, hands-on approach to understanding NLP concepts, featuring multiple tokenization algorithms and customizable models. Ideal for students, researchers, and NLP enthusiasts.

4K downloads · 159 stars · 22 forks
baidu
lac

Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance

2K downloads · 4K stars · 590 forks
messense
cjieba

Python cffi binding to CppJieba

755 downloads · 15 stars · 0 forks
dhchenx
ner-kit

Named Entity Recognition Toolkit

642 downloads · 0 stars · 0 forks
monpa-team
monpa

MONPA is an end-to-end model that jointly performs Chinese word segmentation, POS tagging, and NE labeling

505 downloads · 247 stars · 25 forks
voidism
pywordseg

Open Source State-of-the-art Chinese Word Segmentation System with BiLSTM and ELMo. https://arxiv.org/abs/1901.05816

459 downloads · 46 stars · 6 forks
ownthink
jiagu

Jiagu deep-learning NLP toolkit: knowledge-graph relation extraction, Chinese word segmentation, POS tagging, named entity recognition, sentiment analysis, new-word discovery, keyword extraction, text summarization, text clustering

369 downloads · 3K stars · 609 forks
Kyubyong
g2pc

g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese

362 downloads · 245 stars · 32 forks
notoriouslab
trad-zh-search

Traditional Chinese text preprocessing for search engines — CKIP segmentation + bigram indexing with pluggable domain dictionaries

266 downloads · 18 stars · 1 fork
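The "bigram indexing" that trad-zh-search's description mentions is a standard trick for searching unsegmented CJK text: index every overlapping two-character window, then AND-intersect the posting lists of the query's bigrams. A minimal self-contained sketch (illustrative only, not trad-zh-search's implementation):

```python
def bigrams(text):
    # Overlapping character bigrams; a single character indexes as itself.
    return [text[i:i + 2] for i in range(len(text) - 1)] or [text]

def build_index(docs):
    # Inverted index: bigram -> set of doc ids containing it.
    index = {}
    for doc_id, text in docs.items():
        for gram in bigrams(text):
            index.setdefault(gram, set()).add(doc_id)
    return index

def search(query, index):
    # A document matches only if it contains every bigram of the query.
    postings = [index.get(g, set()) for g in bigrams(query)]
    return set.intersection(*postings) if postings else set()
```

Bigram indexing needs no dictionary and never misses a substring match, at the cost of some false positives (bigrams present but non-adjacent), which is why it is often paired with a proper segmenter such as CKIP for precision.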
kemingy
handict

Yet another word segmentation tool.

186 downloads · 1 star · 0 forks
Ailln
simjb

✂️ A simple version of jieba word segmentation implemented in 100 lines

140 downloads · 3 stars · 1 fork
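Dictionary-based segmenters like the minimal-jieba clone above usually start from a greedy baseline before adding probabilities. A classic such baseline is forward maximum matching, sketched here in plain Python (an illustrative sketch, not simjb's or jieba's actual algorithm): at each position, take the longest dictionary word that matches, falling back to a single character.

```python
def fmm_segment(text, vocab, max_len=4):
    # Forward maximum matching: greedily consume the longest dictionary
    # word starting at the current position; unknown characters pass
    # through as single-character tokens.
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocab:
                tokens.append(piece)
                i += length
                break
    return tokens
```

jieba itself goes further, building a DAG of all dictionary matches and choosing the maximum-probability path, but greedy matching is the usual first step when learning how these tools work.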
hankcs
yyhanlp

Chinese word segmentation

59 downloads · 3K stars · 799 forks
    • Data from PyPI, GitHub, ClickHouse, and BigQuery