PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter them by metrics
J535D165
recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

4.6M 1K 153
moj-analytical-services
splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

718K 2K 234
RobinL
fuzzymatcher

Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4

11K 286 60
maxharlow
csvmatch

🔎 Finds fuzzy matches between CSV files

8K 191 21
AI-team-UoA
pyjedai

An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.

2K 93 13
vintasoftware
entity-embed

PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

425 161 16
maxharlow
textmatch

🔎 Finds fuzzy matches between datasets

210 17 0
ihmeuw
person-linkage-case-study

Emulates the methods the US Census Bureau uses to link people across multiple data sources, using open-source software (Splink) and simulated data (from pseudopeople).

148 3 0
Data from PyPI, GitHub, ClickHouse, and BigQuery