PyPI Stats

Entity Resolution Python Packages

Python packages with the GitHub topic entity-resolution. Sorted by relevance, with stars and monthly downloads.
J535D165
recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

4.6M 1K 153
moj-analytical-services
splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

740K 2K 234
dedupeio
dedupe

A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.

100K 4K 569
maxharlow
csvmatch

🔎 Finds fuzzy matches between CSV files

8K 191 21
data61
anonlink

Python implementation of anonymous linkage using cryptographic linkage keys

7K 74 8
benzsevern
goldenmatch

🟑 Golden Suite β€” polyglot data-quality + entity-resolution toolkit. GoldenCheck profiles β†’ GoldenFlow standardizes β†’ GoldenMatch dedupes β†’ GoldenPipe orchestrates. Zero-config defaults, 97% F1, MCP server per package + one master, multi-arch container images, drop-in Airflow DAGs.

5K 36 5
zinggAI
zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

5K 1K 165
cangyuanli
floof

Fuzzy matching made easy

5K 5 0
Picovoice
pvrhino

On-device Speech-to-Intent engine powered by deep learning

4K 700 95
SkyeAv
tablassert

Extract knowledge assertions from tabular data into NCATS Translator-compliant KGX NDJSON, declaratively, with entity resolution and quality control built in.

3K 5 0
AI-team-UoA
pyjedai

An open-source library that leverages Python's data science ecosystem to build powerful end-to-end Entity Resolution workflows.

2K 93 13
Org-EthereaLogic
etherealogic-aetheriaforge

Databricks-native intelligent data transformation engine: coherence-scored Bronze/Silver/Gold with entity resolution and temporal reconciliation in a single deployable product.

2K 1 0
fritshermans
deduplipy

Python package for deduplication/entity resolution using active learning

2K 82 8
raphschlatt
ads-and

NAND-based author name disambiguation for SAO/NASA ADS publication metadata

2K 1 0
Picovoice
pvrhinodemo

On-device Speech-to-Intent engine powered by deep learning

2K 700 95
pmart123
cymbology

Financial identifier validation.

1K 15 1
DerwenAI
strwythura

Strwythura: construct an entity-resolved knowledge graph from structured data sources and unstructured content sources, implementing an ontology pipeline, plus context engineering for optimizing AI application outcomes within a specific domain. This produces a Streamlit app, with MLOps instrumentation.

1K 223 25
ihmeuw
easylink

A tool that allows users to build and run highly configurable record linkage/entity resolution pipelines.

1K 11 0
NickCrews
mismo

The SQL/Ibis powered sklearn of record linkage

1K 23 4
usc-isi-i2
rltk

Record Linkage ToolKit (Find and link entities)

679 111 22
databricks-industry-solutions
databricks-arc

ARC: data linking solution for Databricks with Splink

674 53 22
ADBond
splinkclickhouse

Allows ClickHouse to be used as the execution engine for Splink

613 6 0
dobraczka
kiez

🏘️ Hubness reduced nearest neighbor search for entity alignment with knowledge graph embeddings

520 29 3
vintasoftware
entity-embed

PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

470 161 16
    • Data from PyPI, GitHub, ClickHouse, and BigQuery