PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
Microsoft
presidio-analyzer

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

4.6M 8K 1K
Microsoft
presidio-anonymizer

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

3.4M 8K 1K
berislavlopac
sanitary

Utility to remove or replace sensitive data from complex structures.

97K 3 1
datafog
datafog

Python SDK for PII detection and redaction in text and images, combining regex + NLP pipelines for production privacy workflows.

54K 54 13
Microsoft
presidio-image-redactor

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

34K 8K 1K
capitalone
dataprofiler

What's in your data? Extract schema, statistics and entities from datasets

32K 2K 186
microsoft
presidio-structured

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

14K 8K 1K
armurox
loggingredactor

Logging Redactor is a Python library designed to redact sensitive data in logs based on regex patterns and/or dictionary keys.

11K 6 0
opendsr-std
seedfaker

Deterministic synthetic data generator for realistic, correlated, and noisy test records across 68 locales. Rust CLI/Python/Node.js/Browser WASM/Go/PHP/Ruby/MCP

9K 23 0
Microsoft
presidio

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

7K 8K 1K
cloakllm
cloakllm

Python SDK — PII cloaking middleware for LLM calls (spaCy NER + regex + Ollama)

7K 0 0
kylemclaren
scrubadubdub

A Python package to scrub PII

6K 25 8
solentlabs
har-capture

Capture and sanitize HAR (HTTP Archive) files with deep PII removal. Perfect for support diagnostics, security reviews, and test fixtures.

5K 3 0
cloakllm
cloakllm-mcp

MCP server — CloakLLM tools for Claude Desktop and MCP clients

4K 0 0
STHITAPRAJNAS
ghost-pii-pydantic

Automatic PII redaction for Pydantic v2 — masks sensitive data in logs and print statements. GDPR/HIPAA-friendly.

3K 0 0
seanpedrick-case
doc-redaction

Redact PDF/image-based documents, Word, or CSV/XLSX files using a graphical user interface. Demo: https://huggingface.co/spaces/seanpedrickcase/document_redaction or try it with VLMs: https://huggingface.co/spaces/seanpedrickcase/document_redaction_vlm

2K 50 10
EdyVision
pii-codex

A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII)

2K 98 11
rohitcoder
hawk-scanner

A powerful scanner to scan your Filesystem, S3, MySQL, Redis, Google Cloud Storage and Firebase storage for PII and sensitive data.

2K 484 51
tokern
piicatcher

Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub

2K 340 97
parvathirajan
the-mask

A package to hide/mask PII in a JSON object

1K 2 1
Tatarinho
llm-safe-pl

[DEPRECATED — use pii-toolkit] Reversible Polish PII anonymization for LLM workflows. Successor packages: pii-veil, pii-core, pii-presidio.

1K 2 0
nextaim-de
noirdoc

German-first PII redaction and pseudonymization for documents. Local by default. Reversible when you need it.

1K 3 0
LokaalHub
filenthropist

Local-first file scanner and labeler that lets you work safely with autonomous AI agents. Multilingual, currently focused on Dutch PII and GDPR.

773 1 0
daedalus
pii-safe

Redact PII from text

685 0 0
Data from PyPI, GitHub, ClickHouse, and BigQuery