PyPI Stats

Data Lineage Python Packages

Python packages with the GitHub topic data-lineage. Sorted by relevance, with stars and monthly downloads.
reata
sqllineage

SQL Lineage Analysis Tool powered by Python

1.7M 2K 276
elementary-data
elementary-data

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

1.2M 2K 214
open-metadata
openmetadata-ingestion

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

391K 14K 2K
laminlabs
lamindb

Open-source data framework for biology. Context and memory for datasets and models at scale. Query, trace & validate with a lineage-native lakehouse that supports bio-formats, registries & ontologies. 🍊YC S22

101K 260 24
open-metadata
openmetadata-managed-apis

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

39K 14K 2K
vmware
quickstart-vdk

One framework to develop, deploy and operate data workflows with Python and SQL.

36K 481 66
datajoint
datajoint

Relational data pipelines for the science lab

21K 192 96
laminlabs
lamindb-core

Open-source data framework for biology. Context and memory for datasets and models at scale. Query, trace & validate with a lineage-native lakehouse that supports bio-formats, registries & ontologies. 🍊YC S22

21K 260 24
rocky-data
dagster-rocky

The trust system for your data. Rust-based control plane for warehouse pipelines — branches, replay, column-level lineage, compile-time safety, per-model cost attribution. Keep Databricks or Snowflake. Bring Rocky for the DAG.

14K 198 4
vmware
vdk-core

One framework to develop, deploy and operate data workflows with Python and SQL.

4K 481 66
open-metadata
openmetadata-airflow-managed-apis

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

3K 14K 2K
manasdutta04
openblame

Local-first AI investigation CLI for OpenMetadata data pipelines.

3K 2 0
grai-io
grai-client

No description available

2K 314 20
data-drift
driftdb

Historical metric store

2K 331 12
grai-io
grai-schemas

No description available

2K 314 20
vmware
vdk-jupyterlab-extension

One framework to develop, deploy and operate data workflows with Python and SQL.

2K 481 66
grai-io
grai-source-dbt

No description available

2K 314 20
PranavMotarwar
raglineage

Lineage-aware RAG engine for auditable, reproducible, versioned retrieval and answers

2K 2 0
kishanraj41
autolineage

Automatic ML data lineage tracking — zero manual logging

1K 3 0
grai-io
grai-source-bigquery

No description available

1K 314 20
grai-io
grai-source-snowflake

No description available

1K 314 20
vmware
vdk-control-cli

VDK Control CLI allows users to create, delete, and manage their Data Jobs in a Kubernetes runtime.

1K 481 66
grai-io
grai-source-postgres

No description available

1K 314 20
data-drift
datagit

Metrics Observability & Troubleshooting

1K 331 12
Data from PyPI, GitHub, ClickHouse, and BigQuery