PyPI Stats

Data Warehouse Python Packages

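Python packages tagged with the GitHub topic data-warehouse, sorted by relevance. Each entry gives the GitHub owner, the package name, the repository description, and a line of figures: monthly downloads first, then GitHub stars, then a third repository-level count.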
dlt-hub
dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

7.1M 5K 498
PostHog
hogql-parser

🦔 PostHog is an all-in-one developer platform for building successful products. We offer product analytics, web analytics, session replay, error tracking, feature flags, experimentation, surveys, data warehouse, a CDP, and an AI product assistant to help debug your code, ship features faster, and keep all your usage and customer data in one stack.

1.3M 34K 3K
elementary-data
elementary-data

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

1.2M 2K 214
vmware
quickstart-vdk

One framework to develop, deploy and operate data workflows with Python and SQL.

36K 481 66
sdebruyn
dbt-fabric-samdebruyn

Maintained and extended fork combining dbt-fabric and dbt-fabricspark

8K 9 2
crate
dlt-cratedb

dlt destination adapter for CrateDB

6K 0 0
drt-hub
drt-core

Reverse ETL for the code-first data stack

6K 20 31
Canner
wren-core-py

The open context engine for AI agents, supporting 15+ data sources. Built on Rust and Apache DataFusion.

4K 661 197
vmware
vdk-core

One framework to develop, deploy and operate data workflows with Python and SQL.

4K 481 66
iiasa
ixmp

The ix modeling platform for integrated and cross-cutting scenario analysis

4K 39 115
Canner
wren-engine

The open context engine for AI agents, supporting 15+ data sources. Built on Rust and Apache DataFusion.

3K 661 197
GClunies
reflekt

Define, govern, and model event data for warehouse-first product analytics.

3K 86 4
dlt-hub
dlt-core

dlt is an open-source, Python-first, scalable data loading library that does not require any backend to run.

3K 5K 498
vmware
vdk-jupyterlab-extension

One framework to develop, deploy and operate data workflows with Python and SQL.

2K 481 66
unytics
airbyte-serverless

Airbyte made simple (no UI, no database, no cluster)

2K 196 17
Titan-Systems
titan-core

Titan Core: Snowflake infrastructure as code

2K 480 39
unytics
bigfunctions

Supercharge BigQuery with BigFunctions

2K 758 70
firebolt-db
dbt-firebolt

The dbt adapter for Firebolt

2K 30 11
beneath-hq
beneath

Beneath is a serverless real-time data platform ⚡️

1K 84 10
vmware
vdk-control-cli

VDK Control CLI allows users to create, delete, and manage their Data Jobs in a Kubernetes runtime.

1K 481 66
ottogroup
koality

Library for data checks and data quality monitoring based on duckdb.

1K 4 1
vmware
vdk-impala

One framework to develop, deploy and operate data workflows with Python and SQL.

943 481 66
vmware
vdk-trino

Versatile Data Kit SDK plugin that provides support for the Trino database and Trino transformation templates.

888 481 66
elementary-data
elementary-lineage

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

782 2K 214
    • Data from PyPI, GitHub, ClickHouse, and BigQuery