PyPI Stats

Data Pipeline Python Packages

Python packages with the GitHub topic data-pipeline. Sorted by relevance, with stars and monthly downloads.
olirice
flupy

Fluent data pipelines for python and your shell

1.3M 195 15
elementary-data
elementary-data

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

1.2M 2K 214
pydoit
doit

CLI task management & automation tool

770K 2K 190
airbytehq
airbyte-source-declarative-manifest

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

326K 21K 5K
pipeline-tools
gusty

Making DAG construction easier

73K 285 13
bruin-data
ingestr

ingestr is a CLI tool to copy data between any databases with a single command seamlessly.

68K 3K 120
airbytehq
airbyte-source-facebook-marketing

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

24K 21K 5K
InfuseAI
piperider-nightly

Code review for data in dbt

22K 494 23
airbytehq
airbyte-source-google-ads

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

22K 21K 5K
xorq-labs
xorq

Composable expressions for data

20K 507 28
airbytehq
airbyte-source-s3

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

20K 21K 5K
niamoto
niamoto

Biodiversity publishing tool for ecological data, from import to static portal generation.

16K 4 0
airbytehq
airbyte-source-salesforce

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

15K 21K 5K
airbytehq
airbyte-source-github

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

14K 21K 5K
rocky-data
dagster-rocky

The trust system for your data. Rust-based control plane for warehouse pipelines — branches, replay, column-level lineage, compile-time safety, per-model cost attribution. Keep Databricks or Snowflake. Bring Rocky for the DAG.

14K 198 4
airbytehq
airbyte-source-shopify

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

12K 21K 5K
airbytehq
airbyte-source-google-sheets

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

11K 21K 5K
airbytehq
airbyte-source-zendesk-support

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

11K 21K 5K
airbytehq
airbyte-source-google-drive

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

10K 21K 5K
airbytehq
airbyte-source-faker

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

9K 21K 5K
airbytehq
airbyte-source-bing-ads

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

8K 21K 5K
airbytehq
airbyte-source-marketo

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

8K 21K 5K
airbytehq
airbyte-source-gcs

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

8K 21K 5K
airbytehq
airbyte-source-google-analytics-data-api

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

8K 21K 5K
Data from PyPI, GitHub, ClickHouse, and BigQuery