PyPI Stats

Data Engineering Pipeline Python Packages

Python packages with the GitHub topic data-engineering-pipeline. Sorted by relevance, with stars and monthly downloads.
vmware
quickstart-vdk

One framework to develop, deploy and operate data workflows with Python and SQL.

36K 481 66
vmware
vdk-core

One framework to develop, deploy and operate data workflows with Python and SQL.

4K 481 66
edrewitz
wxdata

A Python package of end-to-end weather data clients & raw data clients with VPN/PROXY support, data processors that decode variable keys from GRIB format into a plain-language format & various tools for assisting Python automated workflows, querying meteorological datasets and filling gaps in meteorological data.

2K 23 1
vmware
vdk-jupyterlab-extension

One framework to develop, deploy and operate data workflows with Python and SQL.

2K 481 66
vmware
vdk-control-cli

VDK Control CLI allows users to create, delete, and manage their Data Jobs in a Kubernetes runtime.

1K 481 66
anki-code
xontrib-pipeliner

Let your pipe lines flow thru the Python code in xonsh.

1K 62 4
ketgo
marshmallow-pyspark

PySpark data serializer

993 12 4
TJhon
perustats

Tools to download and process public datasets from Peru (INEI & BCRP).

971 5 1
vmware
vdk-impala

One framework to develop, deploy and operate data workflows with Python and SQL.

943 481 66
vmware
vdk-trino

Versatile Data Kit SDK plugin that provides support for the Trino database and Trino transformation templates.

888 481 66
vmware
vdk-heartbeat

One framework to develop, deploy and operate data workflows with Python and SQL.

652 481 66
vmware
vdk-server

Versatile Data Kit SDK plugin that facilitates the installation of a local Control Service.

642 481 66
vmware
vdk-test-utils

One framework to develop, deploy and operate data workflows with Python and SQL.

620 481 66
vmware
vdk-plugin-control-cli

Versatile Data Kit SDK plugin exposing CLI commands for managing the lifecycle of Data Jobs.

579 481 66
pyprogrammerblog
tiny-blocks

Tiny Block Operations for Data Pipelines

539 3 0
vmware
airflow-provider-vdk

Airflow provider for Versatile Data Kit.

523 481 66
vmware
vdk-ingest-http

Versatile Data Kit SDK ingestion plugin to ingest data via http requests.

504 481 66
vmware
vdk-lineage-model

One framework to develop, deploy and operate data workflows with Python and SQL.

466 481 66
vmware
vdk-kerberos-auth

One framework to develop, deploy and operate data workflows with Python and SQL.

454 481 66
vmware
vdk-control-api-auth

Versatile Data Kit plugin library provides support for authentication.

450 481 66
vmware
vdk-postgres

Versatile Data Kit SDK plugin that provides support for the PostgreSQL database and Postgres transformation templates.

410 481 66
iTrauco
pybro-cli

yo, it's ya boy, pybro! 😎 | a personal collection of python hacks for 24.04 debian

373 0 0
vmware
vdk-csv

Versatile Data Kit SDK CSV plugin to ingest, export, or manipulate csv files.

349 481 66
vmware
vdk-oracle

Support for VDK Managed Oracle connection

347 481 66
    • Data from PyPI, GitHub, ClickHouse, and BigQuery