PyPI Stats

ETL Framework Python Packages

Python packages with the GitHub topic etl-framework. Sorted by relevance, with stars and monthly downloads.
apache
sf-hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

172K 2K 187
apache
apache-hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

16K 2K 187
dagworks-inc
sf-hamilton-sdk

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

15K 2K 186
pathwaycom
pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

15K 63K 2K
legout
flowerpower

Simple Workflow Framework based on Hamilton

13K 24 1
dagworks-inc
sf-hamilton-ui

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

11K 2K 186
dotflow-io
dotflow

🎲 Dotflow turns an idea into flow! — Lightweight Python library for execution pipelines

11K 5 8
dagworks-inc
sf-hamilton-lsp

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

10K 2K 186
Mmodarre
lakehouse-plumber

A metadata-driven framework for Databricks Lakeflow Declarative Pipelines (formerly Delta Live Tables) that generates production-ready PySpark code.

5K 56 11
amsdal
amsdal-glue-connections

A Python library for querying multiple databases simultaneously through a unified interface, enabling data virtualization without moving data.

3K 4 0
quintoandar
butterfree

A tool for building feature stores.

3K 318 38
amsdal
amsdal-glue-core

A Python library for querying multiple databases simultaneously through a unified interface, enabling data virtualization without moving data.

2K 4 0
socialpoint-labs
sqlbucket

Lightweight library to write, orchestrate and test your SQL ETL. Writing ETL with data integrity in mind.

2K 74 9
amsdal
amsdal-glue

A Python library for querying multiple databases simultaneously through a unified interface, enabling data virtualization without moving data.

2K 4 0
dataforgelabs
dataforge-core

DataForge helps data teams write functional transformation pipelines by leveraging software engineering principles

1K 59 2
usc-isi-i2
kgtk

Knowledge Graph Toolkit

1K 418 62
RLado
canonada

Canonada is a data science framework that helps you build production-ready streaming pipelines for data processing in Python

1K 1 2
crate
cratedb-fivetran-destination

CrateDB Fivetran Destination

894 0 0
cpzt
parade-manage

A management module for parade

817 0 0
amsdal
amsdal-glue-sql-parser

AMSDAL Glue is a Python interface providing high-level abstraction for interacting with multiple databases simultaneously, simplifying the development and maintenance process.

624 4 0
ContextData
vector-etl

Lightweight ETL Framework for Vector Databases

620 108 19
geopython
stetl

Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.

542 88 33
pyprogrammerblog
tiny-blocks

Tiny Block Operations for Data Pipelines

539 3 0
dagworks-inc
sf-hamilton-contrib

Hamilton's user contributed shared dataflow library.

481 2K 186
Data from PyPI, GitHub, ClickHouse, and BigQuery