PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
apache
sf-hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

169K 2K 187
souvik-databricks
dlt-with-debug

A lightweight helper utility that lets developers do interactive pipeline development with a single source code for both DLT runs and non-DLT interactive notebook runs.

19K 50 9
dagworks-inc
sf-hamilton-sdk

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

15K 2K 186
apache
apache-hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

13K 2K 187
dotflow-io
dotflow

🎲 Dotflow turns an idea into flow! — Lightweight Python library for execution pipelines

12K 5 8
dagworks-inc
sf-hamilton-lsp

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

10K 2K 186
dagworks-inc
sf-hamilton-ui

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

10K 2K 186
ebonnal
streamable

sync/async iterable streams for Python

9K 317 6
Zipstack
unstract-sdk

A framework for writing Unstract Tools/Apps

5K 23 1
arrowjet
arrowjet

The fastest way to move data in and out of a database.

5K 1 1
Breaka84
spooq

Spooq is a PySpark-based helper library for ETL data ingestion pipelines in data lakes.

4K 10 2
RADar-AZDelta
rabbit-in-a-blender

An ETL pipeline to transform your EMP data to OMOP.

4K 16 6
MTSWebServices
onetl

One ETL tool to rule them all

3K 87 7
dataplane-app
dataplane

The data engineering library to build robust, reliable, and on-time data pipelines in Python. Integrates with Dataplane Data Platform.

3K 3 3
tabsdata
tabsdata

Tabsdata is a publish-subscribe (pub/sub) server for tables.

2K 40 0
tabsdata
tabsdata-salesforce

Tabsdata plugin to access Salesforce data.

2K 40 0
tabsdata
tabsdata-mongodb

Tabsdata plugin to access MongoDB data.

2K 40 0
badal-io
gcp-airflow-foundations-dev

Opinionated framework based on Airflow 2.0 for building pipelines to ingest data into a BigQuery data warehouse

2K 15 2
tabsdata
tabsdata-databricks

Tabsdata plugin to access Databricks data.

2K 40 0
tabsdata
tabsdata-snowflake

Tabsdata plugin to access Snowflake data.

2K 40 0
tabsdata
tabsdata-mssql

Tabsdata plugin to access Microsoft SQL Server data.

2K 40 0
AppDevIQ
datascreeniq

Real-time data quality screening API — PASS / WARN / BLOCK in milliseconds

2K 1 0
tabsdata
tabsdata-bigquery

Tabsdata plugin to access BigQuery data.

1K 40 0
LostMa-ERC
heurist-api

No description available

1K 4 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery