88 dependents
Package Description Downloads/month
Python library to interact with Amazon SageMaker Unified Studio 11.4M
Tecton Python SDK 1.8M
API Client for the Materials Project 866K
A Delta Lake reader for Dask 37K
Convert STAC items between JSON, GeoParquet, pgstac, and Delta Lake. 26K
Shared transport-neutral runtime helpers for the Asset Allocation split repos. 17K
Python ETL framework for stream processing, real-time analytics, LLM pipelines, ... 16K
An orchestration platform for the development, production, and observation of da... 13K
Database and Delta storage client library for working with Delta Lake tables 9K
Cacheable big data pipelines 8K
DeltaTorch allows loading training data from DeltaLake tables for training Deep... 6K
Project combining flowfile core (backend) and flowfile_worker (compute offloader... 5K
A declarative data engineering framework for building transparent, traceable pip... 4K
Bioinformatics web-platform visualisation dashboard creation 4K
Helper library for Fabric Python using duckdb, arrow and delta_rs (orchestration... 4K
Modelzone SDK – a slim model training and serving toolkit 3K
3K
Deltalake IO Managers for Dagster with pyarrow and Polars support. 3K
Oversee your lakehouse 3K
A toolkit of operators, hooks and utilities for Apache Airflow 2 and 3 2K
Data lake operations toolkit 2K
Enterprise Data Trust — Chapter 4: Unified Drift Monitoring Application 2K
2K
Inspect Flow is a workflow stack built on Inspect AI that enables research organ... 2K
An MCP Server for querying and downloading satellite imagery from the Planetary C... 2K
Test with compare 2K
🧑🏽‍🚒 Post-Disaster Land Cover Classification. 2K
A powerful SQL shell with GUI interface for data analysis 1K
My package description 1K
Test with compare 1K
A Python ETL library for creating declarative data pipelines. 1K
NoSQL to Delta Lake ingestion with schema enforcement, type coercion, and a dead... 1K
db2ixf is a Python package with a CLI that simplifies the parsing and processing... 1K
Datazone Client Package 1K
Add your description here 1K
Library for building blockchain pipelines 975
Spark-free Python utilities for Microsoft Fabric focused on Data Engineering usi... 946
ECHO_modules is a Python package for analyzing US Environmental Protection Agenc... 911
908
Terminal UI and async Python client for browsing Microsoft Fabric workspaces and... 904
`target-s3-delta` is a Singer target for s3-delta, built with the Meltano Singer... 788
A collection of interop, core, and orchestration services for the bclearer frame... 787
A set of convenient, easy-to-use tools for deltalake tables. 746
Using DuckDB and Python as a high-speed ETL tool 720
Add your description here 710
Intuitive and powerful library for highly reproducible scientific data pipeline 699
Delta Lake table-store capability plugin for Phlo 626
A lightweight, Databricks-style autoloader using Polars and SQLite. 537
A scalable and flexible ETL framework for distributed data processing. 525
520