68 dependents
Package Description Downloads/month
Apache Airflow - A platform to programmatically author, schedule, and monitor wo... 2M
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get st... 519K
Azure plugin for dvc 106K
The open source research environment for AI researchers to seamlessly train, eva... 54K
GQLAlchemy is a library developed with the purpose of assisting in writing and r... 52K
Data Memory: the operational data context layer for AI agents - typed, versioned... 46K
YData allows to use the *Data-Centric* tools from the YData ecosystem to acceler... 29K
Go ahead and axolotl questions 20K
Generic functions 7K
Runtime support library for Chalk AI 6K
CDC Data Hub Lifecycle, Analysis and Visualization Accelerator (LAVA) makes buil... 5K
4K
various utils, functions, patches that I like 3K
A simple wrapper process around cloud service providers to run tools for the RAP... 3K
Climate downscaling using CMIP6 data 3K
World's most powerful open data catalog for building a high-performance, geo-dis... 2K
DVCx 2K
DQL 2K
An MCP Server for querying and downloading satellite imagery from the Planetary C... 2K
A library for working with Flywheel datasets 2K
🧑🏽‍🚒 Post-Disaster Land Cover Classification. 2K
ETL tool for data processing 1K
MARS-S2L Plume detection, segmentation and quantification of methane plumes in S... 1K
Kedro plugin with Azure ML Pipelines support 1K
Shared runtime utilities for dlt pipelines in VD Studio — AKV secrets, OneLake d... 1K
Supercharged fsspec filesystem classes with higher-level functions and utilitari... 1K
Python library for interacting with Spark, Azure, Minio, and other data sources. 1K
Retrieve National Water Model data from various sources. 1K
Cledar Python SDK 1K
Data Scientist platform 919
Utilities for interacting more easily with Azure Data Lake Gen2. 882
NVIDIA's package for core modules common across TAO Toolkit DNNs. 739
An MCP Server for STAC requests 729
OpenSSA: Small Specialist Agents based on Domain-Aware Neurosymbolic Agent (DANA... 646
A Fabric extension for managing data lakes 602
Handy tools for data access 588
A library used to fetch data from deltalake tables locally. 587
Open-source PySpark toolkit with data sources for REST APIs and SPARQL endpoints... 579
Production-oriented Python library for Microsoft Fabric OneLake Lakehouse access 532
siem query utils nbdev edition 491
Production-oriented Python library for Microsoft Fabric OneLake Lakehouse access 458
A multi-format and multi-storage xarray engine with automatic engine detection, ... 405
Handy tools for datalake access 380
Retrieve National Water Model data from various sources. 361
Performant iterators for loading files from S3, GCS and Azure into memory for ea... 352
This is a kit that provides the ability to read and write trajectory data in the... 279
254
Overture Maps Downloader simplifies geospatial data manipulation 251
Flexible and scalable framework for data input and output operations in Spark ap... 243
Python SIEM Query Utils nbdev edition 202