518 dependents
Package Description Downloads/month
Python library to interact with Amazon SageMaker Unified Studio 11.4M
dlt-hub dlt data load tool (dlt) is an open source Python library that makes data loading ea... 7.5M
Lightweight SQL execution wrapper only on top of Databricks SDK 5.2M
A modern, enterprise-ready business intelligence web application 3.8M
Soda core library & CLI 3.6M
An orchestration platform for the development, production, and observation of da... 2.1M
the portable Python dataframe library 1.9M
Accelerates migrations to Databricks by automating key migration activities 1.5M
Collate SQL Lineage for Analysis Tool powered by Python and sqlfluff based on sq... 1.1M
Enforce Data Contracts 1M
Fast, accurate and scalable probabilistic data linkage with support for multiple... 718K
This project helps us to run Data Quality Rules in flight while spark job is bei... 523K
Scalable and efficient data transformation framework - backwards compatible with... 508K
💻 A fully functional local AWS cloud stack. Develop and test your cloud & Server... 504K
PyGWalker: Turn your dataframe into an interactive UI for visual analysis 339K
MetricFlow allows you to define, build, and maintain metrics in code. 333K
dlthub is a commercial extension to dlt 324K
MCP Server for Snowflake including Cortex AI, object management, SQL orchestrati... 234K
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes dat... 180K
Kolo is a text-based Python debugger for AI agents. Capture every executed funct... 162K
Python-based SDK designed for interacting with VAST Database & VAST Catalog, ena... 148K
A python implementation of the mysql server protocol 137K
Better SQL in Jupyter. 📊 134K
A Terminal Client for MySQL with AutoCompletion and Syntax Highlighting. 128K
A Python SDK for OceanBase Multimodal Store—enabling vector search, full-text se... 122K
Datapilot-cli is the command line interface for accessing the AI teammate for eng... 116K
Run, mock and test fake Snowflake databases locally. 111K
The Privacy Engineering & Compliance Framework 90K
Parrot is a new library to work with chatbots and agents with a very simple API 86K
84K
🧙 Build, run, and manage data pipelines for integrating and transforming data. 84K
A column lineage parser and dashboarding tool 74K
Turning PySpark Into a Universal DataFrame API 72K
ingestr is a CLI tool to copy data between any databases with a single command s... 69K
Snowpark Connect for Spark 59K
The data-validation toolkit for enhanced dbt (data build tool) PR review 48K
The open source metrics layer 47K
The data-validation toolkit for enhanced dbt (data build tool) PR review 42K
Chronon python API library 38K
35K
RelationalAI Library and CLI 31K
A Validation and Transformation Language engine (VTL), written in Python. Compat... 29K
Python backend for Turntable 29K
A Query Mapper for Python 27K
dbt plugin that skips redundant model executions by caching results from previou... 27K
27K
Illuminate your data. 26K
PostgreSQL schema evolution with built-in multi-agent coordination 🍓 25K
A framework for multisite longitudinal clinical trials built on Django 24K
Bespoke SQL linter and formatter for DuckDB, powered by sqlglot 23K