100 dependents
Package Description Downloads/month
Apache Airflow - A platform to programmatically author, schedule, and monitor wo... 14.6M
An open-source, code-first Python toolkit for building, evaluating, and deployin... 7M
This library has moved to https://github.com/googleapis/google-cloud-python/tree... 982K
PyAirbyte 388K
ingestr is a CLI tool to copy data between any databases with a single command s... 69K
Dask + BigQuery integration 38K
YData allows to use the *Data-Centric* tools from the YData ecosystem to acceler... 29K
Codes used for Amazon seller report cleaning and API pulls. 23K
A package to enable easy data validation 17K
⚙️ Maintenance code for the datalake (metadata and access packages) | 📖 Docs: ht... 17K
This package holds the Bigquery plugins for flytekit 11K
Python projects for Carol 10K
"Document AI repo for data science" 10K
CLI tool for the Zipline AI platform 8K
Library from the CSC-CIA team used in the development of RPAs 7K
sktaiflow skt SKT package 6K
An extension library to write to and read from BigQuery tables as PyArrow tables... 4K
Utilities to simplify connection with Google APIs 4K
A library for Mozilla experiments analysis 3K
BigQuery plugin for flyte 3K
Define, govern, and model event data for warehouse-first product analytics. 3K
Modern Data Centric AI system for Large Language Models 3K
A minimalist alternative to dbt 3K
Data Science Common Core 2K
Library that facilitates the creation of semantic layer metadata graph in Neo4j. 2K
CLI tool for the Zipline AI platform 2K
TDD engine for Analytics Engineering — generate unit test data for SQL queries 2K
2K
Scan databases and data warehouses for PII data. Tag tables and columns in data ... 2K
brought to you by the dreamslabs discord community 2K
Test with compare 2K
Essential Python toolkit for Deepnote environments 2K
Solution for DS Team 1K
A DataFrame API for Google BigQuery 1K
Common utilities shared with the community by Datwave AI Team 1K
A Harlequin adapter for Google BigQuery. 1K
Redis cache which is optimized to sit in front of bigquery 1K
Easy Data Preparation with latest LLMs-based Operators and Pipelines. 1K
BuildFlow is an open source framework for building large scale systems using Py... 1K
Machine Learning for Health python package 1K
LLMstudio Tracker is the module of LLMstudio that allows monitoring and logging ... 1K
zsvoboda dbd dbd is a database prototyping tool that enables data analysts and engineers to q... 1K
Package for working with different DBs through a simple interface 1K
A collection of functions for use in the statistics and machine learning pipelin... 984
908
Semi-supervised Pseudo Labeler Anomaly Detection with Ensembling (SPADE) is a se... 890
A tool to compare data from different sources. 884
zbq A lightweight wrapper around Google Cloud BigQuery with Polars integration. Sim... 813
Facades and common functions necessary for data science and data engineering wor... 706
Toolkit for interacting with Google BigQuery and CMAP datasets 655