290 dependents
| Description | Downloads/month |
|---|---|
| 🦦 weasel: A small and easy workflow system | 27.3M |
| simple, flexible, offline capable, cloud storage with a Python path-like interfa... | 4.1M |
| s3path is a pathlib extension for AWS S3 Service | 3.4M |
| Meltano: the declarative code-first data integration engine that powers your wil... | 1.2M |
| Command Line Interface for Anyscale | 1.1M |
| s3pathlib is the python package provides the Pythonic objective oriented program... | 985K |
| Creates fake JSON files from a JSON schema | 716K |
| Streaming (and fast!) parser for multipart/form-data written in Cython | 404K |
| Disaster recovery solution for Amazon Managed Workflows for Apache Airflow (MWAA... | 290K |
| This is the development home of the workflow management system Snakemake. For ge... | 254K |
| The Privacy Engineering & Compliance Framework | 90K |
| Schema, functions and a python library for storing and accessing STAC collection... | 78K |
| A streaming audio reader, processor, and writer built on top of soundfile, and P... | 66K |
| Astro SDK allows rapid and clean development of {Extract, Load, Transform} workf... | 56K |
| ccflow is a collection of tools for workflow configuration, orchestration, and d... | 53K |
| Bigeye SDK offers developer tools and clients to interact with Bigeye programmat... | 51K |
| A collection of Python-based 'connectors' that extract metadata from various sou... | 51K |
| Data and tools for generating and inspecting OLMo pre-training data. | 44K |
| Contains the ML and non-Azure specific common code associated with running A... | 42K |
| Toolkit for linearizing PDFs for LLM datasets/training | 40K |
| Contains ML models, featurizers and scoring code which can either be used with A... | 32K |
| Used for automatically finding the best machine learning model and its parameter... | 28K |
| flūmine - Betting trading framework | 26K |
| Framework for simpler Spark Pipelines | 24K |
| 3D molecular fingerprints | 21K |
| The leading data integration platform for ETL / ELT data pipelines from APIs, da... | 19K |
| Google Ads API Report Fetcher (gaarf) | 18K |
| Common API for all "second gen" AutoML APIs: Auger.AI, Google Cloud AutoML and A... | 17K |
| Command Line Interface (CLI) for bulk processing/loading data into RegScale | 13K |
| Some data analysis tools for working with historical PV solar time-series data s... | 12K |
| GeoNode is an open source platform that facilitates the creation, sharing, and c... | 11K |
| Tetrascience Python SDK | 10K |
| Handles reading queries and writing GarfReport from garf-core package | 10K |
| Synthetic Data SDK ✨ | 9K |
| Extracts adhoc queries from the Looker API to S3 | 8K |
| A multilingual phonemizer combining lexica, NLP, and probabilistic scoring for i... | 8K |
| Simple functions shared across fsai apps. | 8K |
| The leading data integration platform for ETL / ELT data pipelines from APIs, da... | 7K |
| CsvPath Framework is a data preboarding automation library for receiving, valida... | 6K |
| Pipleline for generating data used in text analytics notebooks. Used by Welfare ... | 6K |
| The leading data integration platform for ETL / ELT data pipelines from APIs, da... | 6K |
|  | 6K |
| Python library for working with Music Information Retrieval datasets | 6K |
| Utilities for analysis of adaptive immune receptor repertoire (AIRR) data | 5K |
| The leading data integration platform for ETL / ELT data pipelines from APIs, da... | 5K |
| LOCI static analysis service | 5K |
| Conforms pandas to "correct" datatypes to ensure data in/out using CSV, JSONL an... | 4K |
| Clodius is a tool for breaking up large data sets into smaller tiles that can su... | 4K |
| Genropy framework repository | 4K |
| ALCF Inference Gateway SDK | 4K |