33 dependents
Package Description Downloads/month
pytest plugin to run the tests with support of pyspark 392K
HandySpark - bringing pandas-like capabilities to Spark dataframes 12K
A cluster computing framework for processing large-scale geospatial data 6K
:truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask... 6K
Base PySpark application for running Merlin prediction batch job 5K
Program Agnostic Data Ecosystem (PADE) - Python Services 4K
DQ Solution to answer all DQ needs. 3K
Shared Python functions for OLX Business Intelligence Team 3K
Infrastructure for AI applications and machine learning pipelines 1K
Spark Structured Data Sampling and Data Evaluation Components 929
A sample project that exists for PyPUG's "Tutorial on Packaging and Distributing... 731
A lightweight package for running poetry projects on emr. 726
Infrastructure configuration using docker compose 643
633
Sparglim✨ makes PySpark App Configurable and Deploy Spark Connect Server Easier! 618
nose2 plugin to run the tests with support of pyspark. 531
Seamlessly run Pyspark code on remote clusters. 480
A command line tool to allow the testing of datasets 477
DQ Solution to answer all DQ needs. 454
A placeholder description for your python project 324
Easy Cross-platform Installation and Configuration of Apps. 313
Marvin AI has been accepted into the Apache Foundation and is now available at h... 291
Anomaly detection bridge 287
A unified end-to-end machine intelligence platform 179
Program Agnostic Data Ecosystem (PADE) - Python Services 176
Create and configure a Spark session with optimized settings 156
128
DQ Solution to answer all DQ needs. 122
Marvin Python Common Library 91
A tool that makes building ML pipelines easier for non-technical users. 66
spark_quality_rules_tools 42
spark_dataframe_tools 7
1