PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
yu9824
kennard-stone

An algorithm for evenly partitioning data.

2K 12 1
burning-cost
insurance-cv

Temporal and distributional cross-validation, and feature screening, for insurance pricing models

539 0 0
graph-part
graph-part

Graph-based partitioning of biological sequence data

511 35 6
ODAncona
bboxconverter

This library allows reading and converting bounding box annotations in many popular formats

448 27 1
marmurar
jano

Temporal partitioning and backtesting utilities for time-correlated datasets.

396 2 1
michaelscutari
protclust

protclust is a Python library for protein sequence analysis that integrates MMseqs2 for fast clustering and provides tools for creating robust machine learning datasets. It offers cluster-aware data splitting to prevent sequence similarity bias in model evaluation, along with comprehensive protein embedding capabilities for feature generation.

372 4 0
maksymsur
spltr

A simple PyTorch-based data loader and splitter

155 1 0
bharatadk
python-splitter

📁 Repo for the python_splitter Python package. It automatically splits images into train, test, and validation folders, shuffling them first, for machine learning.

142 12 4
emilelampe
maestros

A package for splitting multilabel datasets into train and test sets, while preserving the distribution of labels and keeping samples from the same group together. Includes a report and chart for visualizing the stratification.

141 1 0
michaelscutari
mmseqspy

Python utilities for protein sequence clustering and dataset splitting with MMseqs2

75 4 0
ODAncona
bboxtools-2

This library allows reading and converting bounding box annotations in many popular formats

57 27 1
    • Data from PyPI, GitHub, ClickHouse, and BigQuery