PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
dlt-hub
dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

7.3M 5K 498
treeverse
lakefs-sdk

lakeFS - Data version control for your data lake | Git for data

1.1M 5K 446
treeverse
lakefs

lakeFS - Data version control for your data lake | Git for data

926K 5K 446
treeverse
lakefs-client

lakeFS - Data version control for your data lake | Git for data

204K 5K 446
MatsMoll
aligned

The DBT of ML: Aligned describes data dependencies in ML systems and reduces technical data debt

9K 61 2
crate
dlt-cratedb

dlt destination adapter for CrateDB

6K 0 0
nodestream-proj
nodestream

A Declarative framework for Building, Maintaining, and Analyzing Graph Data

4K 61 17
Canner
wren-core-py

The open context engine for AI agents, supporting 15+ data sources. Built on Rust and Apache DataFusion.

4K 661 197
Canner
wren-engine

The open context engine for AI agents, supporting 15+ data sources. Built on Rust and Apache DataFusion.

4K 661 197
dlt-hub
dlt-core

dlt is an open-source, Python-first, scalable data loading library that does not require any backend to run.

3K 5K 498
mag1cfrog
timeseries-table-format

Append-only time-series table format with gap/overlap tracking (Python bindings).

1K 12 1
nodestream-proj
nodestream-plugin-dotenv

A plugin to nodestream for loading environment variables from a .env file

663 2 0
arpe-io
lakexpress-mcp

A Model Context Protocol (MCP) server for LakeXpress, enabling database to Parquet export with sync management and data lake publishing.

396 0 0
nodestream-proj
nodestream-plugin-meta

A plugin to nodestream for building a graph of the schema of the graph.

366 0 0
nodestream-proj
nodestream-plugin-semantic

A plugin for embedding semantic data into a nodestream project

359 0 0
utndatasystems
virtual-parquet

🗜️ Compressing Parquet files using functions (TRL @NeurIPS'24, EDBT Best Demo'25)

290 0 1
nodestream-proj
nodestream-plugin-pedantic

A nodestream plugin that provides a series of audits to ensure high quality and consistent nodestream projects.

281 2 0
treeverse
lakefs-sdk-async

lakeFS - Data version control for your data lake | Git for data

207 5K 446
realdatadriven
etlx-wrapper

Python wrapper for ETLX CLI to run ETL workflows from Python

167 40 3
SRRC-1334
ztract

Extract mainframe EBCDIC data using COBOL copybooks. Zero MIPS. Pure Python + Cobrix engine.

153 0 0
Canner
vulcan-sql

Data API Framework for AI Agents and Data Apps

148 794 42
dlt-hub
dlt-dataops

dlt is an open-source, Python-first, scalable data loading library that does not require any backend to run.

94 5K 498
    • Data from PyPI, GitHub, ClickHouse, and BigQuery