PyPI Stats

HDFS Python Packages

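Python packages tagged with the GitHub topic hdfs, sorted by relevance. Each entry gives the repository owner, the package name, a short description, and a line of figures that includes monthly downloads and stars.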
piskvorky
smart-open

Utils for streaming large files (S3, HDFS, gzip, bz2...)

71.2M 3K 385
TileDB-Inc
tiledb

Python interface to the TileDB storage engine

86K 202 38
jcrist
skein

A tool and library for easily deploying applications on Apache YARN

56K 146 39
iterative
dvc-hdfs

HDFS/WebHDFS plugin for dvc

33K 2 1
megvii-research
megfile

Megvii FILE Library - Working with Files in Python same as the standard library

31K 174 20
wradlib
wradlib

weather radar data processing - python package

14K 308 88
jingw
pyhdfs

Python HDFS client

6K 97 23
spotify
snakebite

A pure python HDFS client

5K 859 213
criteo
cluster-pack

A library on top of either pex or conda-pack to make your Python code easily available on a cluster

1K 46 23
tks18
pyquery-polars

PyQuery is a local-first data operating system built on lazy execution that processes 100GB+ files while you doomscroll. No cap. 🧢

1K 1 0
BROADSoftware
hadeploy

A Hadoop Application Deployment tool

877 9 4
fasouto
webhdfspy

Python wrapper to access Hadoop HDFS REST API

713 8 5
IBMStreams
streamsx-hdfs

HDFS integration for IBM Streams

614 9 20
canimus
alphareader

A reader for large files with custom delimiters and encodings

483 6 1
ab2dridi
lakekeeper

A configurable PySpark package to identify fragmented external tables and perform safe in-place compaction

433 0 0
tks18
pyquery-core

PyQuery is a local-first data operating system built on lazy execution that processes 100GB+ files while you doomscroll. No cap. 🧢

119 1 0
yassineazzouz
pydistcp

pydistcp: python WebHDFS inter/intra-cluster data copy tool.

105 9 3
qiyangduan
schemaindex

SchemaIndex is designed for data scientists to index and search metadata more efficiently.

98 3 1
ceph
test-cephadm

Ceph is a distributed object, block, and file storage platform

95 17K 6K
silkway-ai
dfspy

Distributed File System written in Python

92 14 0
yassineazzouz
kraken-pyds

Kraken - A distributed data transfer tool.

81 2 1
marco-gallegos
sqoopit

A simple package to let you Sqoop into HDFS/Hive/HBase with python

63 0 0
piskvorky
srcd-smart-open

Utils for streaming large files (S3, HDFS, gzip, bz2...) - temporary source{d} fork

59 3K 385
yassineazzouz
tanit

Kraken - A distributed data transfer tool.

44 2 1
Data from PyPI, GitHub, ClickHouse, and BigQuery