PyPI Stats

Data Collection Python Packages

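Python packages whose GitHub repositories carry the topic data-collection, sorted by relevance. Each entry gives the repository owner, the package name, and the repository description, followed by a line of counts: monthly downloads, then GitHub stars, then a third figure that the page does not label.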
airbytehq
airbyte-source-declarative-manifest

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

326K 21K 5K
airbytehq
airbyte-source-facebook-marketing

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

24K 21K 5K
airbytehq
airbyte-source-google-ads

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

22K 21K 5K
airbytehq
airbyte-source-s3

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

20K 21K 5K
chapmanjacobd
xklb

99+ CLI tools to build, browse, and blend your media library

17K 477 14
airbytehq
airbyte-source-salesforce

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

15K 21K 5K
oxylabs
oxylabs-mcp

Official Oxylabs MCP integration

15K 94 24
airbytehq
airbyte-source-github

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

14K 21K 5K
airbytehq
airbyte-source-shopify

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

12K 21K 5K
airbytehq
airbyte-source-google-sheets

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

11K 21K 5K
airbytehq
airbyte-source-zendesk-support

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

11K 21K 5K
Altimis
scweet

Scrape tweets, profiles, followers and following from Twitter/X, no API key needed. Python library with smart multi-account pooling, proxy support and async.

10K 1K 267
airbytehq
airbyte-source-google-drive

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

10K 21K 5K
airbytehq
airbyte-source-faker

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

9K 21K 5K
airbytehq
airbyte-source-bing-ads

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

8K 21K 5K
airbytehq
airbyte-source-marketo

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

8K 21K 5K
airbytehq
airbyte-source-gcs

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

8K 21K 5K
chapmanjacobd
library

99+ CLI tools to build, browse, and blend your media library

8K 477 14
airbytehq
airbyte-source-google-analytics-data-api

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

8K 21K 5K
airbytehq
airbyte-source-hubspot

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

7K 21K 5K
airbytehq
airbyte-source-stripe

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

7K 21K 5K
airbytehq
airbyte-source-jira

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

7K 21K 5K
airbytehq
airbyte-source-google-search-console

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

7K 21K 5K
airbytehq
airbyte-source-iterable

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

6K 21K 5K
Data from PyPI, GitHub, ClickHouse, and BigQuery