We use BigQuery to serve our public datasets. PyPI offers two tables whose data is sourced from projects on PyPI. The tables and their data are licensed under the Creative Commons License.
Download Statistics Table
The download statistics table lets you learn more about the download patterns of packages hosted on PyPI. This table is populated through the Linehaul project, which streams download logs from PyPI to BigQuery. For more information on analyzing PyPI package downloads, see the Python Package Guide.
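As a rough sketch of what a query against this table might look like, the Python helper below only builds a SQL string; it does not contact BigQuery. The table name `bigquery-public-data.pypi.file_downloads` and the columns `file.project` and `timestamp` are assumptions that are not stated on this page, and the helper name is hypothetical, so check the current schema before running the query.

```python
import re

# Assumed table name; it is not stated in the text above.
DOWNLOADS_TABLE = "bigquery-public-data.pypi.file_downloads"


def build_download_count_query(project: str, days: int = 30) -> str:
    """Return a SQL string counting downloads of `project` over the last `days` days."""
    # Accept only characters that can appear in a package name, so the
    # value can be embedded in the SQL text without escaping problems.
    if not re.fullmatch(r"[A-Za-z0-9._-]+", project):
        raise ValueError(f"unexpected characters in project name: {project!r}")
    if not isinstance(days, int) or days < 1:
        raise ValueError("days must be a positive integer")

    # `file.project` and `timestamp` are assumed column names.
    return (
        "SELECT COUNT(*) AS num_downloads\n"
        f"FROM `{DOWNLOADS_TABLE}`\n"
        f"WHERE file.project = '{project}'\n"
        "  AND DATE(timestamp) BETWEEN "
        f"DATE_SUB(CURRENT_DATE(), INTERVAL {days} DAY) AND CURRENT_DATE()"
    )


if __name__ == "__main__":
    print(build_download_count_query("pytest", days=7))
```

The resulting string can be pasted into the BigQuery console or passed to a BigQuery client library. Bounding the date range is a deliberate choice here: the table holds a log row per download, so an unbounded query may scan far more data than needed.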
Project Metadata Table
We also have a table that provides access to distribution metadata
as outlined by the core metadata specifications.
The table is meant to be a data dump of metadata from every
release on PyPI, which means that the rows in this BigQuery table
are immutable and are not removed even if a release or project is deleted.
This data is accessible under the
bigquery-public-data.pypi.distribution_metadata public dataset on BigQuery.
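
A similar sketch for the metadata table is shown below. It builds a SQL string that lists the most recently uploaded releases of one project. The dataset path comes from the text above, but the column names `name`, `version`, and `upload_time` are assumptions, and the helper name is hypothetical. Because rows are never removed, the result may include releases that have since been deleted from PyPI.

```python
import re

# Dataset path as given in the text above.
METADATA_TABLE = "bigquery-public-data.pypi.distribution_metadata"


def build_release_metadata_query(project: str, limit: int = 10) -> str:
    """Return a SQL string listing recent release rows for `project`."""
    # Accept only characters that can appear in a package name, so the
    # value can be embedded in the SQL text without escaping problems.
    if not re.fullmatch(r"[A-Za-z0-9._-]+", project):
        raise ValueError(f"unexpected characters in project name: {project!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")

    # `name`, `version`, and `upload_time` are assumed column names.
    return (
        "SELECT name, version, upload_time\n"
        f"FROM `{METADATA_TABLE}`\n"
        f"WHERE name = '{project}'\n"
        "ORDER BY upload_time DESC\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_release_metadata_query("requests", limit=5))
```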