Warehouse codebase
Warehouse uses the Pyramid web framework, the SQLAlchemy ORM, and Postgres for its database. Warehouse’s front end uses Jinja2 templates.
In production, Warehouse is deployed using Cabotage, which manages Docker containers deployed via Kubernetes.
In the development environment, Docker Compose orchestrates several Docker containers, managing both the running containers and the connections between them.
Because Warehouse was built on top of an existing database (for legacy PyPI), developers had to fit the ORM to the existing tables, so some of the ORM code may not look like code from the SQLAlchemy documentation. In some places, joins use name-based logic instead of a foreign key (but this may change in the future).
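To make "name-based" joins concrete, here is a minimal sketch using only the standard library's sqlite3 module. The tables, columns, and the `releases_for` helper are hypothetical and are not Warehouse's real schema or ORM code; the point is only that the child table refers to its parent by name, with no foreign key constraint declared.

```python
import sqlite3

# In-memory database with hypothetical tables (not Warehouse's real schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (
        id INTEGER PRIMARY KEY,
        name TEXT UNIQUE NOT NULL
    );
    -- Legacy-style child table: it refers to its project by name and
    -- declares no FOREIGN KEY constraint.
    CREATE TABLE releases (
        id INTEGER PRIMARY KEY,
        project_name TEXT NOT NULL,
        version TEXT NOT NULL
    );
    INSERT INTO projects (name) VALUES ('alpha'), ('beta');
    INSERT INTO releases (project_name, version)
        VALUES ('alpha', '1.0'), ('alpha', '1.1'), ('beta', '0.1');
    """
)


def releases_for(conn, project_name):
    """Return release versions for a project, joining on the name column."""
    rows = conn.execute(
        """
        SELECT r.version
        FROM projects AS p
        JOIN releases AS r ON r.project_name = p.name
        WHERE p.name = ?
        ORDER BY r.version
        """,
        (project_name,),
    ).fetchall()
    return [version for (version,) in rows]
```

In an ORM, the same idea typically shows up as an explicitly written join condition on the name columns rather than one inferred from a foreign key.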
Warehouse also uses hybrid URL traversal and dispatch. Using factory classes, resources are provided directly to the views based on the URL pattern. This method of handling URLs may be unfamiliar to developers used to other web frameworks, such as Django or Flask. This article has a helpful discussion of the differences between URL dispatch and traversal in Pyramid.
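The sketch below imitates the hybrid idea in plain Python so it can be read without Pyramid installed. It is not Pyramid's API, and `ProjectFactory`, `Project`, `project_detail`, `ROUTES`, and `handle` are all hypothetical names: a URL pattern selects the route (dispatch), the route's factory resolves the matched segment to a resource (traversal), and the view receives that resource as its context.

```python
import re
from dataclasses import dataclass


@dataclass
class Project:
    name: str


class ProjectFactory:
    """Root resource for a route; item lookup plays the role of traversal."""

    # Hypothetical in-memory data standing in for a database query.
    _projects = {"alpha": Project("alpha")}

    def __getitem__(self, name):
        # A KeyError here means the resource does not exist.
        return self._projects[name]


def project_detail(context):
    """View: receives the already-resolved resource, not the raw URL."""
    return f"Project page for {context.name}"


# URL dispatch: each pattern is paired with a factory and a view.
ROUTES = [
    (re.compile(r"^/project/(?P<name>[^/]+)/$"), ProjectFactory, project_detail),
]


def handle(path):
    """Match the path to a route, resolve the resource, then call the view."""
    for pattern, factory, view in ROUTES:
        match = pattern.match(path)
        if match is None:
            continue
        try:
            context = factory()[match.group("name")]
        except KeyError:
            return "404 Not Found"
        return view(context)
    return "404 Not Found"
```

Because the factory does the lookup, the view never has to handle a missing project itself; an unknown name is turned into a "not found" response before the view runs.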
Usage assumptions and concepts
See PyPI help and the glossary section of UI principles to understand projects, releases, packages, maintainers, authors, and owners.
Warehouse is specifically the codebase for the official Python Package Index, and thus focuses on architecture and features for PyPI and Test PyPI. People and groups who want to run their own package indexes usually use other tools, like devpi.
Warehouse serves four main classes of users:
People who are not logged in. This accounts for the majority of browser traffic and all API download traffic.
Owners/maintainers of one or more projects. This accounts for almost all writes. A user must create and use a PyPI account to maintain or own a project, and there is no particular functionality available to a logged-in user other than to manage projects they own/maintain. As of March 2018, PyPI had about 270,000 users, and Test PyPI had about 30,000 users.
PyPI application moderators. These users have a subset of the permissions of PyPI application administrators, so that they can assist with routine administration tasks such as adding new trove classifiers and adjusting upload limits for distribution packages.
PyPI application administrators, e.g., Ee Durbin, Dustin Ingram, and Donald Stufft, who can ban spam/malware projects, help users with account recovery, and so on. There are fewer than ten such admins.
Since reads are much more common than writes (much more goes out than goes in), we try to cache as much as possible.
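As a generic illustration of why caching suits a read-heavy workload, here is a small read-through cache that is purged on write. This is only a sketch: `ReadThroughCache` and the `store` dictionary are hypothetical, and the example says nothing about which caching layers Warehouse actually uses.

```python
class ReadThroughCache:
    """Serve repeated reads from memory; load from the backend on a miss."""

    def __init__(self, load):
        self._load = load      # callable that fetches a value from the backend
        self._entries = {}
        self.misses = 0        # counts how often the backend was consulted

    def get(self, key):
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._load(key)
        return self._entries[key]

    def purge(self, key):
        """Drop a cached entry so the next read reloads it (call after a write)."""
        self._entries.pop(key, None)


# Hypothetical backend: project name -> latest version.
store = {"alpha": "1.0"}
cache = ReadThroughCache(store.__getitem__)

first = cache.get("alpha")    # miss: loaded from the backend
second = cache.get("alpha")   # hit: served from the cache
store["alpha"] = "1.1"        # a (rare) write changes the backend
cache.purge("alpha")          # invalidate the stale entry
third = cache.get("alpha")    # miss again: the new value is loaded
```

With many reads per write, most calls to `get` never reach the backend, and the cost of a write is one purge plus one reload.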
File and directory structure
The top-level directory of the Warehouse repo contains files including:
LICENSE
CONTRIBUTING.rst - the contribution guide
README.rst
requirements.txt - requirements for the Warehouse virtual environment
Dockerfile - creates the Docker containers that Warehouse runs in
docker-compose.yml - configures Docker Compose
setup.cfg - test configuration
Makefile - commands to spin up Docker Compose and the Docker containers, run the linter and other tests, etc.
files associated with Warehouse’s front end, e.g., webpack.config.js
Directories within the repository:
bin/ - high-level scripts for Docker, Continuous Integration, and Makefile commands
dev/ - assets for developer environment
tests/ - tests
warehouse/ - code in modules
    accounts/ - user accounts
    admin/ - application-administrator-specific
    banners/ - notification banners
    cache/ - caching
    classifiers/ - frame trove classifiers
    cli/ - entry scripts and the interactive shell
    email/ - services for sending emails
    i18n/ - internationalization
    integrations/ - integrations with other services
    legacy/ - most of the read-only APIs are implemented here
    locale/ - internationalization
    macaroons/ - API token support
    manage/ - logged-in user functionality (i.e., manage account & owned/maintained projects)
    metrics/ - services for recording metrics
    migrations/ - changes to the database schema
    oidc/ - Trusted Publishing support
    organizations/ - organization accounts
    packaging/ - core packaging models (projects, releases, files)
    rate_limiting/ - rate limiting to prevent abuse
    search/ - utilities for building and querying the search index
    sitemap/ - site maps
    sponsors/ - sponsors management
    static/ - static site assets
    templates/ - Jinja templates for web pages, emails, etc.
    utils/ - various utilities Warehouse uses
Historical context & deprecations
For the history of Python packaging and distribution, see the PyPA history page.
From the early 2000s till April 2018, the legacy PyPI codebase, not Warehouse, powered PyPI. Warehouse deliberately does not provide some features that users may be used to from the legacy site, such as:
“hidden releases”
uploading to pythonhosted.org documentation hosting (discussion and plans)
download counts visible in the API (instead, use the Google BigQuery service)
key management: PyPI no longer has a UI for users to manage GPG or SSH public keys
uploading new releases via the web UI: instead, maintainers should use the command-line tool Twine
updating release descriptions via the web UI: instead, to update release metadata, you need to upload a new release (discussion)
uploading a package without first verifying an email address
GPG/PGP signatures for packages (no longer visible in the web UI or index, but retrievable by appending an .asc if the signature exists)
OpenID and Google auth login are no longer supported.