Welcome to Warehouse’s documentation!

Contents:

Development

Warehouse, as an open source project, welcomes contributions of all forms. The sections below will help you get started with development, testing, and documentation.

Please contribute issues, submit bug reports, and file feature requests on our issue tracker on GitHub. If submitting a bug report for the first time, please check out what to put in your bug report for guidance.

Important

We take security very seriously. As such, security issues should be emailed to the maintainers instead of being submitted on the GitHub issue tracker. Please read the Security documentation for details.

Getting started

We’re pleased that you are interested in working on Warehouse.

Setting up a development environment to work on Warehouse should be a straightforward process. If you have any difficulty, please contact us so we can improve the process.

Quickstart for Developers with Docker experience

$ git clone git@github.com:pypa/warehouse.git
$ cd warehouse
$ make serve
$ make initdb

View Warehouse in the browser at http://localhost:80/.

Detailed Installation Instructions

Getting the warehouse source code

Clone the warehouse repository from GitHub:

$ git clone git@github.com:pypa/warehouse.git

Configure the development environment

Why Docker?

Docker simplifies development environment setup.

Warehouse uses Docker and Docker Compose to automate setting up a “batteries included” development environment. The Dockerfile and docker-compose.yml files contain all the steps needed to install and configure the external services that the development environment requires.

Installing Docker

Verifying Docker Installation

Check that Docker is installed: docker -v

Install Docker Compose

Install Docker Compose using the Docker provided installation instructions.

Note

Docker Compose will be installed by Docker for Mac and Docker for Windows automatically.

Verifying Docker Compose Installation

Check that Docker Compose is installed: docker-compose -v

Building the Warehouse Container

Once you have Docker and Docker Compose installed, run:

$ make build

This will pull down all of the required Docker containers, build Warehouse, and run all of the needed services. The Warehouse repository will be mounted inside the Docker container at /opt/warehouse/src/.

Running the Warehouse Container and Services

After building the Docker container, you’ll need to create a Postgres database and run all of the data migrations.

First start the Docker services that make up the Warehouse application. In one terminal run the command:

$ make serve

Next, you will:

  • create a new Postgres database,
  • install example data to the Postgres database,
  • run migrations, and
  • load some example data from Test PyPI

In a second terminal, separate from the make serve command above, run:

$ make initdb

If you get an error about xz, you may need to install the xz utility. This is highly likely on Mac OS X and Windows.

Note

reCaptcha is featured in authentication and registration pages. To enable it, pass RECAPTCHA_SITE_KEY and RECAPTCHA_SECRET_KEY through to serve and debug targets.

Viewing Warehouse in a browser

The web container listens on port 80. It is accessible at http://localhost:80/.

Note

If you are using docker-machine on an older version of Mac OS or Windows, the Warehouse application might be accessible at https://<docker-ip>:80/ instead. You can get information about the Docker container with docker-machine env.

What did we just do and what is happening behind the scenes?

The repository is exposed inside of the web container at /opt/warehouse/src/ and Warehouse will automatically reload when it detects any changes made to the code.

The example data located in dev/example.sql.xz is taken from Test PyPI and has been sanitized to remove anything private. The password for every account has been set to the string password.

Troubleshooting

Errors when executing make serve

  • If the Dockerfile is edited or new dependencies are added (either by you or a prior pull request), a new container will need to be built. A new container can be built by running make build. This should be done before running make serve again.
  • If make serve hangs after a new build, you should stop any running containers and repeat make serve.
  • To run Warehouse behind a proxy set the appropriate proxy settings in the Dockerfile.

“no space left on device” when using docker-compose

docker-compose may leave orphaned volumes during teardown. If you run into the message “no space left on device”, try running the following command (assuming Docker >= 1.9):

docker volume rm $(docker volume ls -qf dangling=true)

Note

This will delete orphaned volumes as well as directories that are not volumes in /var/lib/docker/volumes.

(Solution found and further details available at https://github.com/chadoe/docker-cleanup-volumes)

Building Styles

Styles are written in the SCSS variant of Sass and compiled using Gulp. While make serve is running, they are rebuilt automatically whenever they change.

Running the Interactive Shell

Warehouse provides an interactive shell that automatically configures Warehouse, creates a database session, and makes both available as variables in the shell.

To run the interactive shell, simply run:

$ make shell

The interactive shell will have the following variables defined in it:

config The Pyramid Configurator object which has already been configured by Warehouse.
db The SQLAlchemy ORM Session object which has already been configured to connect to the database.

Running tests and linters

Note

PostgreSQL 9.4 is required because of the pgcrypto extension.

The Warehouse tests are found in the tests/ directory and are designed to be run using make.

To run all tests, all you have to do is:

$ make tests

This will run the tests with the supported interpreter as well as all of the additional testing that we require.

If you want to run a specific test, you can use the T variable:

$ T=tests/unit/i18n/test_filters.py make tests

You can run linters, programs that check the code, with:

$ make lint

Building documentation

The Warehouse documentation is stored in the docs/ directory. It is written in reStructuredText and rendered using Sphinx.

Use make to build the documentation. For example:

$ make docs

The HTML documentation index can now be found at docs/_build/html/index.html.

Submitting patches

  • Always make a new branch for your work.
  • Patches should be small to facilitate easier review. Studies have shown that review quality falls off as patch size grows. Sometimes this will result in many small PRs to land a single large feature.
  • You must have legal permission to distribute any code you contribute to Warehouse, and it must be available under the Apache Software License Version 2.0.

If you believe you’ve identified a security issue in Warehouse, please follow the directions on the security page.

Code

When in doubt, refer to PEP 8 for Python code. You can check if your code meets our automated requirements by running make lint against it.

Write comments as complete sentences.

Acronyms and initialisms in class names should always be fully capitalized. A class should be named HTTPClient, not HttpClient.

Every code file must start with the boilerplate licensing notice:

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
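
As a rough illustration, a check for the notice could look like the sketch below. The helper names (has_license_header, files_missing_header) are hypothetical and are not part of Warehouse; the comparison assumes the notice appears exactly as shown above, and this sketch makes no claim about what make lint actually enforces.

```python
from pathlib import Path

# The notice text is copied from the boilerplate shown above.
LICENSE_LINES = [
    '# Licensed under the Apache License, Version 2.0 (the "License");',
    "# you may not use this file except in compliance with the License.",
    "# You may obtain a copy of the License at",
    "#",
    "# http://www.apache.org/licenses/LICENSE-2.0",
    "#",
    "# Unless required by applicable law or agreed to in writing, software",
    '# distributed under the License is distributed on an "AS IS" BASIS,',
    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.",
    "# See the License for the specific language governing permissions and",
    "# limitations under the License.",
]


def has_license_header(source):
    """Return True if the source text begins with the licensing notice."""
    lines = source.splitlines()
    return lines[: len(LICENSE_LINES)] == LICENSE_LINES


def files_missing_header(root):
    """List Python files under root that do not start with the notice."""
    return sorted(
        str(path)
        for path in Path(root).rglob("*.py")
        if not has_license_header(path.read_text(encoding="utf-8"))
    )
```

For example, files_missing_header("warehouse") would list the offending files under a directory named warehouse, if one exists.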

You can view Patterns to see more patterns that should be used within Warehouse.

Tests

All code changes must be accompanied by unit tests with 100% code coverage (as measured by coverage.py).
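
To make the coverage requirement concrete, here is a minimal sketch of a function with two code paths and tests that exercise both. The function normalize_role is hypothetical and exists only for this illustration; real Warehouse tests live in tests/ and are run with make tests.

```python
def normalize_role(role):
    """Hypothetical helper: map a role name to its canonical spelling."""
    cleaned = role.strip().lower()
    if cleaned not in {"owner", "maintainer"}:
        # Error path: needs its own test to count as covered.
        raise ValueError("unknown role: {!r}".format(role))
    return cleaned.capitalize()


def test_normalize_role_known():
    # Covers the normal return path.
    assert normalize_role(" owner ") == "Owner"
    assert normalize_role("MAINTAINER") == "Maintainer"


def test_normalize_role_unknown():
    # Covers the error path, so every line of the helper is executed.
    try:
        normalize_role("admin")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

Without the second test, the raise line would never run and coverage for the helper would fall below 100%.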

Documentation

Important information should be documented with prose in the docs section. To ensure it builds and passes doc8 style checks you can run make docs and make lint respectively.

Frontend

The Warehouse frontend is (as you might suspect) written in JavaScript with the CSS handled by SCSS. It uses gulp to process these files and prepare them for serving.

All of the static files are located in warehouse/static/ and external libraries are found in package.json.

Building

Static files should be built automatically while make serve is running. You can also trigger a manual build by installing all of the dependencies with npm install and then running gulp dist.

Browser Support

Browser Supported Versions
Chrome Current, Current - 1
Firefox Current, Current - 1
Edge Current, Current - 1
Opera Current, Current - 1
Safari 9.0+
IE 11+

HTML Code Style

Warehouse follows the Google HTML style guide, which is enforced via linting with HTML Linter.

Exceptions to these rules include:

  • Protocols can be included in links - we prefer to include https protocols
  • All HTML tags should be closed

We also allow both dashes and underscores in our class names, as we follow the Nicholas Gallagher variation of the BEM naming methodology.

More information on how BEM works can be found in this article from CSS Wizardry.

SCSS Style and Structure

Warehouse follows the Airbnb CSS / Sass style guide, with the exception that JS hooks should be prefixed with -js rather than js.

Our SCSS codebase is structured according to the ITCSS system. The principle of this system is to break SCSS code into layers and import them into a main stylesheet in an order moving from generic to specific. This tightly controls the cascade of styles.

The majority of the SCSS styles are found within the ‘blocks’ layer, with each BEM block in its own file. All blocks are documented at the top of the file to provide guidelines for use and modification.

Patterns

Returning vs Raising HTTP Exceptions

Pyramid allows the various HTTP Exceptions to be either returned or raised, and the differences between the two are subtle:

  • Returning a response commits the transaction associated with the request, while raising rolls it back.
  • Returning a response does not invoke the exec_view handler, while raising does.

The following table shows the default method for each type of HTTP exception. This is only a default, and judgement should be applied to each situation.

Class Method
HTTPSuccessful (2xx) Return
HTTPRedirection (3xx) Return
HTTPClientError (4xx) Raise, except for HTTPNotFound which should be returned.
HTTPServerError (5xx) Raise
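
The defaults in the table can be summarized as a small lookup. This is only a sketch: default_method is a hypothetical helper, not a Pyramid or Warehouse API, and it encodes nothing beyond the table above.

```python
def default_method(status_code):
    """Return "return" or "raise" for a status code, following the table."""
    if not 200 <= status_code < 600:
        raise ValueError("no default for status code {}".format(status_code))
    if status_code < 400:
        # HTTPSuccessful (2xx) and HTTPRedirection (3xx)
        return "return"
    if status_code == 404:
        # HTTPNotFound is the one 4xx exception to the rule
        return "return"
    # Remaining HTTPClientError (4xx) and all HTTPServerError (5xx)
    return "raise"


for code in (200, 303, 400, 404, 500):
    print(code, default_method(code))
```

Since raising rolls back the transaction while returning commits it, the defaults mean that most error responses discard any pending database changes.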

Database Migrations

Modifying the database schema requires database migrations (except for removing and adding tables). To autogenerate migrations:

$ docker-compose run web python -m warehouse db revision

Then migrate and test your migration:

$ docker-compose run web python -m warehouse db upgrade head

Migrations are run automatically as part of the deployment process, but before the old version of Warehouse is shut down. This means that each migration must be compatible with the current master branch of Warehouse.

This makes it more difficult to make breaking changes, since you must phase them in over time. For example, to rename a column you must first add the new column in one migration and start writing to it while reading from both columns. Next, make a migration that backfills all of the data. Then switch the code to stop using the old column altogether. Only then can you remove the old column.
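
The phased rename can be sketched end to end with an in-memory SQLite database. This is an illustration only: Warehouse uses Postgres and its own migration tooling, and the table and column names (projects, title, name) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO projects (title) VALUES ('warehouse')")


def read_name(row_id):
    """Transitional read: prefer the new column, fall back to the old one."""
    query = "SELECT COALESCE(name, title) FROM projects WHERE id = ?"
    return conn.execute(query, (row_id,)).fetchone()[0]


# Migration 1: add the new column. The old code, still running during the
# deployment, only touches "title" and is unaffected.
conn.execute("ALTER TABLE projects ADD COLUMN name TEXT")

# New code writes both columns and reads from both (see read_name).
conn.execute("INSERT INTO projects (title, name) VALUES ('pip', 'pip')")
name_before_backfill = read_name(1)

# Migration 2: backfill rows that were written before "name" existed.
conn.execute("UPDATE projects SET name = title WHERE name IS NULL")
rows_missing_name = conn.execute(
    "SELECT COUNT(*) FROM projects WHERE name IS NULL"
).fetchone()[0]

# Next, the code stops using "title" altogether. Only after that release
# would a final migration drop the old column (not run here).
print(name_before_backfill, rows_missing_name)
```

Each step leaves the schema usable by the code version that is still running, which is the compatibility constraint described above.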

Reviewing and merging patches

Everyone is encouraged to review open pull requests. We only ask that you try and think carefully, ask questions and are excellent to one another. Code review is our opportunity to share knowledge, design ideas and make friends.

When reviewing a patch try to keep each of these concepts in mind:

Architecture

  • Is the proposed change being made in the correct place?

Intent

  • What is the change being proposed?
  • Do we want this feature or is the bug they’re fixing really a bug?

Implementation

  • Does the change do what the author claims?
  • Are there sufficient tests?
  • Should it be documented, and has it been?
  • Will this change introduce new bugs?

Grammar and style

These are small things that are not caught by the automated style checkers.

  • Does a variable need a better name?
  • Should this be a keyword argument?

Merge requirements

  • Patches must never be pushed directly to master. All changes (even the most trivial typo fixes!) must be submitted as a pull request.
  • A patch that breaks tests, or introduces regressions by changing or removing existing tests should not be merged. Tests must always be passing on master.
  • If somehow the tests get into a failing state on master (such as by a backwards incompatible release of a dependency) no pull requests may be merged until this is rectified.
  • All merged patches must have 100% test coverage.
  • All user facing strings must be marked for translation and the .pot and .po files must be updated.

Application Structure

Note: this is a brain dump and its contents may be moved to a more appropriate location eventually.

At the moment it just lists the legacy structure and none of the intended new structure.

The following documents the current URLs in the legacy PyPI application.

URL Purpose
/ Redirect to /pypi
/pypi Legacy PyPI application. See below.
/daytime Legacy mirroring support
/security Page giving contact and other information regarding site security
/id OpenID endpoint
/oauth OAuth endpoint
/simple Simple API as given in Legacy API
/packages Serve up a package file
/mirrors Page listing legacy mirrors (not to be retained)
/serversig Legacy mirroring support (no-one uses it: not to be retained)
/raw-packages nginx implementation specific hackery (entirely internal; not to be retained)
/stats Web stats. Whatever. Probably dead.
/local-stats Package download stats. All the legacy mirrors have this.
/static Static files (CSS, images) in support of the web interface.

The legacy application has a bunch of different behaviours:

  1. With no additional path, parameter or content-type information the app renders a “front page” for the site. TODO: keep this behaviour or redirect?
  2. With a content-type of “text/xml” the app runs in an XML-RPC server mode.
  3. With certain path information the app will render project information.
  4. With an :action parameter the app will take certain actions and/or display certain information.

The :action parameters are typically submitted through GET URL parameters, though some actions are also POST actions.
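
As a sketch of what such a GET request looks like, the snippet below builds a legacy URL carrying an :action parameter. The host is taken from the examples elsewhere in this documentation; the helper legacy_action_url and the name and version query parameters are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Host taken from the XML-RPC example in this documentation.
LEGACY_BASE = "https://pypi.python.org/pypi"


def legacy_action_url(action, **params):
    """Build a legacy URL whose query string carries the :action parameter."""
    query = {":action": action}
    query.update(params)
    # urlencode percent-encodes the leading colon as %3A.
    return LEGACY_BASE + "?" + urlencode(query)


# "name" and "version" are assumed parameter names, used only as an example.
print(legacy_action_url("doap", name="roundup", version="1.4.10"))
```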

could be nuked without fuss
  • display was used to display a package version but was replaced ages ago by the /<package>/<version> URL structure
  • all the user-based stuff like register_form, user, user_form, forgotten_password_form, login, logout, forgotten_password, password_reset, pw_reset and pw_reset_change will most likely be replaced by newer mechanisms in warehouse
  • openid_endpoint, openid_decide_post could also be replaced by something else.
  • home is the old home page thing and completely unnecessary
  • index is overwhelming given the number of projects now.
  • browse and search are probably only referenced by internal links so should be safe to nuke
  • submit_pkg_info and display_pkginfo probably aren’t used
  • submit_form and pkg_edit will be changing anyway
  • files, urls, role, role_form are old style and will be changing
  • list_classifiers .. this might actually only be used by Richard :)
  • claim, openid, openid_return, dropid are legacy openid login support and will be changing
  • clear_auth “clears” Basic Auth
  • addkey, delkey will be changing if we even keep supporting ssh submit
  • verify probably isn’t actually used by anyone
  • lasthour is a pubsubhubbub thing - does this even exist any longer?
  • json is never used as a :action invocation, only ever /<package>/json
  • gae_file I’m pretty sure this is not necessary
  • rss_regen manually regens the RSS cached files, not needed
  • about No longer needed.
  • delete_user No longer needed.
  • exception No longer needed.
will need to retain
  • rss and packages_rss will be in a bunch of people’s RSS readers
  • doap is most likely referred to
  • show_md5 ?
can be deprecated carefully
  • submit, upload, doc_upload, file_upload,

API Reference

PyPI’s XML-RPC methods

Example usage:

>>> import xmlrpc.client as xmlrpclib  # Python 3 name; on Python 2 use "import xmlrpclib"
>>> import pprint
>>> client = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
>>> client.package_releases('roundup')
['1.4.10']
>>> pprint.pprint(client.release_urls('roundup', '1.4.10'))
[{'comment_text': '',
  'downloads': 3163,
  'filename': 'roundup-1.1.2.tar.gz',
  'has_sig': True,
  'md5_digest': '7c395da56412e263d7600fa7f0afa2e5',
  'packagetype': 'sdist',
  'python_version': 'source',
  'size': 876455,
  'upload_time': <DateTime '20060427T06:22:35' at 912fecc>,
  'url': 'https://pypi.python.org/packages/source/r/roundup/roundup-1.1.2.tar.gz'},
 {'comment_text': '',
  'downloads': 2067,
  'filename': 'roundup-1.1.2.win32.exe',
  'has_sig': True,
  'md5_digest': '983d565b0b87f83f1b6460e54554a845',
  'packagetype': 'bdist_wininst',
  'python_version': 'any',
  'size': 614270,
  'upload_time': <DateTime '20060427T06:26:04' at 912fdec>,
  'url': 'https://pypi.python.org/packages/any/r/roundup/roundup-1.1.2.win32.exe'}]

Changes to Legacy API

package_releases The show_hidden flag is now ignored. All versions are returned.

release_data The stable_version flag is always an empty string. It was never fully supported anyway.

Package Querying

list_packages()
Retrieve a list of the package names registered with the package index. Returns a list of name strings.
package_releases(package_name, show_hidden=False)

Retrieve a list of the releases registered for the given package_name, ordered by version.

The show_hidden flag is now ignored. All versions are returned.

package_roles(package_name)
Retrieve a list of [role, user] for a given package_name. Role is either Maintainer or Owner.
user_packages(user)
Retrieve a list of [role, package_name] for a given user. Role is either Maintainer or Owner.
release_downloads(package_name, release_version)
Retrieve a list of [filename, download_count] for a given package_name and release_version.
release_urls(package_name, release_version)

Retrieve a list of download URLs for the given release_version. Returns a list of dicts with the following keys:

  • url
  • packagetype (‘sdist’, ‘bdist_wheel’, etc)
  • filename
  • size
  • md5_digest
  • downloads
  • has_sig
  • python_version (required version, or ‘source’, or ‘any’)
  • comment_text
release_data(package_name, release_version)

Retrieve metadata describing a specific release_version. Returns a dict with keys for:

  • name
  • version
  • stable_version (always an empty string)
  • author
  • author_email
  • maintainer
  • maintainer_email
  • home_page
  • license
  • summary
  • description
  • keywords
  • platform
  • download_url
  • classifiers (list of classifier strings)
  • requires
  • requires_dist
  • provides
  • provides_dist
  • requires_external
  • requires_python
  • obsoletes
  • obsoletes_dist
  • project_url
  • docs_url (URL of the packages.python.org docs if they’ve been supplied)

If the release does not exist, an empty dictionary is returned.

search(spec[, operator])

Search the package database using the indicated search spec.

The spec may include any of the keywords described in the above list (except 'stable_version' and 'classifiers'), for example: {'description': 'spam'} will search description fields. Within the spec, a field’s value can be a string or a list of strings (the values within the list are combined with an OR), for example: {'name': ['foo', 'bar']}. Valid keys for the spec dict are listed here. Invalid keys are ignored:

  • name
  • version
  • author
  • author_email
  • maintainer
  • maintainer_email
  • home_page
  • license
  • summary
  • description
  • keywords
  • platform
  • download_url

Arguments for different fields are combined using either “and” (the default) or “or”. Example: search({'name': 'foo', 'description': 'bar'}, 'or'). The results are returned as a list of dicts {'name': package name, 'version': package release version, 'summary': package release summary}.

browse(classifiers)
Retrieve a list of [name, version] of all releases classified with all of the given classifiers. classifiers must be a list of Trove classifier strings.
top_packages([number])
Retrieve the sorted list of packages ranked by number of downloads. Optionally limit the list to the number given.
updated_releases(since)
Retrieve a list of package releases made since the given timestamp. The releases will be listed in descending release date.
changed_packages(since)
Retrieve a list of package names where those packages have been changed since the given timestamp. The packages will be listed in descending date of most recent change.
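
To show how a search(spec[, operator]) call is put together, the following sketch marshals one into an XML-RPC request body without contacting any server. It uses only the standard library; the spec values are the placeholder strings from the description above.

```python
import xmlrpc.client

# A list value means "name matches foo OR bar"; the "or" operator then
# combines the name and description fields with each other.
spec = {"name": ["foo", "bar"], "description": "spam"}

# dumps() builds the request body that ServerProxy would send for
# client.search(spec, "or"), which is handy for inspecting a call offline.
payload = xmlrpc.client.dumps((spec, "or"), methodname="search")
print(payload)

# Against a live index the same call would be made like this:
#   client = xmlrpc.client.ServerProxy("https://pypi.python.org/pypi")
#   results = client.search(spec, "or")
```

Each entry in the result list is a dict with name, version and summary keys, as described for search above.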

Mirroring Support

changelog(since, with_ids=False)
Retrieve a list of [name, version, timestamp, action], or [name, version, timestamp, action, id] if with_ids=True, for events since the given timestamp. All timestamps are UTC values. The since argument is an integer number of seconds since the epoch, in UTC.
changelog_last_serial()
Retrieve the last event’s serial id.
changelog_since_serial(since_serial)
Retrieve a list of (name, version, timestamp, action, serial) since the event identified by the given since_serial. All timestamps are UTC values. The argument is a UTC integer seconds since the epoch.
list_packages_with_serial()
Retrieve a dictionary mapping package names to the last serial for each package.
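
A mirror might use the serial-based methods roughly as sketched below. The function sync_changes and the FakeClient class are hypothetical, and the events in FakeClient are made-up sample data so the example runs offline; a real client would be an XML-RPC proxy for the index.

```python
def sync_changes(client, last_seen_serial):
    """Return (changed project names, newest serial) since last_seen_serial.

    client is any object exposing changelog_since_serial(), for example an
    xmlrpc.client.ServerProxy pointed at the index.
    """
    events = client.changelog_since_serial(last_seen_serial)
    # Each event is (name, version, timestamp, action, serial).
    changed = sorted({event[0] for event in events})
    newest = max((event[4] for event in events), default=last_seen_serial)
    return changed, newest


class FakeClient:
    """Offline stand-in with made-up events, used only for this example."""

    def changelog_since_serial(self, since_serial):
        events = [
            ("warehouse", "13.9.0", 1380000000, "new release", 867464),
            ("warehouse", "13.9.1", 1380000600, "new release", 867465),
            ("roundup", "1.4.10", 1380001200, "new release", 867470),
        ]
        return [event for event in events if event[4] > since_serial]


changed, newest = sync_changes(FakeClient(), 867464)
print(changed, newest)
```

Storing the newest serial after each run lets the next run ask only for later events.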

Legacy API

Simple Project API

GET /simple/

All of the projects that have been registered. All responses MUST have a <meta name="api-version" value="2" /> tag where the only valid value is 2.

Example request:

GET /simple/ HTTP/1.1
Host: pypi.python.org
Accept: text/html

Example response:

HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
X-PyPI-Last-Serial: 871501

<!DOCTYPE html>
<html>
  <head>
    <title>Simple Index</title>
    <meta name="api-version" value="2" />
  </head>
  <body>
    <!-- More projects... -->
    <a href="/simple/warehouse/">warehouse</a>
    <!-- ...More projects -->
  </body>
</html>

Response Headers:

  • X-PyPI-Last-Serial – The most recent serial id number for any project.

Status Codes:

GET /simple/<project>/

Get all of the URLs for the project. The project name is matched case-insensitively, with the _ and - characters considered equal. All responses MUST have a <meta name="api-version" value="2" /> tag where the only valid value is 2. The URLs returned by this API are classified by their rel attribute.

rel name value
internal Packages hosted by this repository, MUST be a direct package link.
homepage The homepage of the project, MAY be a direct package link and MAY be fetched and processed for more direct package links.
download The download url for the project, MAY be a direct package link and MAY be fetched and processed for more direct package links.
ext-homepage The homepage of the project, MUST NOT be fetched to look for more packages, MAY be a direct link.
ext-download The download url for the project, MUST NOT be fetched to look for more packages but MAY be a direct package link.
external An externally hosted url, MUST NOT be fetched to look for more packages but MAY be a direct package link.
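
The project-name matching rule for GET /simple/<project>/ can be sketched as follows. The function names are hypothetical, and the sketch follows only the rule as worded in this section; the server's actual normalization may differ.

```python
def normalize_project_name(name):
    """Lower-case the name and treat "_" and "-" as the same character."""
    return name.lower().replace("_", "-")


def same_project(requested, registered):
    """Return True if two spellings refer to the same project."""
    return normalize_project_name(requested) == normalize_project_name(registered)


print(same_project("Flask_SQLAlchemy", "flask-sqlalchemy"))
```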

The links may optionally include a hash using the URL fragment. This fragment is in the form of #<hashname>=<hexdigest>. If present, the downloaded file MUST be verified against that hash value. Valid hash names are md5, sha1, sha224, sha256, sha384, and sha512.
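
A client could verify a download against the fragment roughly as in the sketch below. The helper names are hypothetical, and the sketch assumes that a link without a recognized hash fragment simply has nothing to verify.

```python
import hashlib
from urllib.parse import urldefrag

VALID_HASHES = {"md5", "sha1", "sha224", "sha256", "sha384", "sha512"}


def split_hash_fragment(link):
    """Split a link into (url, hash name, hex digest); the last two may be None."""
    url, fragment = urldefrag(link)
    name, sep, digest = fragment.partition("=")
    if not sep or name not in VALID_HASHES:
        return url, None, None
    return url, name, digest.lower()


def verify_download(link, content):
    """Check downloaded bytes against the hash in the link, if one is present."""
    _, name, digest = split_hash_fragment(link)
    if name is None:
        return True  # no hash supplied, so there is nothing to verify
    return hashlib.new(name, content).hexdigest() == digest
```

In use, verify_download(link, data) returns False when the bytes do not match the advertised digest, in which case the file should be rejected.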

Example request:

GET /simple/warehouse/ HTTP/1.1
Host: pypi.python.org
Accept: text/html

Example response:

HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
X-PyPI-Last-Serial: 867465

<!DOCTYPE html>
<html>
  <head>
    <title>Links for warehouse</title>
    <meta name="api-version" value="2" />
  </head>
  <body>
    <h1>Links for warehouse</h1>
    <a rel="internal" href="../../packages/source/w/warehouse/warehouse-13.9.1.tar.gz#md5=f7f467ab87637b4ba25e462696dfc3b4">warehouse-13.9.1.tar.gz</a>
    <a rel="internal" href="../../packages/3.3/w/warehouse/warehouse-13.9.1-py2.py3-none-any.whl#md5=d105995d0b3dc91f938c308a23426689">warehouse-13.9.1-py2.py3-none-any.whl</a>
    <a rel="internal" href="../../packages/source/w/warehouse/warehouse-13.9.0.tar.gz#md5=b39322c1e6af3dda210d75cf65a14f4c">warehouse-13.9.0.tar.gz</a>
    <a rel="internal" href="../../packages/3.3/w/warehouse/warehouse-13.9.0-py2.py3-none-any.whl#md5=8767c0ed961ee7bc9e5e157998cd2b40">warehouse-13.9.0-py2.py3-none-any.whl</a>
  </body>
</html>

Response Headers:

  • X-PyPI-Last-Serial – The most recent serial id number for the project.

Status Codes:

UI Principles

The Warehouse UI aims to be clean, clear and easy to use. Changes and additions to the UI should follow these four principles:

1. Be Consistent

Creating consistent interfaces is more aesthetically pleasing, improves usability and helps new users master the UI faster.

Before creating a new design, layout or CSS style, always consider reusing an existing pattern. This may include modifying an existing design or layout to make it more generic.

Following this principle can also help to reduce the footprint of our frontend code, which will make Warehouse easier to maintain in the long term.

2. Consider Usability and Accessibility

Ensuring Warehouse follows usability and accessibility best practices will make the site easier to use for everybody. At a minimum:

  • Ensure contrast is high, particularly on text. This can be checked:
  • Write semantic HTML
  • Ensure image alt tags are present and meaningful
  • Add labels to all form fields (if you want to hide a label visually but leave it readable to screen readers, apply .sr-only)
  • Where possible add ARIA roles to the HTML
  • Indicate the state of individual UI components with CSS styles. For example, darken a button on hover.
  • Ensure that keyboard users can easily navigate each page. It is particularly important that the outline is not removed from links.
  • Consider color blind users: if using color to convey meaning (e.g. red for an error) always use an additional indicator (e.g. an appropriate icon) to convey the same meaning.
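
One way to check text contrast programmatically is the contrast ratio defined by the Web Content Accessibility Guidelines (WCAG) 2. Using WCAG here is an assumption, since this page does not name a specific checker, and the function names are hypothetical. WCAG level AA asks for a ratio of at least 4.5:1 for normal-size text.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color written as "#rrggbb"."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1 (identical) to 21."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio("#000000", "#ffffff"), 2))
```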

3. Provide Help

Never assume that all Warehouse users are as familiar with the Python ecosystem as you are. Something that may seem obvious or second-nature to you may be a difficult or novel concept for someone else.

Seek out places in the interface where help text should be included - either as standard text on the page, or by adding a help icon (that links to help content).

4. Write Clearly, with Consistent Style and Terminology

Warehouse follows the Material design writing style guide.

When writing interfaces use direct, clear and simple language. This is especially important as Warehouse caters to an international audience with varying proficiency in English. If you’re unsure, please check the readability of your text.

Be consistent, particularly when it comes to domain specific words. Use this glossary as a guide:

Term Definition
Project A collection of releases and files, and information about them. Projects on Warehouse are made and shared by members of the Python community so others can use them.
Release A specific version of a project. For example, the requests project has many releases, like requests 2.10 and requests 1.2.1. A release consists of one or more files.
File Something that you can download and install. Because of different hardware, operating systems, and file formats, a release may have several files, like an archive containing source code or a binary wheel.
Package A synonym for a file.
User A person who has registered an account on Warehouse.
Maintainer A user who has permissions to manage a project on Warehouse.
Owner A user who has permissions to manage a project on Warehouse, and has additional permission to add and remove other maintainers and owners to a project.
Author A free-form piece of information associated with a project. This information could be a name of a person, an organization, or something else altogether. This information is not linked to a user on Warehouse.

Security

To read the most up to date version of our security policy, please visit the application security page, available via the site footer.

Warehouse is a new code base that implements a Python package repository. It is being actively developed and the plan is that it will eventually power PyPI and replace an older code base that is currently powering PyPI. You can see Warehouse in production at https://pypi.org/

The goal is to improve PyPI by:

  • making it more user-friendly
  • giving it a more modern look
  • adding more features
  • removing legacy APIs
  • making the code more maintainable, with test coverage, docs, etc.
