Welcome to Warehouse’s documentation!¶
Development¶
Warehouse, as an open source project, welcomes contributions of all forms. The sections below will help you get started with development, testing, and documentation.
Please contribute issues, submit bug reports, and file feature requests on our issue tracker on GitHub. If submitting a bug report for the first time, please check out what to put in your bug report for guidance.
Important
We take security very seriously. As such, security issues should be emailed to the maintainers instead of being submitted on the GitHub issue tracker. Please read the Security documentation for details.
Getting started¶
We’re pleased that you are interested in working on Warehouse.
Setting up a development environment to work on Warehouse should be a straightforward process. If you have any difficulty, please contact us so we can improve the process.
Quickstart for Developers with Docker experience¶
$ git clone git@github.com:pypa/warehouse.git
$ cd warehouse
$ make serve
$ make initdb
View Warehouse in the browser at http://localhost:80/.
Detailed Installation Instructions¶
Getting the warehouse source code¶
Clone the warehouse repository from GitHub:
$ git clone git@github.com:pypa/warehouse.git
Configure the development environment¶
Why Docker?¶
Docker simplifies development environment set up.
Warehouse uses Docker and Docker Compose
to automate setting up a “batteries included” development environment.
The Dockerfile and docker-compose.yml
files include all the required steps
for installing and configuring all the required external services of the
development environment.
Installing Docker¶
- Install Docker Engine
Verifying Docker Installation¶
Check that Docker is installed: docker -v
Install Docker Compose¶
Install Docker Compose using the Docker provided installation instructions.
Note
Docker Compose will be installed by Docker for Mac and Docker for Windows automatically.
Verifying Docker Compose Installation¶
Check that Docker Compose is installed: docker-compose -v
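The two checks above can also be scripted. The following is a minimal sketch, not part of Warehouse: the helper name missing_tools is hypothetical, and it only confirms that the executables are on your PATH, not that the Docker daemon is running.

```python
import shutil

# Executables the setup instructions above ask you to install.
REQUIRED_TOOLS = ("docker", "docker-compose")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# Print a hint for each tool that still needs installing.
for tool in missing_tools():
    print(f"{tool} not found on PATH; install it before running make build")
```

The `which` parameter exists only so the lookup can be replaced in tests.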
Building the Warehouse Container¶
Once you have Docker and Docker Compose installed, run:
$ make build
This will pull down all of the required Docker containers, build Warehouse, and run all of the needed services. The Warehouse repository will be mounted inside of the Docker container at /opt/warehouse/src/.
Running the Warehouse Container and Services¶
After building the Docker container, you’ll need to create a Postgres database and run all of the data migrations.
First start the Docker services that make up the Warehouse application. In one terminal run the command:
$ make serve
Next, you will:
- create a new Postgres database,
- install example data to the Postgres database,
- run migrations, and
- load some example data from Test PyPI
In a second terminal, separate from the make serve command above, run:
$ make initdb
If you get an error about xz, you may need to install the xz utility. This is highly likely on Mac OS X and Windows.
Note
reCaptcha is featured in authentication and registration pages. To enable it, pass RECAPTCHA_SITE_KEY and RECAPTCHA_SECRET_KEY through to the serve and debug targets.
Viewing Warehouse in a browser¶
The web container listens on port 80 and is accessible at http://localhost:80/.
Note
If you are using docker-machine on an older version of Mac OS or Windows, the Warehouse application might be accessible at https://<docker-ip>:80/ instead. You can get information about the Docker container with docker-machine env.
What did we just do and what is happening behind the scenes?¶
The repository is exposed inside of the web container at /opt/warehouse/src/ and Warehouse will automatically reload when it detects any changes made to the code.
The example data located in dev/example.sql.xz is taken from Test PyPI and has been sanitized to remove anything private. The password for every account has been set to the string password.
Troubleshooting¶
Errors when executing make serve¶
- If the Dockerfile is edited or new dependencies are added (either by you or a prior pull request), a new container will need to be built. A new container can be built by running make build. This should be done before running make serve again.
- If make serve hangs after a new build, you should stop any running containers and repeat make serve.
- To run Warehouse behind a proxy, set the appropriate proxy settings in the Dockerfile.
“no space left on device” when using docker-compose¶
docker-compose may leave orphaned volumes during teardown. If you run into the message “no space left on device”, try running the following command (assuming Docker >= 1.9):
docker volume rm $(docker volume ls -qf dangling=true)
Note
This will delete orphaned volumes as well as directories that are not volumes in /var/lib/docker/volumes
(Solution found and further details available at https://github.com/chadoe/docker-cleanup-volumes)
Building Styles¶
Styles are written in the SCSS variant of Sass and compiled using Gulp. They are rebuilt automatically whenever they change while make serve is running.
Running the Interactive Shell¶
Warehouse provides an interactive shell that automatically configures Warehouse, creates a database session, and makes both available as variables in the shell.
To run the interactive shell, simply run:
$ make shell
The interactive shell will have the following variables defined in it:
config | The Pyramid Configurator object which has already been configured by Warehouse. |
db | The SQLAlchemy ORM Session object which has already been configured to connect to the database. |
Running tests and linters¶
Note
PostgreSQL 9.4 is required because of the pgcrypto extension.
The Warehouse tests are found in the tests/ directory and are designed to be run using make.
To run all tests, all you have to do is:
$ make tests
This will run the tests with the supported interpreter as well as all of the additional testing that we require.
If you want to run a specific test, you can use the T variable:
$ T=tests/unit/i18n/test_filters.py make tests
You can run linters, programs that check the code, with:
$ make lint
Building documentation¶
The Warehouse documentation is stored in the docs/ directory. It is written in reStructuredText and rendered using Sphinx.
Use make to build the documentation. For example:
$ make docs
The HTML documentation index can now be found at docs/_build/html/index.html.
Submitting patches¶
- Always make a new branch for your work.
- Patches should be small to facilitate easier review. Studies have shown that review quality falls off as patch size grows. Sometimes this will result in many small PRs to land a single large feature.
- You must have legal permission to distribute any code you contribute to Warehouse, and it must be available under the Apache Software License Version 2.0.
If you believe you’ve identified a security issue in Warehouse, please follow the directions on the security page.
Code¶
When in doubt, refer to PEP 8 for Python code. You can check if your code meets our automated requirements by running make lint against it.
Write comments as complete sentences.
Class names which contain acronyms or initialisms should always keep them fully capitalized. A class should be named HTTPClient, not HttpClient.
Every code file must start with the boilerplate licensing notice:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
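As a rough illustration of how this requirement could be checked for a single file, here is a minimal sketch. The helper has_license_notice is hypothetical and is not claimed to be what make lint does. It assumes the notice is exactly the 11 lines shown above and compares only the first and last of them.

```python
LICENSE_FIRST_LINE = (
    '# Licensed under the Apache License, Version 2.0 (the "License");'
)
LICENSE_LAST_LINE = "# limitations under the License."
NOTICE_LENGTH = 11  # number of lines in the notice shown above


def has_license_notice(source_text):
    """Return True if the text starts with the boilerplate licensing notice.

    Only the first and last lines of the notice are compared, so this is a
    quick sanity check rather than a full verification.
    """
    header = source_text.splitlines()[:NOTICE_LENGTH]
    return (
        len(header) == NOTICE_LENGTH
        and header[0] == LICENSE_FIRST_LINE
        and header[-1] == LICENSE_LAST_LINE
    )
```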
You can view Patterns to see more patterns that should be used within Warehouse.
Tests¶
All code changes must be accompanied by unit tests with 100% code coverage (as measured by coverage.py).
Frontend¶
The Warehouse frontend is (as you might suspect) written in JavaScript with the CSS handled by SCSS. It uses gulp to process these files and prepare them for serving.
All of the static files are located in warehouse/static/ and external libraries are found in package.json.
Building¶
Static files should be automatically built when make serve is running. However, you can trigger a manual build of them by installing all of the dependencies using npm install and then running gulp dist.
Browser Support¶
Browser | Supported Versions |
---|---|
Chrome | Current, Current - 1 |
Firefox | Current, Current - 1 |
Edge | Current, Current - 1 |
Opera | Current, Current - 1 |
Safari | 9.0+ |
IE | 11+ |
HTML Code Style¶
Warehouse follows the Google HTML style guide, which is enforced via linting with HTML Linter.
Exceptions to these rules include:
- Protocols can be included in links - we prefer to include https protocols
- All HTML tags should be closed
We also allow both dashes and underscores in our class names, as we follow the Nicholas Gallagher variation of the BEM naming methodology.
More information on how BEM works can be found in this article from CSS Wizardry.
SCSS Style and Structure¶
Warehouse follows the Airbnb CSS / Sass style guide, with the exception that JS hooks should be prefixed with -js rather than js.
Our SCSS codebase is structured according to the ITCSS system. The principle of this system is to break SCSS code into layers and import them into a main stylesheet in an order moving from generic to specific. This tightly controls the cascade of styles.
The majority of the SCSS styles are found within the ‘blocks’ layer, with each BEM block in its own file. All blocks are documented at the top of the file to provide guidelines for use and modification.
Patterns¶
Returning vs Raising HTTP Exceptions¶
Pyramid allows the various HTTP exceptions to be either returned or raised, and the differences between the two are subtle:
- Returning a response commits the transaction associated with the request, while raising rolls it back.
- Returning a response does not invoke the exec_view handler, while raising does.
The following table shows the default method for each type of HTTP exception. This is only the default, and judgement should be applied to each situation.
Class | Method |
---|---|
HTTPSuccessful (2xx) | Return |
HTTPRedirection (3xx) | Return |
HTTPClientError (4xx) | Raise, except for HTTPNotFound which should be returned. |
HTTPServerError (5xx) | Raise |
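The defaults in the table can be written down as a small lookup. This is only a sketch for illustration: default_method is a hypothetical helper, not part of Warehouse or Pyramid, and it encodes the defaults only, not the judgement that each situation needs.

```python
def default_method(status_code):
    """Return "return" or "raise", following the table's defaults."""
    if not 200 <= status_code <= 599:
        raise ValueError(f"not an HTTP exception status: {status_code}")
    if status_code < 400:
        # HTTPSuccessful (2xx) and HTTPRedirection (3xx) are returned.
        return "return"
    if status_code == 404:
        # HTTPNotFound is the one client error that is returned.
        return "return"
    # Remaining HTTPClientError (4xx) and all HTTPServerError (5xx).
    return "raise"
```

Because returning commits the transaction and raising rolls it back, the choice matters most for views that write to the database before responding.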
Database Migrations¶
Modifying the database schema requires a database migration (except for removing and adding tables). To autogenerate migrations:
$ docker-compose run web python -m warehouse db revision
Then migrate and test your migration:
$ docker-compose run web python -m warehouse db upgrade head
Migrations are automatically run as part of the deployment process, but before the old version of Warehouse is shut down. This means that each migration must be compatible with the current master branch of Warehouse.
This makes it more difficult to make breaking changes, since you must phase them in over time. For example, to rename a column you must add the new column in one migration and start writing to that column while reading from both, then make a migration that backfills all of the data, then switch the code to stop using the old column altogether, and finally remove the old column.
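The phased column rename can be illustrated with a self-contained sketch. It uses an in-memory SQLite database purely so that it runs anywhere. Warehouse itself uses PostgreSQL and generated migrations, and the table and column names here (users, name, display_name) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")

# Migration 1: add the new column. Code that only knows the old column
# keeps working; updated code writes to both columns and reads from both.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
conn.execute(
    "INSERT INTO users (name, display_name) VALUES (?, ?)",
    ("carol", "carol"),
)

# Migration 2: backfill rows that were written before the new column existed.
conn.execute("UPDATE users SET display_name = name WHERE display_name IS NULL")

# After the backfill, code can switch to reading only the new column.
display_names = [
    row[0]
    for row in conn.execute("SELECT display_name FROM users ORDER BY id")
]
print(display_names)

# Migration 3 (not run here): drop the old `name` column, but only once no
# deployed code still references it.
```

Each step leaves the database usable by both the previous and the next version of the code, which is the property the deployment process depends on.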
Reviewing and merging patches¶
Everyone is encouraged to review open pull requests. We only ask that you try and think carefully, ask questions and are excellent to one another. Code review is our opportunity to share knowledge, design ideas and make friends.
When reviewing a patch try to keep each of these concepts in mind:
Architecture¶
- Is the proposed change being made in the correct place?
Intent¶
- What is the change being proposed?
- Do we want this feature or is the bug they’re fixing really a bug?
Implementation¶
- Does the change do what the author claims?
- Are there sufficient tests?
- Should it be documented, and if so, has it been?
- Will this change introduce new bugs?
Grammar and style¶
These are small things that are not caught by the automated style checkers.
- Does a variable need a better name?
- Should this be a keyword argument?
Merge requirements¶
- Patches must never be pushed directly to master; all changes (even the most trivial typo fixes!) must be submitted as a pull request.
- A patch that breaks tests, or introduces regressions by changing or removing existing tests, should not be merged. Tests must always be passing on master.
- If somehow the tests get into a failing state on master (such as by a backwards incompatible release of a dependency), no pull requests may be merged until this is rectified.
- All merged patches must have 100% test coverage.
- All user facing strings must be marked for translation and the .pot and .po files must be updated.
Application Structure¶
Note: this is a brain dump and its contents may be moved to a more appropriate location eventually.
At the moment it just lists the legacy structure and none of the intended new structure.
The following documents the current URLs in the legacy PyPI application.
URL | Purpose |
---|---|
/ | Redirect to /pypi |
/pypi | Legacy PyPI application. See below. |
/daytime | Legacy mirroring support |
/security | Page giving contact and other information regarding site security |
/id | OpenID endpoint |
/oauth | OAuth endpoint |
/simple | Simple API as given in Legacy API |
/packages | Serve up a package file |
/mirrors | Page listing legacy mirrors (not to be retained) |
/serversig | Legacy mirroring support (no-one uses it: not to be retained) |
/raw-packages | nginx implementation specific hackery (entirely internal; not to be retained) |
/stats | Web stats. Whatever. Probably dead. |
/local-stats | Package download stats. All the legacy mirrors have this. |
/static | Static files (CSS, images) in support of the web interface. |
The legacy application has a bunch of different behaviours:
- With no additional path, parameter or content-type information the app renders a “front page” for the site. TODO: keep this behaviour or redirect?
- With a content-type of “text/xml” the app runs in an XML-RPC server mode.
- With certain path information the app will render project information.
- With an :action parameter the app will take certain actions and/or display certain information.
The :action parameters are typically submitted through GET URL parameters, though some actions are also POST actions.
- could be nuked without fuss
  - display was used to display a package version but was replaced ages ago by the /<package>/<version> URL structure
  - all the user-based stuff like register_form, user, user_form, forgotten_password_form, login, logout, forgotten_password, password_reset, pw_reset and pw_reset_change will most likely be replaced by newer mechanisms in warehouse
  - openid_endpoint, openid_decide_post could also be replaced by something else.
  - home is the old home page thing and completely unnecessary
  - index is overwhelming given the number of projects now.
  - browse and search are probably only referenced by internal links so should be safe to nuke
  - submit_pkg_info and display_pkginfo probably aren’t used
  - submit_form and pkg_edit will be changing anyway
  - files, urls, role, role_form are old style and will be changing
  - list_classifiers .. this might actually only be used by Richard :)
  - claim, openid, openid_return, dropid are legacy openid login support and will be changing
  - clear_auth “clears” Basic Auth
  - addkey, delkey will be changing if we even keep supporting ssh submit
  - verify probably isn’t actually used by anyone
  - lasthour is a pubsubhubbub thing - does this even exist any longer?
  - json is never used as a :action invocation, only ever /<package>/json
  - gae_file I’m pretty sure this is not necessary
  - rss_regen manually regens the RSS cached files, not needed
  - about No longer needed.
  - delete_user No longer needed.
  - exception No longer needed.
- will need to retain
  - rss and packages_rss will be in a bunch of people’s RSS readers
  - doap is most likely referred to
  - show_md5 ?
- can be deprecated carefully
  - submit, upload, doc_upload, file_upload,
API Reference¶
PyPI’s XML-RPC methods¶
Example usage:
>>> import xmlrpclib
>>> import pprint
>>> client = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
>>> client.package_releases('roundup')
['1.4.10']
>>> pprint.pprint(client.release_urls('roundup', '1.4.10'))
[{'comment_text': '',
'downloads': 3163,
'filename': 'roundup-1.1.2.tar.gz',
'has_sig': True,
'md5_digest': '7c395da56412e263d7600fa7f0afa2e5',
'packagetype': 'sdist',
'python_version': 'source',
'size': 876455,
'upload_time': <DateTime '20060427T06:22:35' at 912fecc>,
'url': 'https://pypi.python.org/packages/source/r/roundup/roundup-1.1.2.tar.gz'},
{'comment_text': '',
'downloads': 2067,
'filename': 'roundup-1.1.2.win32.exe',
'has_sig': True,
'md5_digest': '983d565b0b87f83f1b6460e54554a845',
'packagetype': 'bdist_wininst',
'python_version': 'any',
'size': 614270,
'upload_time': <DateTime '20060427T06:26:04' at 912fdec>,
'url': 'https://pypi.python.org/packages/any/r/roundup/roundup-1.1.2.win32.exe'}]
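The transcript above uses Python 2's xmlrpclib. On Python 3 the same module is available as xmlrpc.client. The sketch below combines the two calls shown above. The helper release_filenames is hypothetical, and the commented usage assumes the endpoint from the transcript is still reachable and still answers these methods.

```python
import xmlrpc.client


def release_filenames(client, package_name):
    """Map each release version of a package to its list of filenames.

    `client` is any object exposing package_releases() and release_urls(),
    for example an xmlrpc.client.ServerProxy.
    """
    files = {}
    for version in client.package_releases(package_name):
        entries = client.release_urls(package_name, version)
        files[version] = [entry["filename"] for entry in entries]
    return files


# Usage against a live server (performs network requests):
# client = xmlrpc.client.ServerProxy("https://pypi.python.org/pypi")
# print(release_filenames(client, "roundup"))
```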
Changes to Legacy API¶
package_releases
The show_hidden flag is now ignored. All versions are returned.
release_data
The stable_version flag is always an empty string. It was never fully supported anyway.
Package Querying¶
list_packages()
- Retrieve a list of the package names registered with the package index. Returns a list of name strings.
package_releases(package_name, show_hidden=False)
Retrieve a list of the releases registered for the given package_name, ordered by version.
The show_hidden flag is now ignored. All versions are returned.
package_roles(package_name)
- Retrieve a list of [role, user] for a given package_name. Role is either Maintainer or Owner.
user_packages(user)
- Retrieve a list of [role, package_name] for a given user. Role is either Maintainer or Owner.
release_downloads(package_name, release_version)
- Retrieve a list of [filename, download_count] for a given package_name and release_version.
release_urls(package_name, release_version)
Retrieve a list of download URLs for the given release_version. Returns a list of dicts with the following keys:
- url
- packagetype (‘sdist’, ‘bdist_wheel’, etc)
- filename
- size
- md5_digest
- downloads
- has_sig
- python_version (required version, or ‘source’, or ‘any’)
- comment_text
release_data(package_name, release_version)
Retrieve metadata describing a specific release_version. Returns a dict with keys for:
- name
- version
- stable_version (always an empty string)
- author
- author_email
- maintainer
- maintainer_email
- home_page
- license
- summary
- description
- keywords
- platform
- download_url
- classifiers (list of classifier strings)
- requires
- requires_dist
- provides
- provides_dist
- requires_external
- requires_python
- obsoletes
- obsoletes_dist
- project_url
- docs_url (URL of the packages.python.org docs if they’ve been supplied)
If the release does not exist, an empty dictionary is returned.
search(spec[, operator])
Search the package database using the indicated search spec.
The spec may include any of the keywords described in the above list (except ‘stable_version’ and ‘classifiers’), for example: {‘description’: ‘spam’} will search description fields. Within the spec, a field’s value can be a string or a list of strings (the values within the list are combined with an OR), for example: {‘name’: [‘foo’, ‘bar’]}. Valid keys for the spec dict are listed here. Invalid keys are ignored:
- name
- version
- author
- author_email
- maintainer
- maintainer_email
- home_page
- license
- summary
- description
- keywords
- platform
- download_url
Arguments for different fields are combined using either “and” (the default) or “or”. Example: search({‘name’: ‘foo’, ‘description’: ‘bar’}, ‘or’). The results are returned as a list of dicts {‘name’: package name, ‘version’: package release version, ‘summary’: package release summary}
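Since invalid keys are silently ignored, it can help to validate a spec on the client before sending it. The following is a minimal sketch. The helper build_search_args is hypothetical, and the commented call assumes a client object like the ServerProxy from the example usage earlier.

```python
# Keys accepted in a search spec, as listed above.
SEARCH_KEYS = frozenset({
    "name", "version", "author", "author_email", "maintainer",
    "maintainer_email", "home_page", "license", "summary",
    "description", "keywords", "platform", "download_url",
})


def build_search_args(spec, operator="and"):
    """Return (spec, operator) with keys the server would ignore removed."""
    if operator not in ("and", "or"):
        raise ValueError(f"operator must be 'and' or 'or', got {operator!r}")
    kept = {key: value for key, value in spec.items() if key in SEARCH_KEYS}
    return kept, operator


# Usage (performs a network request):
# results = client.search(*build_search_args({"name": ["foo", "bar"]}, "or"))
```

Dropping unknown keys locally makes a typo such as "descripton" visible as an empty spec instead of an unexpectedly broad search.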
browse(classifiers)
- Retrieve a list of [name, version] of all releases classified with all of the given classifiers. classifiers must be a list of Trove classifier strings.
top_packages([number])
- Retrieve the sorted list of packages ranked by number of downloads. Optionally limit the list to the number given.
updated_releases(since)
- Retrieve a list of package releases made since the given timestamp. The releases will be listed in descending release date.
changed_packages(since)
- Retrieve a list of package names where those packages have been changed since the given timestamp. The packages will be listed in descending date of most recent change.
Mirroring Support¶
changelog(since, with_ids=False)
- Retrieve a list of [name, version, timestamp, action], or [name, version, timestamp, action, id] if with_ids=True, for events since the given since timestamp. All timestamps are UTC values. The argument is an integer number of seconds since the epoch, in UTC.
changelog_last_serial()
- Retrieve the last event’s serial id.
changelog_since_serial(since_serial)
- Retrieve a list of (name, version, timestamp, action, serial) since the event identified by the given since_serial. All timestamps are UTC values. The argument is a UTC integer seconds since the epoch.
list_packages_with_serial()
- Retrieve a dictionary mapping package names to the last serial for each package.
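A mirror can use these methods to find out which projects changed since its last run. The sketch below shows one way that might look. The helper changed_since is hypothetical, and it assumes since_serial is the serial id returned by an earlier call and that each event is a five-item sequence ending in its serial.

```python
def changed_since(client, last_seen_serial):
    """Return (project_names, newest_serial) for events after a serial.

    `client` is any object exposing changelog_last_serial() and
    changelog_since_serial(), for example an xmlrpc.client.ServerProxy.
    """
    # Cheap check first: nothing to do if the index has not moved on.
    if client.changelog_last_serial() <= last_seen_serial:
        return set(), last_seen_serial

    events = client.changelog_since_serial(last_seen_serial)
    names = {event[0] for event in events}
    # Track the highest serial actually seen, so no event is skipped if
    # more arrive between the two calls.
    newest = max((event[4] for event in events), default=last_seen_serial)
    return names, newest
```

The returned serial would be stored and passed back in on the next run.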
Legacy API¶
Simple Project API¶
GET /simple/¶
All of the projects that have been registered. All responses MUST have a <meta name="api-version" value="2" /> tag where the only valid value is 2.
Example request:
GET /simple/ HTTP/1.1
Host: pypi.python.org
Accept: text/html
Example response:
HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
X-PyPI-Last-Serial: 871501

<!DOCTYPE html>
<html>
  <head>
    <title>Simple Index</title>
    <meta name="api-version" value="2" />
  </head>
  <body>
    <!-- More projects... -->
    <a href="/simple/warehouse/">warehouse</a>
    <!-- ...More projects -->
  </body>
</html>
Response Headers:
- X-PyPI-Last-Serial – The most recent serial id number for any project.
Status Codes:
- 200 OK – no error
GET /simple/<project>/¶
Get all of the URLs for the project. The project is matched case insensitively with the _ and - characters considered equal. All responses MUST have a <meta name="api-version" value="2" /> tag where the only valid value is 2. The URLs returned by this API are classified by their rel attribute.
rel name | value |
---|---|
internal | Packages hosted by this repository, MUST be a direct package link. |
homepage | The homepage of the project, MAY be a direct package link and MAY be fetched and processed for more direct package links. |
download | The download url for the project, MAY be a direct package link and MAY be fetched and processed for more direct package links. |
ext-homepage | The homepage of the project, MUST not be fetched to look for more packages, MAY be a direct link. |
ext-download | The download url for the project, MUST not be fetched to look for more packages but MAY be a direct package link. |
external | An externally hosted url, MUST not be fetched to look for more packages but MAY be a direct package link. |
The links may optionally include a hash using the url fragment. This fragment is in the form of #<hashname>=<hexdigest>. If present the downloaded file MUST be verified against that hash value. Valid hash values are md5, sha1, sha224, sha256, sha384, and sha512.
Example request:
GET /simple/warehouse/ HTTP/1.1
Host: pypi.python.org
Accept: text/html
Example response:
HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
X-PyPI-Last-Serial: 867465

<!DOCTYPE html>
<html>
  <head>
    <title>Links for warehouse</title>
    <meta name="api-version" value="2" />
  </head>
  <body>
    <h1>Links for warehouse</h1>
    <a rel="internal" href="../../packages/source/w/warehouse/warehouse-13.9.1.tar.gz#md5=f7f467ab87637b4ba25e462696dfc3b4">warehouse-13.9.1.tar.gz</a>
    <a rel="internal" href="../../packages/3.3/w/warehouse/warehouse-13.9.1-py2.py3-none-any.whl#md5=d105995d0b3dc91f938c308a23426689">warehouse-13.9.1-py2.py3-none-any.whl</a>
    <a rel="internal" href="../../packages/source/w/warehouse/warehouse-13.9.0.tar.gz#md5=b39322c1e6af3dda210d75cf65a14f4c">warehouse-13.9.0.tar.gz</a>
    <a rel="internal" href="../../packages/3.3/w/warehouse/warehouse-13.9.0-py2.py3-none-any.whl#md5=8767c0ed961ee7bc9e5e157998cd2b40">warehouse-13.9.0-py2.py3-none-any.whl</a>
  </body>
</html>
Response Headers:
- X-PyPI-Last-Serial – The most recent serial id number for the project.
Status Codes:
- 200 OK – no error
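Two client-side rules from this endpoint lend themselves to a short sketch: project names match case insensitively with _ and - treated as equal, and a download must be verified against the hash in the URL fragment when one is present. The helper names below (same_project, split_hash, verify_download) are hypothetical, and treating a malformed fragment as an error is an assumption of this sketch.

```python
import hashlib
from urllib.parse import urldefrag

# Hash names the text above lists as valid.
ALLOWED_HASHES = {"md5", "sha1", "sha224", "sha256", "sha384", "sha512"}


def same_project(a, b):
    """Compare project names case insensitively, with _ and - equal."""
    return a.lower().replace("_", "-") == b.lower().replace("_", "-")


def split_hash(url):
    """Return (url_without_fragment, hash_name, hex_digest).

    The hash parts are None when the link carries no fragment.
    """
    base, fragment = urldefrag(url)
    if not fragment:
        return base, None, None
    name, _sep, digest = fragment.partition("=")
    if name not in ALLOWED_HASHES or not digest:
        raise ValueError(f"unsupported hash fragment: {fragment!r}")
    return base, name, digest.lower()


def verify_download(url, content):
    """Check downloaded bytes against the hash in the link, if any."""
    _base, name, digest = split_hash(url)
    if name is None:
        return True  # the link carries no hash, so there is nothing to check
    return hashlib.new(name, content).hexdigest() == digest
```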
UI Principles¶
The Warehouse UI aims to be clean, clear and easy to use. Changes and additions to the UI should follow these four principles:
1. Be Consistent¶
Creating consistent interfaces is more aesthetically pleasing, improves usability and helps new users master the UI faster.
Before creating a new design, layout or CSS style, always consider reusing an existing pattern. This may include modifying an existing design or layout to make it more generic.
Following this principle can also help to reduce the footprint of our frontend code, which will make Warehouse easier to maintain in the long term.
2. Consider Usability and Accessibility¶
Ensuring Warehouse follows usability and accessibility best practices will make the site easier to use for everybody. At a minimum:
- Ensure contrast is high, particularly on text. This can be checked:
  - On Chrome by installing Accessibility Developer Tools
  - On Firefox by installing the WCAG Contrast Checker
- Write semantic HTML
- Ensure image alt tags are present and meaningful
- Add labels to all form fields (if you want to hide a label visually but leave it readable to screen readers, apply .sr-only)
- Where possible add ARIA roles to the HTML
- Indicate the state of individual UI components with CSS styles. For example, darken a button on hover.
- Ensure that keyboard users can easily navigate each page. It is particularly important that the outline is not removed from links.
- Consider color blind users: if using color to convey meaning (e.g. red for an error) always use an additional indicator (e.g. an appropriate icon) to convey the same meaning.
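For the contrast item in the list above, the browser tools named there remain the recommended check. As a rough illustration of what such tools compute, here is a sketch of the WCAG 2.x contrast ratio formula. The function names are hypothetical and the formula comes from the WCAG definition, not from Warehouse.

```python
def _linear(channel):
    """Convert one 0-255 sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) color with 0-255 channels."""
    r, g, b = (_linear(value) for value in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1 (none) to 21 (maximum)."""
    first = relative_luminance(foreground)
    second = relative_luminance(background)
    lighter, darker = max(first, second), min(first, second)
    return (lighter + 0.05) / (darker + 0.05)


# Black text on a white background gives the maximum ratio.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 2))
```

WCAG level AA asks for a ratio of at least 4.5:1 for normal-size text.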
3. Provide Help¶
Never assume that all Warehouse users are as familiar with the Python ecosystem as you are. Something that may seem obvious or second-nature to you may be a difficult or novel concept for someone else.
Seek out places in the interface where help text should be included - either as standard text on the page, or by adding a help icon (that links to help content).
4. Write Clearly, with Consistent Style and Terminology¶
Warehouse follows the Material design writing style guide.
When writing interfaces use direct, clear and simple language. This is especially important as Warehouse caters to an international audience with varying proficiency in English. If you’re unsure, please check the readability of your text.
Be consistent, particularly when it comes to domain specific words. Use this glossary as a guide:
Term | Definition |
---|---|
Project | A collection of releases and files, and information about them. Projects on Warehouse are made and shared by members of the Python community so others can use them. |
Release | A specific version of a project. For example, the requests project has many releases, like requests 2.10 and requests 1.2.1. A release consists of one or more files. |
File | Something that you can download and install. Because of different hardware, operating systems, and file formats, a release may have several files, like an archive containing source code or a binary wheel. |
Package | A synonym for a file. |
User | A person who has registered an account on Warehouse. |
Maintainer | A user who has permissions to manage a project on Warehouse. |
Owner | A user who has permissions to manage a project on Warehouse, and has additional permission to add and remove other maintainers and owners to a project. |
Author | A free-form piece of information associated with a project. This information could be a name of a person, an organization, or something else altogether. This information is not linked to a user on Warehouse. |
Security¶
To read the most up to date version of our security policy, please visit the application security page, available via the site footer.
Warehouse is a new code base that implements a Python package repository. It is being actively developed and the plan is that it will eventually power PyPI and replace an older code base that is currently powering PyPI. You can see Warehouse in production at https://pypi.org/
The goal is to improve PyPI by making it:
- more user-friendly
- more modern in appearance
- richer in features
- free of legacy APIs
- more maintainable, with test coverage, docs, etc.