Database migrations
===================

Modifying database schemata will need database migrations (even for adding
and removing tables). To autogenerate migrations::

    docker compose run web python -m warehouse db revision --autogenerate --message "text"

Verify your migration was generated by looking at the output from the
command above:

``Generating /opt/warehouse/src/warehouse/migrations/versions/390811c1cdbe_.py ... done``

Then migrate and test your migration::

    make runmigrations

Migrations are automatically run as part of the deployment process, but
prior to the old version of Warehouse being shut down. This means that each
migration *must* be compatible with the current ``main`` branch of
Warehouse.

This makes it more difficult to make breaking changes, since you must phase
them in over time. See :ref:`destructive-migrations` for tips on doing
migrations that involve column deletions or renames.

To help protect against an accidentally long-running migration taking down
PyPI, by default a migration will time out if it waits more than 4s to
acquire a lock, or if any individual statement takes more than 5s.

The lock timeout helps to protect against the case where a long-running app
query is blocking the migration, and the migration itself then ends up
blocking short-running app queries that would otherwise have been able to
run concurrently with the long-running app query. The statement timeout
helps to protect against locking the database for an extended period of
time (often an issue for data migrations).

It is possible to override these values inside of a migration by adding:

.. code-block:: python

    op.execute("SET statement_timeout = 5000")
    op.execute("SET lock_timeout = 4000")

to your migration.

For more information on what kinds of operations are safe in a high
availability environment like PyPI, there is related reading available at:

- `PostgreSQL at Scale: Database Schema Changes Without Downtime `_
- `Move fast and migrate things: how we automated migrations in Postgres `_
- `PgHaMigrations `_

.. _destructive-migrations:

Destructive migrations
----------------------

.. warning::

    Read this section and its respective sub-sections **completely** before
    attempting to follow them! Failure to do so can result in serious
    deployment errors and outages.

Migrations that do column renames or deletions need to be performed with
special care, due to how Warehouse is deployed. Performing a migration
without these steps will cause errors during deployment, and may require a
full revert.

.. _removing-a-column:

Removing a column
=================

To remove a column:

1. Perform the Python-level code changes, i.e. remove usages of the
   column/attribute within Warehouse itself. Do **not** generate an
   accompanying migration.
2. Submit the changes as a PR. Tag the PR with ``skip-db-check`` to allow
   it to pass CI without accompanying migrations.
3. Prepare a second PR containing just the generated migrations.
4. Merge the first PR and ensure its deployment before merging the second.

This will ensure that the "old" version of Warehouse (prior to the new
migration) has no references to the column being deleted.
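For illustration only, the migration in the second PR might look roughly
like the following sketch. The table, column, and revision identifiers are
hypothetical placeholders, not anything in Warehouse:

.. code-block:: python

    """
    Drop the hypothetical releases.legacy_flag column.
    """

    import sqlalchemy as sa

    from alembic import op

    # Placeholder revision identifiers for this example.
    revision = "0123456789ab"
    down_revision = "ba9876543210"


    def upgrade():
        # By this point the first PR has already been deployed, so no running
        # copy of Warehouse references the column and it is safe to drop.
        op.drop_column("releases", "legacy_flag")


    def downgrade():
        # Restore the column so the migration can be reverted if needed.
        op.add_column(
            "releases",
            sa.Column("legacy_flag", sa.Boolean(), nullable=True),
        )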
Renaming a column
=================

Renaming a column is more complex than deleting a column, since it involves
a data migration. To rename a column:

1. Create an initial migration that adds the new column, and add code that
   writes to the new column while reading from both it and the old column.
2. Deploy the initial migration.
3. Prepare a second migration that performs a backfill of the old column to
   the new column.
4. Deploy the second migration.
5. Follow the :ref:`removing-a-column` steps *in their entirety* to remove
   the old column.

In total, this requires three separate migrations: one to add the new
column, one to backfill it, and a third to remove the old column.
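As a minimal sketch of the backfill migration in step 3, assuming a
hypothetical rename of ``releases.old_name`` to ``releases.new_name`` (the
table, columns, and revision identifiers are made up), note that the default
5s statement timeout is raised explicitly, since data migrations over a
large table can easily exceed it:

.. code-block:: python

    """
    Backfill releases.new_name from releases.old_name.
    """

    from alembic import op

    # Placeholder revision identifiers for this example.
    revision = "fedcba987654"
    down_revision = "abcdef012345"


    def upgrade():
        # Raise the statement timeout (in milliseconds) for this migration,
        # since the backfill may take longer than the 5s default.
        op.execute("SET statement_timeout = 60000")

        # Copy values that have not already been written to the new column.
        # On a very large table you would likely batch this update instead.
        op.execute(
            """
            UPDATE releases
            SET new_name = old_name
            WHERE new_name IS NULL
            """
        )


    def downgrade():
        # The backfill is not destructive, so there is nothing to undo.
        pass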