The What
Here at svn update && bin/supervisorctl restart
Heroku has been great to us. Especially their integration with GitHub and Travis CI is really handy if you want to do Continuous Delivery. Basically, you tell Heroku to deploy your app every time a new commit is pushed to master on GitHub, but only if all tests on Travis CI pass. This workflow allows us to do several low-friction deploys to production every day, which brings fixes and features to our customers at lightning speed. The other upside is that, from the developer’s perspective, it’s motivating to see your code being used hours instead of weeks after you finish writing it.
The main pain-point, however, has been a proper staging setup. For any non-trivial change, the one reviewing a Pull Request would have to pull the code locally and run it against a local database version to see if the Pull Request is OK to merge. This becomes a mundane task quite fast. Or is an impossible one for people outside of the development team (think managers, designers, support staff, etc.) so non-developers had to rely on screenshots of new features/fixes that developers often forget to attach to Pull Requests.
Review Apps to the rescue
Luckily, Heroku has a feature called Review Apps. In a nutshell, if you use Heroku’s GitHub integration, you can tell Heroku that for every new Pull Request, you want an auto-created and auto-deployed, completely separate new app, containing code from the Pull Request. This allows every stakeholder to see how the changes proposed in the Pull Request will actually look when deployed.
But what about data?
Heroku advises against copying production data into a review app and this makes a lot of sense, since the dataset is probably large and might take a while to copy, plus it does not fit on the limited (free) resources of a Review App. Instead, they recommend populating the Review App’s DB with dummy data.
But that’s also not ideal. Dummy data often lags behind real data, and it’s often pointless to test DB migrations against dummy data.
We decide to take the middle ground approach: copy over a subset of production data to the Review App’s DB using
Testing database migrations
Because we copy over the subset of production data to the Review App’s DB, we can also test alembic migration in each PR’s Review App. This prevents “whoops” moments when all tests pass, but the code you push into production includes a database migration that you forgot to test against real data.
The How
Procfile
The .heroku/release.sh
[ ... snip ...]
release: .heroku/release.sh
app.json
The app.json file tells Heroku how to prepare the Review App. Besides project specifics, our app.json files contain the following keys:
{
"scripts": {
# only ran once, after Review App is created
"postdeploy": ".heroku/reviewappdb.sh"
},
"env": {
"REVIEWAPP": "true",
# production DB string
"HEROKU_POSTGRESQL_RED_URL": {
"required": true
}
"addons": [
"heroku-postgresql",
],
[ ... snip ... ]
}
.heroku/release.sh
This is the script that Procfile
#!/usr/bin/env bash
set -e
# Review apps first need to copy over schema/data from production and only then run alembic. See reviewappdb.sh
if [ -z "$REVIEWAPP" ]; then
echo "Running alembic migrations"
python -m pyramid_heroku.migrate myapp etc/production.ini app:myapp
echo "DONE!"
else
# Release stage in a review app.
# Let's run migrations and check if anything's broken. But skip this step
# if this is the first time the review app is released, because in this
# case the migrations will happen in the postdeploy script.
# check if first time, do this by checking if migrations table is set up
COUNT=$(psql $DATABASE_URL -t -c "SELECT 1 FROM information_schema.tables WHERE table_name = 'alembic_version'" | xargs)
if [ "$COUNT" = "1" ]; then
# Not the first time a release is run on this app, so let's run the migrations.
echo "Release on existing app, doing migrations"
python -m pyramid_heroku.migrate myapp etc/production.ini app:myapp || echo "Database migrations failed in release.sh"
else
printf "This is the first run of release.sh in a review app which is ran "
printf "before postdeploy script, so it should not do anything. Postdeploy "
printf "script will take care of migrations and data setup."
fi
fi
.heroku/reviewapp.sh
This is the script app.json
postdeploy
!/usr/bin/env bash
# Prepare the Review App's db on Heroku.
# The majority of work goes into preparing a subset of production data. We take
# the latest backup of the production DB and userdbms-subsetter
to insert
# only a fraction of data into the review app's database.
# The HEROKU_POSTGRESQL_PINK is a read-only follower of production app's primary
echo "Creating a snapshot of Production DB's schema"
pg_dump $HEROKU_POSTGRESQL_PINK_URL --schema-only --no-owner --no-acl \
| sed 's/COMMENT ON EXTENSION/-- COMMENT ON EXTENSION/g' > schema.dump
echo "Importing production schema into Review App DB"
psql $DATABASE_URL -f schema.dump
echo "Starting data migration"
rdbms-subsetter $HEROKU_POSTGRESQL_COLOR_URL $DATABASE_URL \
--logarithmic 0.2 --yes \
--force users:1 # always bring over user with ID==1
echo "Running alembic"
alembic -c etc/production.ini -n app:myapp upgrade head || echo "Database migrations failed in reviewappdb.sh"
echo "Done"