10 Mar 2015, 15:52 Bluemix Continuous Delivery XP

Blue-Green Deployment

One of our core principles is to deliver stories with the highest business value from day one. To deliver something means to put it in front of our customer. Regardless of whether our customer decides to make it public or not, it is available in production from the very first cf push. We strongly believe that having a partially complete solution in production from day 1 is more valuable than having the full solution delivered later.

All Bluemix Garage projects are continuously and automatically deployed into production, regardless whether they are internal tools or software components that we develop for customers.

One of the goals with any production environment is maintaining high availability for your application. The CloudFoundry platform helps with this by giving you the ability to scale your application horizontally to prevent downtime. There is another availability challenge when continuously deploying applications - if you cf push an existing application in production, your instances must be restarted to update your code. Not only does this inevitably introduce downtime, but you are left without a fallback if something goes wrong.

Blue Green deployment is not a new concept by any means. Nor is it tied to any particular platform. In short the idea is:

  1. Deploy a new instance of your application (leaving your current “green” environment untouched).
  2. Run smoke tests to verify the new environment, before taking it live.
  3. Switch over production traffic, keeping the old environment as a hot backup.

As we are deploying straight to production, using blue-green is imperative.

The current status-quo

We discovered these two tools:

  • Buildpack Utilities from the IBM WAS team — An interactive Bash script that will take you through the process
  • Autopilot from Concourse — A plugin for cf (We really like that it’s built into the tooling)

Both tools in fact do roughly the same thing:

  1. Rename the existing app while keeping all routes in place
  2. Push the new instance, which will automatically map the correct routes to it
  3. Delete the old instance

There are three problems with this approach:

  • They do not support running tests against the new app before making it available on the production route.
  • They immediately delete the app, making it impossible to revert to the old environment if something goes wrong
  • There is a chance that the old app could be serving requests at the time it is deleted.

While Blue-Green is quite a simple concept, we discovered several nuances in how we wanted to approach the problem so we decided to build our own.

Our Take

Our continuous delivery environment already consisted of a couple of shell scripts that provide app lifecycle management on top of the cf cli. We decided to build an extension to these scripts to manage Blue-Green with this process:

  1. Push new app instance to Bluemix with a different name
  2. Run end-to-end integration tests against the new app instance using the auto-generated route
  3. Map production routes (including domain aliases) to the new app instance
  4. Unmap routes from previous app instance
  5. Clean up all older app instances, leaving only the previous instance running as a hot backup

If any step fails then the CI server immediately gives up. We don’t want to continue the deployment if anything goes wrong!

The crucial difference is that we do not touch the production app until we know we have a good instance to replace it with. This means using a new app name for every deploy — as easy as passing a specific application-name when calling cf push. We chose to append the date and time for every deploy, but you could also use the git hash or CI build number.

After the new instance has been tested — with simple smoke tests in our case, the tool will then cf map-route the production route to the application. Now the new instance is handling production traffic, cf unmap-route can be used to disconnect the old instance from the world, leaving it untouched until the next deploy.

Deploying with a new application name each time, means we always run in a clean environment. However, without cleaning up, we’ll be left with a lot of redundant, idle applications consuming resources. This is easy to deal with, we simply delete all previous instances except the new app and the instance we just unmapped.

Blue, Green, Cloud.

Arguably what we’re doing here isn’t strictly Blue-Green deployment. The name “Blue-Green” stems from there being two separate environments on physical hardware, each of which could run an instance of the application. It wasn’t possible to throw away the old hardware and start with a completely fresh system for each install so it was necessary to flip between them.

Developing for a Cloud environment such as IBM Bluemix, means that creating whole new application environments is a simple cf push away, and deleting them is just as easy. With such a cheap way to create whole new environments it just doesn’t make sense to keep flipping between two apps when you can always start afresh.

We’ve been running several applications in this configuration for months now and we’re really pleased with what we’ve built. Everyone has different needs, this works for us but it’s just a different way to approach the problem.

Update 07/09/2015

We have recently open sourced our Blue-Green deploy script. It is now available as a Cloud Foundry CLI plugin. The source code and documentation can be found here:

https://github.com/bluemixgaragelondon/cf-blue-green-deploy