Advanced Capistrano usage

Posted by Dmytro Shteflyuk on under Development (54,271 views)

Capistrano — the dead simple deployment tool

One of the most important parts of the development process is application deployment. There are many tools designed to make this process easy and painless: from the simple inploy to complex all-in-one Chef-based solutions. My tool of choice is Capistrano, a simple and incredibly flexible piece of software. Today I’m going to talk about some advanced Capistrano usage scenarios.

1. Graceful Passenger restarts

The Passenger user guide contains a simple Capistrano recipe for application server restarts. It works pretty well in almost all cases, but there is a huge problem in a multi-server setup: it restarts all Passenger instances at the same time, so all client requests will hang (or even be dropped) during the time needed to start your application. The simplest solution is to restart Passenger instances one by one with some shift in time (for example, 15 seconds; choose this value based on how long it takes to get your application up and running), so at any given moment only one of your application servers will be unavailable. In this case HAProxy (you use it, don’t you?) won’t send any requests to the restarting server, and most of your users will continue their work without any trouble.
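For reference, the recipe from the Passenger user guide is essentially a one-liner: touching `tmp/restart.txt` tells Passenger to reload the application on the next request. Shown here roughly as it appears in a typical Capistrano 2 config (a sketch, not the exact guide text):

```ruby
namespace :deploy do
  # The stock Passenger restart: every app server restarts at once,
  # which is exactly the behavior we want to avoid on a multi-server setup.
  task :restart, :roles => :app, :except => { :no_release => true } do
    run "touch #{File.join(current_path, 'tmp', 'restart.txt')}"
  end
end
```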

Let me show you how we could achieve this:

namespace :deploy do
  desc <<-EOF
    Graceful Passenger restarts. By default, it restarts \
    Passenger on your servers with a 15-second interval, but \
    this delay could be changed with the smart_restart_delay \
    variable (in seconds). If you specify 0, the restart will be \
    performed on all your servers immediately.

      cap production deploy:smart_restart

    Yet another way to restart passenger immediately everywhere is \
    to specify NOW environment variable:

      NOW=1 cap production deploy:smart_restart
  EOF
  task :smart_restart, :roles => :app do
    delay = fetch(:smart_restart_delay, 15).to_i
    delay = 0 if ENV['NOW']

    if delay <= 0
      logger.debug "Restarting passenger"
      run "touch #{shared_path}/restart.txt"
    else
      logger.debug "Graceful Passenger restart with #{delay} seconds delay"
      parallel(:roles => :app, :pty => true, :shell => false) do |session|
        find_servers(:roles => :app).each_with_index do |server, idx|
          # Calculating restart delay for this server
          sleep_time = idx * delay
          time_window = sleep_time > 0 ? "after #{sleep_time} seconds delay" : 'immediately'

          # Restart command sleeps a given number of seconds and then touches the restart.txt file
          touch_cmd   = sleep_time > 0 ? "sleep #{sleep_time} && " : ''
          touch_cmd  << "touch #{shared_path}/restart.txt && echo [`date`] Restarted Passenger #{time_window}"
          restart_cmd = "nohup sh -c '(#{touch_cmd}) &' >> #{current_release}/log/restart.log 2>&1"

          # Run restart command on a given server
          session.when "server.host == '#{server.host}'", restart_cmd
        end
      end
    end
  end
end

The trickiest part is the parallel block. The parallel method runs all our commands in parallel, but it has a significant limitation: there is no way to substitute command parts on the fly based on the server where the command is going to be executed. So instead we build a condition for each server in the :app role and calculate the time shift based on its index.
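The per-server command construction is easy to see in isolation. This standalone sketch (host names and shared path are made up) prints the condition and the staggered command each server would receive:

```ruby
# Build the staggered restart command for the server at position idx,
# mirroring the logic in the recipe above (shared_path is hypothetical).
def staggered_restart_cmd(idx, delay, shared_path = '/var/www/app/shared')
  sleep_time = idx * delay
  prefix = sleep_time > 0 ? "sleep #{sleep_time} && " : ''
  "#{prefix}touch #{shared_path}/restart.txt"
end

%w(app01 app02 app03).each_with_index do |host, idx|
  puts "server.host == '#{host}'  =>  #{staggered_restart_cmd(idx, 15)}"
end
# server.host == 'app01'  =>  touch /var/www/app/shared/restart.txt
# server.host == 'app02'  =>  sleep 15 && touch /var/www/app/shared/restart.txt
# server.host == 'app03'  =>  sleep 30 && touch /var/www/app/shared/restart.txt
```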

Sometimes it’s necessary to perform an immediate restart (for example, when a database migration breaks old code). We use an environment variable to do this: NOW=1 cap production deploy:smart_restart

2. Generating deployment stages on the fly in multi-stage environments

At Scribd we use a single QA box for testing, with multiple applications configured on it. The only difference between the corresponding deployment scripts is the application path (e.g. /var/www/apps/qa/01, /var/www/apps/qa/02, etc.) So how do we keep them DRY? Initially we created a single deployment stage called qa and deployed with cap qa deploy QAID=1. It worked, but it smelled bad. Today’s version is much more elegant, but it took some effort to implement:

(1..10).each do |idx|
  qid = '%02d' % idx
  name = "qa#{qid}"
  stages << name

  desc "Set the target stage to `#{name}'."
  task(name) do
    location = fetch(:stage_dir, "config/deploy")
    set :stage, :qa
    set :qa_id, qid
    load "#{location}/qa"
  end
end
# This is the tricky part. We need to re-define the multistage:ensure callback
# (which simply raises an exception), so it will not be executed for our newly
# defined stages.
if callbacks[:start]
  idx = callbacks[:start].index { |callback| callback.source == 'multistage:ensure' }
  callbacks[:start].delete_at(idx)
  on :start, 'multistage:ensure', :except => stages + ['multistage:prepare']
end

In the qa stage script we set the :deploy_to variable based on :qa_id. Now we can deploy using cap qa01 deploy. I leave the implementation of cap qa deploy, which selects a free QA box and then performs the deploy there, up to you (check Hint 4: Deploy locks, which explains how to prevent QA boxes from being stolen by overwriting deployments, using a simple locking technique).
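The shared stage file might look something like this (the paths and host name are hypothetical; :qa_id is set by the generated task before this file is loaded):

```ruby
# config/deploy/qa.rb -- loaded by every generated qa01..qa10 task.
# The generated task sets :qa_id before loading this file, so we can
# interpolate it straight into the deployment path.
set :deploy_to, "/var/www/apps/qa/#{fetch(:qa_id)}"

role :app, 'qa.example.com'
role :web, 'qa.example.com'
role :db,  'qa.example.com', :primary => true
```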

3. Campfire notifications

This is the most straightforward and easy to implement feature:

begin
  gem 'tinder', '>= 1.4.0'
  require 'tinder'
rescue LoadError => e
  puts "Load error: #{e}"
  abort "Please update tinder, your version is out of date: 'gem install tinder -v 1.4.0'"
end

namespace :campfire do
  desc "Send a message to the campfire chat room"
  task :snitch do
    campfire = Tinder::Campfire.new 'SUBDOMAIN', :ssl => true, :token => 'YOUR_TOKEN'
    room = campfire.find_room_by_name 'YOUR ROOM'
    snitch_message = fetch(:snitch_message) { ENV['MESSAGE'] || abort('Campfire snitch message is missing. Use set :snitch_message, "Your message"') }
    room.speak(snitch_message)
  end

  desc "Send a message to the campfire chat room about the deploy start"
  task :snitch_begin do
    set :snitch_message, "BEGIN DEPLOY [#{stage.upcase}]: #{ENV['USER']}, #{branch}/#{real_revision[0, 7]} to #{deploy_to}"
    snitch
  end

  desc "Send a message to the campfire chat room about the deploy end"
  task :snitch_end do
    set :snitch_message, "END DEPLOY [#{stage.upcase}]: #{ENV['USER']}, #{branch}/#{real_revision[0, 7]} to #{deploy_to}"
    snitch
  end

  desc "Send a message to the campfire chat room about the rollback"
  task :snitch_rollback do
    set :snitch_message, "ROLLBACK [#{stage.upcase}]: #{ENV['USER']}, #{latest_revision[0, 7]} to #{previous_revision[0, 7]} on #{deploy_to}"
    snitch
  end
end

#############################################################
# Hooks
#############################################################

before :deploy do
  campfire.snitch_begin unless ENV['QUIET'].to_i > 0
end

after :deploy do
  campfire.snitch_end unless ENV['QUIET'].to_i > 0
end

before 'deploy:rollback', 'campfire:snitch_rollback'

To deploy without notifications use cap production deploy QUIET=1 (but be careful, usually it’s not a good idea).

4. Deploy locks

Sometimes it’s useful to lock deploys to a specific stage. The most common reason is that you pushed a heavy migration to master and want to run it yourself before the actual deploy, or you are performing maintenance on production servers and want to be sure nobody will interfere with your work.

namespace :deploy do
  desc "Prevent other people from deploying to this environment"
  task :lock, :roles => :web do
    check_lock
    msg = ENV['MESSAGE'] || ENV['MSG'] ||
          fetch(:lock_message, 'Default lock message. Use MSG=msg to customize it')
    timestamp = Time.now.strftime("%m/%d/%Y %H:%M:%S %Z")
    lock_message = "Deploys locked by #{ENV['USER']} at #{timestamp}: #{msg}"
    put lock_message, "#{shared_path}/system/lock.txt", :mode => 0644
  end

  desc "Check if deploys are OK here or if someone has locked down deploys"
  task :check_lock, :roles => :web do
    # We use echo in the end to reset exit code when lock file is missing
    # (without it deployment will fail on this command — not exactly what we expected)
    data = capture("cat #{shared_path}/system/lock.txt 2>/dev/null;echo").to_s.strip

    if data != '' and !(data =~ /^Deploys locked by #{ENV['USER']}/)
      logger.info "\e[0;31;1mATTENTION:\e[0m #{data}"
      if ENV['FORCE']
        logger.info "\e[0;33;1mWARNING:\e[0m You have forced the deploy"
      else
        abort 'Deploys are locked on this machine'
      end
    end
  end

  desc "Remove the deploy lock"
  task :unlock, :roles => :web do
    run "rm -f #{shared_path}/system/lock.txt"
  end
end

before :deploy, :roles => :web do
  deploy.check_lock
end

Now you can use cap production deploy:lock MSG="Running heavy migrations".
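The heart of check_lock is the decision of whether the lock content should block the deploy: a lock created by the current user is treated as non-blocking. A minimal standalone illustration of that predicate (user names are hypothetical):

```ruby
# Returns true when the lock file content should block a deploy:
# non-empty and written by someone other than the current user.
def blocking_lock?(data, user)
  data = data.to_s.strip
  data != '' && data !~ /^Deploys locked by #{Regexp.escape(user)}\b/
end

puts blocking_lock?('', 'alice')                                           # => false (no lock)
puts blocking_lock?('Deploys locked by alice at 01/01/2020: mine', 'alice') # => false (own lock)
puts blocking_lock?('Deploys locked by bob at 01/01/2020: stop!', 'alice')  # => true
```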

5. Generating servers list on the fly

Another interesting and sometimes pretty useful trick is to fetch the list of servers for a deploy from an external service. For example, you have an application cloud and do not want to change your deployment script every time you add, remove, or disable a node. Well, I have good news for you: it’s easy!

namespace :deploy do
  task :set_nodes_from_remote_resource do
    # Here you will fetch the list of servers from somewhere
    nodes = %w(app01 app02 app03)

    # Clear servers lists of :app and :db roles
    roles[:app].clear
    roles[:db].clear

    # Fill :app role servers lists
    nodes.each do |node|
      parent.role :app, node
    end

    # First server in list is a primary node and db node (to run migrations)
    primary = roles[:app].first
    primary.options[:primary] = true
    roles[:db].push(primary)

    # Show information in log about where we are going to deploy to
    nodes_to_deploy = roles[:app].servers.map do |server|
      opts = server.options[:primary] ? ' (primary, db)' : ''
      "#{server.host}#{opts}"
    end.join(', ')

    logger.info "Deploying to #{nodes_to_deploy}"
  end
end

on :start, 'deploy:set_nodes_from_remote_resource'
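To actually fetch the list, any transport will do. Here is a hedged sketch of the parsing half, assuming an internal inventory service that returns a JSON array of host names (the helper name and response format are made up); it refuses to proceed on an empty list so a bad response cannot silently clear all deployment roles:

```ruby
require 'json'

# Hypothetical helper: turn an inventory service response (assumed
# to be a JSON array of host names) into the node list used above.
def nodes_from_inventory(body)
  nodes = JSON.parse(body)
  raise 'Empty node list -- refusing to deploy' if nodes.empty?
  nodes
end

puts nodes_from_inventory('["app01","app02","app03"]').inspect
# => ["app01", "app02", "app03"]
```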

When you run cap production deploy, something like this will be printed to your console:

    triggering start callbacks for `deploy'
  * executing `deploy:set_nodes_from_remote_resource'
 ** Deploying to app01 (primary, db), app02, app03

That’s all for today. Deployment automation can be a really tricky task, but with the right tool it turns out to be a pleasure. Do you have any questions, suggestions, or some other example deployment recipes? Do me a favor and put them in a comment! Also, I have (surprise!) a Twitter account @kpumuk, and you simply must follow me there. No excuses!

2 Responses to this entry


michael
said on September 14, 2011 at 5:08 am
data = capture("cat #{shared_path}/system/lock.txt 2>/dev/null;echo").to_s.strip

That is a dirty line of code – I think you want to check if the file exists first, and do an exit 1; at the end. Something similar to this:

data = capture("if [ -f #{shared_path}/system/lock.txt ]; then cat #{shared_path}/system/lock.txt; else exit 1;fi")

I have not tested, but that should get you just about what you were going for.

said on November 4, 2011 at 5:10 pm

There is no reason to fail if the file does not exist. It is not a failure situation, so it’s totally OK to treat both a missing file and an empty file as no lock.
