One of the most important parts of the development process is application deployment. There are many tools built to make this process easy and painless: from the simple Inploy to complex all-in-one Chef-based solutions. My tool of choice is Capistrano, a simple and incredibly flexible piece of software. Today I’m going to talk about some advanced Capistrano usage scenarios.
1. Graceful Passenger restarts
The Passenger user guide contains a simple Capistrano recipe for application server restarts. It works pretty well in almost all cases, but there is a huge problem with a multi-server setup: it restarts all Passenger instances at the same time, so all client requests will hang (or even be dropped) for the time needed to start your application. The simplest solution is to restart Passenger instances one by one with some shift in time (for example, 15 seconds; choose this value based on how long it takes to get your application up and running), so at any given moment only one of your application servers will be unavailable. In this case HAProxy (you use it, don’t you?) won’t send any requests to the restarting server, and most of your users will continue their work without any trouble.
Let me show you how we could achieve this:
```ruby
namespace :deploy do
  desc <<-EOF
    Graceful Passenger restarts. By default, it restarts \
    Passenger on all servers with a 15 seconds interval, but \
    this delay could be changed with the smart_restart_delay \
    variable (in seconds). If you specify 0, the restart will be \
    performed on all your servers immediately.

      cap production deploy:smart_restart

    Yet another way to restart Passenger immediately everywhere is \
    to specify the NOW environment variable:

      NOW=1 cap production deploy:smart_restart
  EOF
  task :smart_restart, :roles => :app do
    delay = fetch(:smart_restart_delay, 15).to_i
    delay = 0 if ENV['NOW']

    if delay <= 0
      logger.debug "Restarting Passenger"
      run "touch #{shared_path}/restart.txt"
    else
      logger.debug "Graceful Passenger restart with a #{delay} seconds delay"
      parallel(:roles => :app, :pty => true, :shell => false) do |session|
        find_servers(:roles => :app).each_with_index do |server, idx|
          # Calculate the restart delay for this server
          sleep_time = idx * delay
          time_window = sleep_time > 0 ? "after #{sleep_time} seconds delay" : 'immediately'

          # The restart command sleeps a given number of seconds and then touches the restart.txt file
          touch_cmd = sleep_time > 0 ? "sleep #{sleep_time} && " : ''
          touch_cmd << "touch #{shared_path}/restart.txt && echo [`date`] Restarted Passenger #{time_window}"
          restart_cmd = "nohup sh -c '(#{touch_cmd}) &' 2>&1 >> #{current_release}/log/restart.log"

          # Run the restart command on the matching server only
          session.when "server.host == '#{server.host}'", restart_cmd
        end
      end
    end
  end
end
```
The trickiest part is the `parallel` call. We use it to run all the restart commands in parallel, but it has a significant limitation: there is no way to substitute parts of a command on the fly based on the server it is going to be executed on. So instead we build a `session.when` condition for each server in the :app role and calculate the time shift based on the server’s index.
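Since the task reads the delay with fetch(:smart_restart_delay, 15), it can be tuned per stage. A minimal sketch (the stage file name is just an example):

```ruby
# config/deploy/production.rb: pick a delay that covers your application's startup time
set :smart_restart_delay, 30
```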
Sometimes it’s necessary to perform an immediate restart (for example, when a database migration breaks the old code). We use an environment variable for this: cap production deploy:restart NOW=1
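This works because deploy:restart can simply delegate to smart_restart. The original wiring is not shown in the post, so here is a hypothetical sketch of the glue task:

```ruby
namespace :deploy do
  # Hypothetical glue task (an assumption, not from the original recipe):
  # point the standard restart at the graceful version, so
  # `cap production deploy:restart NOW=1` goes through smart_restart.
  task :restart, :roles => :app do
    smart_restart
  end
end
```

With something like this in place, any hook that triggers deploy:restart gets the graceful behavior for free.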
2. Generating deployment stages on the fly in multi-stage environments
At Scribd we use a single QA box for testing, with multiple applications configured on it. The only difference between the corresponding deployment scripts is the application path (e.g. /var/www/apps/qa/01, /var/www/apps/qa/02, etc.) So how do we keep them DRY? At first we created a single deployment stage called qa and deployed with cap qa deploy QAID=1. It worked, but smelled bad. Today’s version is much more elegant, although it took some effort to implement:
```ruby
(1..10).each do |idx|
  qid = '%02d' % idx
  name = "qa#{qid}"
  stages << name

  desc "Set the target stage to `#{name}'."
  task(name) do
    location = fetch(:stage_dir, "config/deploy")
    set :stage, :qa
    set :qa_id, qid
    load "#{location}/qa"
  end
end

# This is the tricky part. We need to re-define the multistage:ensure callback
# (which simply raises an exception), so it will not be executed for our newly
# defined stages.
if callbacks[:start]
  idx = callbacks[:start].index { |callback| callback.source == 'multistage:ensure' }
  callbacks[:start].delete_at(idx)
  on :start, 'multistage:ensure', :except => stages + ['multistage:prepare']
end
```
In the qa stage script we set the :deploy_to variable based on :qa_id. Now we can deploy using cap qa01 deploy. I leave the implementation of cap qa deploy, which would pick a free QA box and then deploy there, up to you (check Hint 4: Deploy locks, which explains how to prevent QA boxes from being stolen by overwriting deployments, using a simple locking technique).
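For completeness, here is a minimal sketch of what the shared qa stage file could look like; the exact contents are my assumption, only the paths come from the example above:

```ruby
# config/deploy/qa.rb (hypothetical sketch): loaded by the generated
# qa01..qa10 tasks, which set :qa_id before this file is loaded.
set(:deploy_to) { "/var/www/apps/qa/#{qa_id}" }
```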
3. Campfire notifications
This is the most straightforward and easiest-to-implement feature:
```ruby
begin
  gem 'tinder', '>= 1.4.0'
  require 'tinder'
rescue Gem::LoadError => e
  puts "Load error: #{e}"
  abort "Please update tinder, your version is out of date: 'gem install tinder -v 1.4.0'"
end

namespace :campfire do
  desc "Send a message to the campfire chat room"
  task :snitch do
    campfire = Tinder::Campfire.new 'SUBDOMAIN', :ssl => true, :token => 'YOUR_TOKEN'
    room = campfire.find_room_by_name 'YOUR ROOM'
    snitch_message = fetch(:snitch_message) do
      ENV['MESSAGE'] || abort('Campfire snitch message is missing. Use set :snitch_message, "Your message"')
    end
    room.speak(snitch_message)
  end

  desc "Send a message to the campfire chat room about the deploy start"
  task :snitch_begin do
    set :snitch_message, "BEGIN DEPLOY [#{stage.upcase}]: #{ENV['USER']}, #{branch}/#{real_revision[0, 7]} to #{deploy_to}"
    snitch
  end

  desc "Send a message to the campfire chat room about the deploy end"
  task :snitch_end do
    set :snitch_message, "END DEPLOY [#{stage.upcase}]: #{ENV['USER']}, #{branch}/#{real_revision[0, 7]} to #{deploy_to}"
    snitch
  end

  desc "Send a message to the campfire chat room about the rollback"
  task :snitch_rollback do
    set :snitch_message, "ROLLBACK [#{stage.upcase}]: #{ENV['USER']}, #{latest_revision[0, 7]} to #{previous_revision[0, 7]} on #{deploy_to}"
    snitch
  end
end

#############################################################
# Hooks
#############################################################

before :deploy do
  campfire.snitch_begin unless ENV['QUIET'].to_i > 0
end

after :deploy do
  campfire.snitch_end unless ENV['QUIET'].to_i > 0
end

before 'deploy:rollback', 'campfire:snitch_rollback'
```
To deploy without notifications, use cap production deploy QUIET=1 (but be careful: usually it’s not a good idea). You can also post an arbitrary message to the room with cap production campfire:snitch MESSAGE="Your message".
4. Deploy locks
Sometimes it’s useful to lock deploys to a specific stage. The most common reasons: you have pushed a heavy migration to master and want to run it yourself before the actual deploy, or you are performing maintenance on production servers and want to be sure nobody will interfere with your work.
```ruby
namespace :deploy do
  desc "Prevent other people from deploying to this environment"
  task :lock, :roles => :web do
    check_lock

    msg = ENV['MESSAGE'] || ENV['MSG'] || fetch(:lock_message, 'Default lock message. Use MSG=msg to customize it')
    timestamp = Time.now.strftime("%m/%d/%Y %H:%M:%S %Z")
    lock_message = "Deploys locked by #{ENV['USER']} at #{timestamp}: #{msg}"

    put lock_message, "#{shared_path}/system/lock.txt", :mode => 0644
  end

  desc "Check if deploys are OK here or if someone has locked down deploys"
  task :check_lock, :roles => :web do
    # We use echo at the end to reset the exit code when the lock file is missing
    # (without it, the deployment would fail on this command, which is not what we want)
    data = capture("cat #{shared_path}/system/lock.txt 2>/dev/null;echo").to_s.strip

    if data != '' and !(data =~ /^Deploys locked by #{ENV['USER']}/)
      logger.info "\e[0;31;1mATTENTION:\e[0m #{data}"
      if ENV['FORCE']
        logger.info "\e[0;33;1mWARNING:\e[0m You have forced the deploy"
      else
        abort 'Deploys are locked on this machine'
      end
    end
  end

  desc "Remove the deploy lock"
  task :unlock, :roles => :web do
    run "rm -f #{shared_path}/system/lock.txt"
  end
end

before :deploy, :roles => :web do
  deploy.check_lock
end
```
Now you can use cap production deploy:lock MSG="Running heavy migrations". To deploy anyway, pass FORCE=1; to remove the lock, run cap production deploy:unlock.
5. Generating servers list on the fly
Another interesting and sometimes very useful trick is to fetch the list of servers for a deploy from an external service. For example, you have an application cloud and do not want to change your deployment script every time you add, remove, or disable a node. Well, I have good news for you: it’s easy!
```ruby
namespace :deploy do
  task :set_nodes_from_remote_resource do
    # Here you will fetch the list of servers from somewhere
    nodes = %w(app01 app02 app03)

    # Clear the server lists of the :app and :db roles
    roles[:app].clear
    roles[:db].clear

    # Fill the :app role servers list
    nodes.each do |node|
      parent.role :app, node
    end

    # The first server in the list is the primary node and the db node (to run migrations)
    primary = roles[:app].first
    primary.options[:primary] = true
    roles[:db].push(primary)

    # Log information about where we are going to deploy to
    nodes_to_deploy = roles[:app].servers.map do |server|
      opts = server.options[:primary] ? ' (primary, db)' : ''
      "#{server.host}#{opts}"
    end.join(', ')
    logger.info "Deploying to #{nodes_to_deploy}"
  end
end

on :start, 'deploy:set_nodes_from_remote_resource'
```
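The hard-coded nodes list is a placeholder. In practice you would fetch it from your infrastructure service; here is a hypothetical sketch using only Ruby’s standard library (the endpoint URL and the JSON response format are assumptions):

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical inventory endpoint returning a JSON array of hostnames,
# e.g. ["app01", "app02", "app03"]. Replace the URL with your own service.
def fetch_nodes_from_inventory
  response = Net::HTTP.get(URI.parse('http://inventory.example.com/api/app_nodes.json'))
  JSON.parse(response)
end
```

You could then replace the `nodes = %w(app01 app02 app03)` line with `nodes = fetch_nodes_from_inventory`.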
When you run cap production deploy, something like this will be printed to your console:
```
  triggering start callbacks for `deploy'
  * executing `deploy:set_nodes_from_remote_resource'
 ** Deploying to app01 (primary, db), app02, app03
```
That’s all for today. Deployment automation can be a really tricky task, but with the right tool it turns out to be a pleasure. Do you have any questions, suggestions, or other example deployment recipes? Do me a favor and put them in a comment! Also, I have (surprise!) a Twitter account, @kpumuk, and you simply must follow me there. No excuses!
That is a dirty line of code: I think you want to check if the file exists first, and do an exit 1 at the end. I have not tested it, but that should get you just about what you were going for.
There is no reason to fail if the file does not exist. It is not a failure situation, so it’s totally OK to treat both a missing file and an empty file as no lock.
Thank you very much for item #5, generating server lists on the fly. This was a huge help to me!
The only change I needed to make was to add :web role node creation inside the loop that already creates the :app nodes, as sketched below.
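If it helps, that variation could look something like this inside the loop from recipe #5 (an untested sketch based on the comment above):

```ruby
# Register every node for both the :app and :web roles
# (remember to clear roles[:web] as well before refilling it).
nodes.each do |node|
  parent.role :app, node
  parent.role :web, node
end
```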