
By default, libvirt/KVM virtual machines created with Foreman are not set to autostart. This is a slight irritation in that if the host ever crashes, the virtual machines won't start back up automatically. So here's the solution!
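
Under the hood the fix is just libvirt's autostart flag. For a single existing guest you can set it by hand with virsh; a quick sketch (the connection URI and guest name here are placeholders):

# Flag an existing guest to start when the hypervisor boots
virsh -c qemu+ssh://root@kvm01.example.com/system autostart web01.example.com

# Verify -- dominfo should now show "Autostart: enable"
virsh -c qemu+ssh://root@kvm01.example.com/system dominfo web01.example.com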

I first tried to solve this within Foreman's Rails code, but that turned troublesome quite quickly. Foreman uses Fog to manage VMs, and Fog's Libvirt implementation won't let you save a VM that has already been created, giving an error that doing so could result in duplicate machines. How irritating! Compounding this is a bug where a virtual machine created with :autostart => true in its initial hash of settings doesn't actually get created with autostart enabled. I'm guessing this is a libvirt issue, in that for a brand new virtual machine there isn't yet an XML definition that can be symlinked into the autostart directory on the host. The whole symlink-based autostart mechanism in libvirt is quite dubious anyway; you'd think it'd be a setting in the XML file directly... Double boo!

To solve this we'll use the foreman_hooks plugin to talk directly to libvirt and set the VM to autostart. First, set up foreman_hooks based on Foreman's directions. Make sure you add both the hook_functions.sh script and the following script into /usr/share/foreman/config/hooks/host/managed/create/.
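
Getting the files into place looks roughly like this (a sketch assuming a packaged Foreman install with the default paths and that Foreman runs as the foreman user; install the foreman_hooks plugin itself per the Foreman plugin documentation for your version):

# Hook directory for the host "create" event
mkdir -p /usr/share/foreman/config/hooks/host/managed/create

# Copy in the helper functions and the hook script below
# (assumes both files are in the current directory)
cp hook_functions.sh 10_autostart_libvirt.sh /usr/share/foreman/config/hooks/host/managed/create/

# The hook must be executable and readable by Foreman
chmod +x /usr/share/foreman/config/hooks/host/managed/create/10_autostart_libvirt.sh
chown -R foreman:foreman /usr/share/foreman/config/hooks

The hook script itself: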

#!/bin/bash
# Autostart Libvirt VMs created with Foreman
# /usr/share/foreman/config/hooks/host/managed/create/10_autostart_libvirt.sh
# Source: http://www.uberobert.com/autostart-libvirt-vms-in-foreman/

. $(dirname $0)/hook_functions.sh

username='admin'
password='changeme'

# event name (create, before_destroy etc.)
# orchestration hooks must obey this to support rollbacks (create/update/destroy)
event=${HOOK_EVENT}

# to_s representation of the object, e.g. host's fqdn
object=${HOOK_OBJECT}
hostname=$(hook_data host.name)

echo "$(date): setting ${object} to autostart" >> /tmp/hook.log

# find our compute_resource
compute_resource=$(hook_data host.compute_resource_id)

# get compute_resource's url from foreman's api
# (note: $(hostname) below runs the hostname command, i.e. the foreman server,
#  while ${hostname} is the variable holding the new vm's name)
compute_resource_url=$(curl -sSk -u ${username}:${password} "https://$(hostname)/api/compute_resources/${compute_resource}" 2>> /tmp/hook.log | grep -Po '"url":.*?[^\\]",' | awk -F\" '{print $4}')

# autostart our host
virsh -c "${compute_resource_url}" autostart "${hostname}"

exit 0

Note that you need to supply a username and password with API access to compute resources. Foreman doesn't pass the compute resource's URL into the script via the host object, so we need to pull it from the API manually.

And that should be it, have fun! If you have any issues please comment below. I've made a gist of this on GitHub; if you have any recommendations, leave a comment here or on the gist. Thanks!

In the last blog post I wrote I detailed how to send backups directly into S3. You might want to send backups to S3, rather than taking snapshots of your block devices in EC2, so that you can later download those backups and keep some form of your data in-house. In this post I'll detail a basic script that does just that.

Now, in this script I don't really care to download all the backups in my S3 bucket, merely the most recent. So what I'll do is list the objects in the bucket, find the newest one, and then download that file.

Once again, to use these scripts you must first install the aws-sdk Ruby gem from EPEL.

# Install EPEL
wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
wget http://rpms.famillecollet.com/enterprise/remi-release-6.rpm
sudo rpm -Uvh remi-release-6*.rpm epel-release-6*.rpm

# Install Ruby and the required gems
yum install rubygems ruby rubygem-nokogiri rubygem-aws-sdk

Next, here's the actual backup script.

#!/usr/bin/env ruby
# Written by Robert Birnie
# Source: http://www.uberobert.com/download-s3-backups/

require 'rubygems'
require 'aws-sdk'
require 'logger'

AWS.config(
  :access_key_id => '*** Provide your access key ***',
  :secret_access_key => '*** Provide your secret key ***'
)

# Backup Settings

mysql_path = '/backups/web/mysql'
mysql_bucket = 'mysql-backups'

www_path = '/backups/web/www'
www_bucket = 'www-backups'


# Logging

# Monkey patch logger to remove header, else this would signal an alert.
class Logger::LogDevice
  def add_log_header(file)
  end
end

# Remove old log (if present) so a fresh, empty one is created each run
File.delete('/var/log/s3_backuperr.log') if File.exists?('/var/log/s3_backuperr.log')

# Create new log
@log = Logger.new('/var/log/s3_backuperr.log')
@log.level = Logger::WARN

def s3_download(file_name, base, bucket)
  # Get an instance of the S3 interface.
  s3 = AWS::S3.new

  # The S3 object key is just the file's basename.
  key = File.basename(file_name)

  puts "Downloading file #{file_name} from bucket #{bucket}."
  File.open("#{base}/#{file_name}", 'wb') do |file|
    s3.buckets[bucket].objects[key].read do |chunk|
      file.write(chunk)
    end
  end
end

def newest_file(bucket_name)
  files = Hash.new

  s3 = AWS::S3.new
  bucket = s3.buckets[bucket_name]

  bucket.objects.each do |obj|
    files[obj.last_modified] = obj.key
  end

  # The hash is keyed on last_modified and Hash#max compares by key,
  # so this returns the object name paired with the newest timestamp
  files.max[1]
end

def backup(bucket, path)
  begin
    file = newest_file(bucket)
    unless file.empty? or File.exists? "#{path}/#{file}"
      puts "downloading #{file}"
      s3_download(file, path, bucket)
    end
  rescue Exception => e
    @log.error "Issue with backups from #{bucket}"
    @log.error e
    raise e
  end
end

backup(mysql_bucket, mysql_path)
backup(www_bucket, www_path)

And that should be about it!
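
To run the download script on a schedule, a cron entry along these lines works (the script path /usr/local/sbin/s3_download.rb and the log path are just placeholders I've picked for the example):

# /etc/cron.d/s3_download -- pull the newest backups down every night
30 3 * * * root /usr/local/sbin/s3_download.rb >> /var/log/s3_download.log 2>&1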

Edit Mar 27, 2014

I rewrote some of the script to add logging. The log is watched by an Icinga/Nagios check to see how the backups are doing: I use the mtime of the file to tell whether backups are running, and any text in the file counts as an error, which is why I delete the old log on each run.
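
Something along these lines would do for the check itself (a rough sketch; the plugin path and thresholds are examples, adjust for your monitoring setup):

# Age check: critical if the log hasn't been recreated in roughly two days
/usr/lib64/nagios/plugins/check_file_age -w 93600 -c 180000 -f /var/log/s3_backuperr.log

# Content check: any text in the log means an error was logged during the run
test ! -s /var/log/s3_backuperr.log && echo "OK: no backup errors" || { echo "CRITICAL: errors logged in s3_backuperr.log"; exit 2; }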

This blog post will go over the basics of getting automatic backups going from an AWS EC2 instance into an AWS S3 bucket. Storing your backups in S3 is a nice method because you get good network performance keeping the data within AWS, and you can then pull a local backup from the S3 data without affecting server performance or opening any extra ports in your firewall.

For this backup I needed to back up a WordPress site, so it's both the local filesystem and the MySQL database. I kept the scripts separate so that they still work if the MySQL database is on a different server from the web host. The basis of these scripts is from the AWS documentation.

To use these scripts you must first install the aws-sdk Ruby gem from EPEL.

# Install EPEL
wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
wget http://rpms.famillecollet.com/enterprise/remi-release-6.rpm
sudo rpm -Uvh remi-release-6*.rpm epel-release-6*.rpm

# Install Ruby and the required gems
yum install rubygems ruby rubygem-nokogiri rubygem-aws-sdk

Next, here's the actual backup script. Ideally I wouldn't shell out quite so often, but this works so I'm using it for now.

#!/usr/bin/env ruby
#
# Script to back up mysql into an S3 bucket
# /usr/local/sbin/mysql_backups.rb

require 'rubygems'
require 'aws-sdk'

AWS.config(
  :access_key_id => '*** Provide your access key ***',
  :secret_access_key => '*** Provide your secret key ***'
)

username='*** Provide mysql user ***'
password='*** Provide mysql password ***'
ERR_LOG="/var/log/mysql_backuperr.log"

bucket_name = 'mysql-backups'
file_name = "/tmp/#{`date +\%Y\%m\%d`.strip}.mysql.gz"

# Backticks always return a string (which is truthy), so check the
# command's exit status explicitly to see if we can reach mysql at all.
`mysql -u #{username} -p#{password} -e ";" 2>>#{ERR_LOG}`
if $?.success?
  # mysqldump all databases and compress into the temp file
  `/usr/bin/mysqldump --user=#{username} --password=#{password} --max_allowed_packet=1024M --opt --single-transaction --all-databases 2>>#{ERR_LOG} | gzip -c > #{file_name} || logger -t mysql_backups -p local6.err errorexit $?`
  if $?.success?
    # Get an instance of the S3 interface.
    s3 = AWS::S3.new

    # Upload backup file.
    key = File.basename(file_name)
    puts "Uploading file #{file_name} to bucket #{bucket_name}."
    s3.buckets[bucket_name].objects[key].write(:file => file_name)
  end
end

Next, here's the script to back up the local filesystem.

#!/usr/bin/env ruby
#
# Script to back up a folder into an S3 bucket
# /usr/local/sbin/www_backup.rb

require 'rubygems'
require 'aws-sdk'

AWS.config(
  :access_key_id => '*** Provide your access key ***',
  :secret_access_key => '*** Provide your secret key ***'
)

bucket_name = 'www-backups'
file_name = "/tmp/www_#{`date +\%Y\%m\%d`.strip}.tar.gz"

`tar -zcvf #{file_name} /var/www/vhosts`

# Get an instance of the S3 interface.
s3 = AWS::S3.new

# Upload backup file.
key = File.basename(file_name)

puts "Uploading file #{file_name} to bucket #{bucket_name}."
s3.buckets[bucket_name].objects[key].write(:file => file_name)

puts "Removing local copy"
File.delete(file_name)
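
These are meant to be run from cron; a sketch of a possible schedule (the times and log paths are arbitrary examples, the script paths come from the comment headers above):

# /etc/cron.d/s3_backups -- nightly dump and upload
15 1 * * * root /usr/local/sbin/mysql_backups.rb >> /var/log/mysql_backup.log 2>&1
45 1 * * * root /usr/local/sbin/www_backup.rb >> /var/log/www_backup.log 2>&1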

There are a couple of to-do items to improve these scripts that I've not had a chance to test out and use. The first is using JS3tream to stream the backups to S3, which would remove the need to make a local copy of the files before uploading. The next improvement would be enabling encryption of the files in the S3 bucket.
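
As a rough illustration of the encryption idea, if you're willing to shell out to the AWS CLI instead of using the Ruby SDK, server-side encryption is a one-flag change (the file and key names here are just examples):

# Upload with S3 server-side encryption (AES-256)
aws s3 cp /tmp/20140327.mysql.gz s3://mysql-backups/20140327.mysql.gz --sse AES256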

In the next blog post I'll share how to download the backups locally, so you won't lose data when AWS is bombed by evil aliens attempting to get at your blog.

In an environment we often have files or configurations that we want on every node but then need to customize on specific nodes. For example, a customized Ganglia gmond.conf file with different ports based on which cluster the machine belongs to. The most basic method of doing this is a giant case statement within the configuration file, but that gets unwieldy at scale. A great solution for this is a resource collector. Resource collectors let you do a "find and replace" on an already defined resource. In the Ganglia example it will let us define the gmond file once with default cluster settings, and then override its attributes for any node with a more specific cluster.
