The Cloud Foundry Development Teams use a heavily customized VMware vCenter Server Appliance (VCSA) 5.5. We needed to architect an offsite backup of the VCSA’s databases to avoid days of lost developer time in the event of a catastrophic failure.
This blog post describes how we configured our VCSA to back up its databases nightly to Amazon S3 (Amazon’s cloud storage service).
2014-11-23 We strongly encourage everyone to increase the size of their root filesystem before implementing the backup described here. To assist, we have written a blog post, “Increasing the Size of a VCSA Root Filesystem”. Owners of medium and large vCenter installations (> 1000 VMs) should expand the vCenter’s root filesystem to avoid exhausting the available disk space. We experienced this firsthand on 2014-10-02.
2014-09-07 This blog post has been updated:
- we modified the manner in which the backup is kicked off (we now use /etc/crontab rather than a symbolic link in /etc/cron.daily)
- the backup script vcenter_db_bkup.sh accepts the S3 bucket name as an argument
The cost for offsite storage of the databases? Pennies. 
We chose S3 for several reasons:
- an S3 account had already been set up and billing procedures were in place
- there was a high degree of familiarity with S3 within the team (e.g. we use S3 to store our BOSH Stemcells and other artifacts)
- it didn’t require any additional hardware (e.g. tape drive) or for-pay software (e.g. Arkeia)
- the amount of data we were backing up was small enough to be accommodated by our Internet connection’s bandwidth (100MB would take less than a minute to upload to S3)
VCSA DB Characteristics
Our VCSA databases had the following characteristics:
- The data had a short shelf-life. We didn’t need to keep years of backups, or even months. We decided to keep 30 days, which is probably 25 more days than we needed.
- The data was volatile and needed to be backed up nightly.
- The data had a small footprint: 100MB
- We were not backing up the VCSA itself (~133GB, prohibitively large). In the event of a catastrophe, we would re-install a pristine VCSA and then restore the databases.
- We were not backing up the various VMs that were running on our environment (~8TB currently, definitely prohibitively large).
We were lucky: we didn’t have to worry about the 8TB of VMs that our vCenter was managing. We are a development shop, and almost all the VMs are brought up for testing purposes and often torn down within 24 hours. They are expendable.
1. Preparing S3
1.1 Create S3 Bucket
We need to create a bucket, and we need a unique name (per Amazon: “The bucket name you choose must be unique across all existing bucket names in Amazon S3”). We decide to use the FQDN of our vCenter server as the bucket name, i.e. vcenter.cf.nono.com. We configure the bucket to delete files older than 30 days.
- log into Amazon AWS
- click S3
- click Create Bucket
- Bucket Name: vcenter.cf.nono.com; click Create
- click Lifecycle
- click Add rule
- click Configure Rule (rule applies to whole bucket)
- select: Action on Objects: Permanently Delete Only
- Permanently Delete 30 days after the object’s creation date
- click Review
- Rule Name: Delete after 30 days; click Create and Activate Rule
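For reference, the same 30-day expiration rule can be expressed as a lifecycle configuration document and applied with the AWS CLI. This is a sketch for illustration; we used the console steps above, not the CLI:

```json
{
  "Rules": [
    {
      "ID": "Delete after 30 days",
      "Status": "Enabled",
      "Prefix": "",
      "Expiration": { "Days": 30 }
    }
  ]
}
```

Saved as lifecycle.json, it could be applied with `aws s3api put-bucket-lifecycle-configuration --bucket vcenter.cf.nono.com --lifecycle-configuration file://lifecycle.json`.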
1.2 Create S3 Backup User
Keeping with our theme, we decide to use the FQDN of our vCenter server as the backup user name. We limit the user’s privileges to uploading and downloading to S3.
Interestingly, it’s not the name of the user that is important; it is the credentials of the user. We make sure to download the credentials and store them in a safe place, for we will need them in the next step.
- go to Amazon Console
- click IAM (Secure AWS Access Control)
- click Groups
- click Create New Group
- Group Name: s3-uploaders; click Next Step
- Select Policy Template: Amazon S3 Full Access; click Select
- click Next Step
- click Create Group
- click Users
- click Create New Users
- Enter User Name vcenter.cf.nono.com
- click Create
- click Download Credentials; click Close
- select user vcenter.cf.nono.com
- User Actions → Add User to Groups
- select s3-uploaders; click Add to Groups
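The policy template above grants full access to all of S3. A tighter alternative, which we did not use but which matches the “uploading and downloading only” intent, would be a custom policy scoped to the backup bucket (the bucket name below is ours; substitute your own):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::vcenter.cf.nono.com",
        "arn:aws:s3:::vcenter.cf.nono.com/*"
      ]
    }
  ]
}
```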
2. Configuring the VCSA for Backups
2.1 Install & Configure s3cmd
Download s3cmd and configure it with the credentials created previously. Note that we download an older version of s3tools (1.0.1 instead of the then-current 1.5.0-rc1) because more recent versions require python-dateutil, and we prefer to keep our changes to the VCSA to a minimum.
cd /usr/local/sbin
curl -OL http://tcpdiag.dl.sourceforge.net/project/s3tools/s3cmd/1.0.1/s3cmd-1.0.1.tar.gz
tar xf s3cmd-1.0.1.tar.gz
/usr/local/sbin/s3cmd-1.0.1/s3cmd --configure
  Access Key: AKIAxxxxxxxxxxxxx
  Secret Key: 5C9Gxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  Encryption password: a_really_secret_password
  Path to GPG program [/usr/bin/gpg]:
  Use HTTPS protocol [No]: y
  ...
  Test access with supplied credentials? [Y/n] y
  Please wait...
  Success. Your access key and secret key worked fine :-)
  Now verifying that encryption works...
  Success. Encryption and decryption worked fine :-)
  ...
  Save settings? [y/N] y
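The --configure step writes its answers to ~/.s3cfg (i.e. /root/.s3cfg when run as root, which is also where the cron job will look for it). The relevant lines look roughly like this:

```ini
[default]
access_key = AKIAxxxxxxxxxxxxx
secret_key = 5C9Gxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
gpg_passphrase = a_really_secret_password
use_https = True
```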
2.2 Install, Test, & Link the Backup Script
Download the backup script:
curl -L https://raw.githubusercontent.com/cunnie/bin/master/vcenter_db_bkup.sh -o /usr/local/sbin/vcenter_db_bkup.sh
chmod a+x /usr/local/sbin/vcenter_db_bkup.sh
Now let’s test it. We pass the S3 bucket name, vcenter.cf.nono.com, as a parameter. We also run it with xtrace enabled so that we can watch its progress (the script may take several minutes to run, and its normal execution is silent, so we find it reassuring to follow along):
bash -x /usr/local/sbin/vcenter_db_bkup.sh vcenter.cf.nono.com
We check S3 to verify that our files were uploaded. A lightly-loaded vCenter may have small files (less than 3MB); a Vblock vCenter can easily have files larger than 100MB.
2.3 Configure the /etc/crontab Kick-off Time
We will use /etc/crontab to kick off our backups at 3:25 a.m. PDT. We do not want our backups to occur during our work hours (9 a.m. – 6 p.m. PDT): our continuous integration tests failed when they tried to contact the vCenter while a backup was taking place (the backup script temporarily shuts down the vmware-vpxd and database services while it dumps the databases).
VMware doesn’t allow us to change the timezone in the VCSA (it’s locked to UTC), so instead we convert 3:25 a.m. PDT to UTC (i.e. 10:25 a.m.). We edit /etc/crontab and add the following lines:
# backup the VCSA databases to Amazon AWS S3
25 10 * * * root /usr/local/sbin/vcenter_db_bkup.sh vcenter.cf.nono.com
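The PDT-to-UTC conversion can be double-checked with GNU date (the `TZ="…"` syntax embedded in the date string is a GNU coreutils feature; the date shown is an arbitrary summer date, when Pacific Daylight Time is in effect):

```shell
# 03:25 US Pacific daylight time (UTC-7) is 10:25 UTC
TZ=UTC date -d 'TZ="America/Los_Angeles" 2014-07-15 03:25' '+%H:%M'
# → 10:25
```

Note that cron on the VCSA always runs on UTC, so the entry fires at 10:25 UTC year-round.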
We check the following day to make sure that the database files were uploaded.
We don’t know if what we’re backing up is enough to restore a vCenter; we have never tried to restore a vCenter.
1 At the time of this writing, Amazon charges $0.03 per GB per month. Our current vCenter’s databases size is 191MB, and thirty copies (one each day) works out to roughly 5.7GB, which is less than 18 cents per month. Annual cost? $2.07.
2 Originally we attempted to use a symbolic link in /etc/cron.daily to our backup script; however, that solution proved sub-optimal: the time at which the backup script was kicked off was non-deterministic, which meant it could (and did) kick off during work hours, causing disruption to our developers.
Much of this blog post was based on internal Cloud Foundry documentation.
The VMware Knowledge Base has two excellent articles regarding the backup of VCSAs.