Mirroring a Drupal Website

The client's website was happily running in-house on a server that we had built, and nightly backups were being copied off-site to a secure data center in a far-off land, but they wanted to know what more we could do to provide redundancy in the event that the web server came under attack. We decided that it would be nice to have a second, completely separate web server loaded up and ready to take over at any time, should the primary server become unavailable.

I set out to build a mirror server - a running, fully functional website that can be updated with all of the changes to the main website once a day, or more frequently if desired. There are some features in Drush - the Drupal command line tool - that can help with this, but my ISP only offers Drush support with a much more expensive dedicated server hosting account, and I had promised that this could be done inexpensively. It seemed that a few BASH shell scripts should handle the task well.

It turns out that there are several problems with simply copying one Drupal site over another. The site files for a particular website live inside an instance of Drupal core, which can serve many sites in what is called multi-site mode. If you copy over the whole thing, you risk clobbering any other websites that may be hosted on the same Drupal instance. Besides that, the configuration files that allow Drupal to run will likely differ between the two servers, and we don't want to clobber those, either.

In addition to differences within each Drupal instance, the .htaccess file that I use to direct web requests to the right resource is different on each server.

We also have to deal with server-specific entries within each site database, keep versions of the database in sync with the files they expect to use on disk, and finally, be proactive about setting file permissions on files for the proper user accounts on the server.

So, we have specific settings on the server level, the Drupal core instance level, and at the site level (both files and data). To preserve these differences while importing new files and data, I built a hierarchy on disk to reflect the structure of the information: server level, Drupal instance level, and site level. A separate configuration file is used to define all of the settings for each separate Drupal site.
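To make the three levels concrete, here is a minimal sketch of how such a hierarchy might be laid out on disk. The function name and every directory name are my own assumptions for illustration, not the layout used by the actual scripts.

```shell
#!/bin/bash
# Sketch only: the directory names below are assumptions, not the original layout.
# One directory per level keeps server, instance and site settings separate.
make_hierarchy() {
    local root="$1" site="$2"
    mkdir -p "$root/server" \
             "$root/instance" \
             "$root/sites/$site/files" \
             "$root/sites/$site/db" \
             "$root/sites/$site/backup_migrate"
}
```

Calling `make_hierarchy /var/mirror www.example.com` (both arguments hypothetical) would create a staging tree that later steps can fill one level at a time.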

When the mirror script is started, whether by cron or manually, the first thing we want to do is to get a good snapshot of the current state of our website, and let the web server get back to serving site visitors. I temporarily hide the site so that we won't have an intervening web request making changes to the files or database as we make our copy, then use rsync to disassemble the site into the hierarchy on disk on the source server. We make a full database dump for the site and save it into the hierarchy. I also move the site's Backup and Migrate files to separate storage in the hierarchy so that the default Backup and Migrate file storage location can be kept free on the destination server.
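The snapshot step might look roughly like the sketch below. It assumes the site is hidden by a marker file that the web server configuration checks for (the name `.mirror-in-progress` is hypothetical), that the database is MySQL, and that `rsync` and `mysqldump` are on the path. The function name and arguments are illustrative only.

```shell
#!/bin/bash
# Sketch only: the marker file, function name and arguments are assumptions.
snapshot_site() {
    local site_dir="$1" stage_dir="$2" db_name="$3"
    local flag="$site_dir/.mirror-in-progress"   # hypothetical "site hidden" marker

    mkdir -p "$stage_dir/files" "$stage_dir/db" || return 1
    touch "$flag" || return 1                    # hide the site

    local status=0
    # Copy the site files into the staging hierarchy, leaving the marker behind.
    rsync -a --delete --exclude '.mirror-in-progress' "$site_dir/" "$stage_dir/files/" || status=1
    # Full dump of the site database, saved next to the files.
    mysqldump --single-transaction "$db_name" > "$stage_dir/db/$db_name.sql" || status=1

    rm -f "$flag"                                # unhide, even after a failure
    return "$status"
}
```

Removing the marker regardless of the outcome keeps a failed copy from leaving the live site hidden; the return status still lets the caller skip the transfer.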

At that point, we can unhide the site to users and take our time transferring the mirror files to the destination server. To make the transfer, I call a script which is run by a user with just enough permissions to get the files across. Once the transfer is completed, we set a flag on the destination server, rotate our log files and send an email to the administrator with details of the mirror run.

On the destination server, cron launches a BASH script every three hours to check the flag for new mirrored files, which have already been transferred to the local disk. If the flag is set, we rebuild the destination site, and optionally its Drupal instance, using the new files.
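The flag check itself can be very small. The sketch below assumes the flag is simply a file created by the source server after a successful transfer; the function name, flag path and rebuild command are all hypothetical.

```shell
#!/bin/bash
# Sketch only: assumes the flag is a plain file; all names are hypothetical.
# A crontab entry such as "0 */3 * * * /path/to/check-mirror.sh" would run
# a script built around this function every three hours.
run_if_flagged() {
    local flag="$1"
    shift
    [ -f "$flag" ] || return 0   # nothing new has arrived: exit quietly
    "$@" || return 1             # rebuild failed: keep the flag for the next run
    rm -f "$flag"                # rebuild succeeded: clear the flag
}
```

For example, `run_if_flagged /var/mirror/new-files.flag /usr/local/bin/rebuild-site.sh` would only rebuild when fresh files are waiting, and would retry on the next cron run if the rebuild failed.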

But the first thing we do is to save the local site, instance and server configuration files and certain identifying files, such as the logo and favicon, so that we will have those to add back to the site when we are done. We hide the site as before to prevent unexpected results should a visitor request a page as we are reloading, then use rsync locally to update the site and its supporting files. The database dump is then processed to reflect its new location, cache data is truncated and the dump is imported.
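Processing the dump could be done with a `sed` filter along the lines of the sketch below. It assumes a default `mysqldump` text dump, unprefixed Drupal cache tables whose names start with `cache`, and that the only server-specific value is the hostname; the function name is hypothetical. One caveat worth knowing: a plain text substitution can corrupt PHP-serialized values stored in the database, because their recorded string lengths no longer match once the hostname changes length.

```shell
#!/bin/bash
# Sketch only: reads a SQL dump on stdin and writes the adjusted dump to stdout.
# Assumes cache tables are named cache, cache_menu, cache_page and so on.
prepare_dump() {
    local src_host="$1" dst_host="$2" escaped
    # Escape the dots so the hostname is matched literally by sed.
    escaped=$(printf '%s' "$src_host" | sed 's/\./\\./g')
    # Drop cache rows (the CREATE TABLE statements stay), then rewrite the hostname.
    sed -e '/^INSERT INTO `cache/d' \
        -e "s/$escaped/$dst_host/g"
}
```

It would be used in a pipeline such as `prepare_dump www.example.com www2.example.com < site.sql | mysql drupal_db`, where the file and database names are placeholders.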

In less than a minute, the site comes back online and the next site visitor can see that www2.example.com is a mirror image of www.example.com.
