Create Incremental Snapshot-style Backups With rSync And SSH

Submitted by sjau (Contact Author) (Forums) on Sat, 2006-08-12 11:15. :: Backup

Creating Incremental Snapshot-style Backups With rSync And SSH

Author: Stephan Jau

Revision: v1.5

Last Change: Jan 16 2007

Based upon the works of: Falko Timme <ft [at] falkotimme [dot] com> & Mike Rubel <webmaster [at] www [dot] mikerubel [dot] org>

Introduction

Neither humans nor computers are perfect: humans err and computers fail. A good backup system limits the damage when a machine goes down, whether because the hard drive is failing, because of an intrusion, or because you accidentally deleted something important.

In this tutorial I will show you how to automate backups in an incremental snapshot-style way by using rSync.

1. Setting up rSync over SSH

First of all, you need a running rsync server and client that can connect to each other without being asked for a password. It is better still to run rsync through SSH, since you might transfer sensitive data. Falko Timme has already written an excellent howto on this: Mirror Your Web Site With rsync.
Since that howto already covers the subject well, there is no point in writing another one. Follow it up to Step 6 (6 Test rsync On mirror.example.com) and test whether your setup works.
Unlike in the previous version of this tutorial, only the mirror/backup server initiates a backup here, but I will also show how to make a backup on the same computer (to a different partition, disk drive, USB stick, ...).
Note: In my case I back up my data to a friend's server.

2. Backups without auto-deletion

In this setup I will show you how to keep making backups without old backups being deleted. For this setup it is mandatory that the mirror/backup server can access the production server without being prompted for a password.

Once you have ensured that your mirror/backup server can connect to your production server without being asked for a password, all you need is a small shell script and a cron job to actually perform the backup.
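A quick way to confirm the passwordless login is to ask ssh to fail rather than prompt. The following is only a sketch: the function name can_login_without_password is made up for this example, and the key path simply mirrors the KEY default used later in backup.sh. BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts):
# returns 0 only if a key-based login succeeds without any prompt.
can_login_without_password() {
    target="$1"                        # e.g. root@production.server.com
    key="${2:-/root/.ssh/id_rsa}"      # assumed default, same as KEY in backup.sh
    # BatchMode=yes makes ssh give up instead of asking for a password
    ssh -i "$key" -o BatchMode=yes -o ConnectTimeout=10 "$target" true 2>/dev/null
}

# Example call, run on the backup server:
#   can_login_without_password root@production.server.com && echo "key login OK"
```

If the call fails, a cron-driven backup would fail as well, so it is worth fixing the key setup before going on.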

3. Adding the shell scripts for backup capability

To get our backup system running, we need two shell scripts. The first, which I have called backup.sh, runs on the backup/mirror server and initiates the whole backup. The second is located on the production server. All it does is create backups of your MySQL databases (if you want, you can alter it to back up PostgreSQL or some other database; once you understand how it all works, that is pretty simple to achieve).

Place the backup.sh script somewhere on your backup/mirror server and adjust, if necessary, the user and path variables.

backup.sh (backup shell script)

#!/bin/bash
unset PATH


# USER VARIABLES
BACKUPDIR=/backup							# Folder on the backup server where the backups shall be located
KEY=/root/.ssh/id_rsa						# SSH key
MYSQL_BACKUPSCRIPT=/root/my_backup.sh		# Path to the remote mysql backup script
PRODUCTION_USER=root@production.server.com	# The user and the address of the production server
EXCLUDES=/backup/backup_exclude				# File containing the excluded directories
DAYS=60										# The number of days after which old backups will be deleted


# PATH VARIABLES
SH=/bin/sh									# Location of the sh bin on the production server!!!!

CP=/bin/cp;									# Location of the cp bin
FIND=/usr/bin/find;							# Location of the find bin
ECHO=/bin/echo;								# Location of the echo bin
MK=/bin/mkdir;								# Location of the mk bin
SSH=/usr/bin/ssh;							# Location of the ssh bin
DATE=/bin/date;								# Location of the date bin
RM=/bin/rm;									# Location of the rm bin
GREP=/bin/grep;								# Location of the grep bin
MYSQL=/usr/bin/mysql;						# Location of the mysql bin
MYSQLDUMP=/usr/bin/mysqldump;				# Location of the mysql_dump bin
RSYNC=/usr/bin/rsync;						# Location of the rsync bin
TOUCH=/bin/touch;							# Location of the touch bin



##                                                      ##
##      --       DO NOT EDIT BELOW THIS HERE     --     ##
##                                                      ##



# CREATING NECESSARY FOLDERS (-p: no error if a folder already exists)
$MK -p $BACKUPDIR
CURRENT=$BACKUPDIR/current
OLD=$BACKUPDIR/old
$MK -p $CURRENT
$MK -p $OLD
# CREATING CURRENT DATE / TIME
NOW=`$DATE '+%Y-%m-%d_%H:%M'`
NOW=$OLD/$NOW
$MK -p $NOW


# CREATE REMOTE MYSQL BACKUP BY RUNNING THE REMOTE BACKUP SCRIPT
$SSH -i $KEY $PRODUCTION_USER "$SH $MYSQL_BACKUPSCRIPT"


# RUN RSYNC INTO CURRENT
$RSYNC															\
        -apvz --delete --delete-excluded						\
        --exclude-from="$EXCLUDES"								\
        -e "$SSH -i $KEY"										\
        $PRODUCTION_USER:/										\
		$CURRENT ;


# UPDATE THE MTIME TO REFLECT THE SNAPSHOT TIME
$TOUCH $BACKUPDIR/current


# MAKE HARDLINK COPY
$CP -al $CURRENT/* $NOW


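# REMOVE OLD BACKUPS
# (command substitution left unquoted so that each folder becomes its own loop item)
for FILE in $( $FIND $OLD -mindepth 1 -maxdepth 1 -type d -mtime +$DAYS )
do
	:	# no-op, keeps the loop valid while both lines below are commented out
#	$RM -Rf $FILE
#   $ECHO $FILE
done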
exit 0

Explanations:

#!/bin/bash
unset PATH


# USER VARIABLES
BACKUPDIR=/backup							# Folder on the backup server
KEY=/root/.ssh/id_rsa						# SSH key
MYSQL_BACKUPSCRIPT=/root/my_backup.sh		# Path to the remote mysql backup script
PRODUCTION_USER=root@production.server.com	# The user and the address of the production server
EXCLUDES=/backup/backup_exclude				# File containing the excluded directories
DAYS=60										# The number of days after which old backups will be deleted


# PATH VARIABLES
SH=/bin/sh									# Location of the sh bin on the production server!!!!

CP=/bin/cp;									# Location of the cp bin
FIND=/usr/bin/find;							# Location of the find bin
ECHO=/bin/echo;								# Location of the echo bin
MK=/bin/mkdir;								# Location of the mk bin
SSH=/usr/bin/ssh;							# Location of the ssh bin
DATE=/bin/date;								# Location of the date bin
RM=/bin/rm;									# Location of the rm bin
GREP=/bin/grep;								# Location of the grep bin
MYSQL=/usr/bin/mysql;						# Location of the mysql bin
MYSQLDUMP=/usr/bin/mysqldump;				# Location of the mysql_dump bin
RSYNC=/usr/bin/rsync;						# Location of the rsync bin
TOUCH=/bin/touch;							# Location of the touch bin

Just set the variables above to match your system. Not much explanation is needed, I think.

# CREATING NECESSARY FOLDERS (-p: no error if a folder already exists)
$MK -p $BACKUPDIR
CURRENT=$BACKUPDIR/current
OLD=$BACKUPDIR/old
$MK -p $CURRENT
$MK -p $OLD
# CREATING CURRENT DATE / TIME
NOW=`$DATE '+%Y-%m-%d_%H:%M'`
NOW=$OLD/$NOW
$MK -p $NOW

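This will create the necessary folders: "current" for the mirror, "old" for the snapshots, and one dated folder inside "old" for this run.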
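The dated folder is filled later by the "cp -al" step of backup.sh, which creates hardlinks instead of copying data. The sketch below shows, on a throwaway directory, why such a hardlink copy behaves like a snapshot. It assumes GNU cp and stat; the file name and the way the change is emulated (write a new file, then rename it over the old one, which is how rsync updates files by default) are illustrative only.

```shell
#!/bin/bash
set -e

WORK=$(mktemp -d)                 # scratch area, stands in for /backup
CURRENT=$WORK/current
OLD=$WORK/old
mkdir -p "$CURRENT" "$OLD"
echo "version 1" > "$CURRENT/file.txt"

# Take a snapshot the same way backup.sh does
SNAP=$OLD/$(date '+%Y-%m-%d_%H:%M')
mkdir -p "$SNAP"
cp -al "$CURRENT"/* "$SNAP"

# Both names refer to the same inode, so the snapshot costs no extra data space
INODE_CURRENT=$(stat -c %i "$CURRENT/file.txt")
INODE_SNAP=$(stat -c %i "$SNAP/file.txt")
echo "inodes: $INODE_CURRENT $INODE_SNAP"

# Emulate an rsync update: new file written, then renamed over the old name
echo "version 2" > "$CURRENT/file.txt.new"
mv "$CURRENT/file.txt.new" "$CURRENT/file.txt"

cat "$SNAP/file.txt"              # the snapshot still holds the old content
cat "$CURRENT/file.txt"           # the mirror holds the new content
# Remove "$WORK" by hand when you are done experimenting
```

Only files that changed between two runs take up new space; unchanged files are shared between "current" and every snapshot.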

# CREATE REMOTE MYSQL BACKUP BY RUNNING THE REMOTE BACKUP SCRIPT
$SSH -i $KEY $PRODUCTION_USER "$SH $MYSQL_BACKUPSCRIPT"

This will run the MySQL backup script on the production server.

# RUN RSYNC INTO CURRENT
$RSYNC															\
        -apvz --delete --delete-excluded						\
        --exclude-from="$EXCLUDES"								\
        -e "$SSH -i $KEY"										\
        $PRODUCTION_USER:/										\
		$CURRENT ;

This part will get hold of the files on the production server and mirror them into the "current" folder inside $BACKUPDIR.

--delete --delete-excluded

This will delete files and folders in the "current" folder that are no longer on the production server, as well as anything that matches the exclude list.

--exclude-from="$EXCLUDES"
EXCLUDES=/backup/backup_exclude

This file lists the paths that are excluded from the backup. Here is the current content of my file:

/backup/
/bin/
/boot/
/dev/
/lib/
/lost+found/
/mnt/
/opt/
/proc/
/sbin/
/sys/
/tmp/
/usr/
/var/log/
/var/spool/
/var/lib/php4/
/var/lib/mysql/

# REMOVE OLD BACKUPS
for FILE in $( $FIND $OLD -mindepth 1 -maxdepth 1 -type d -mtime +$DAYS )
do
	:	# no-op, keeps the loop valid while both lines below are commented out
#	$RM -Rf $FILE
#   $ECHO $FILE
done
exit 0

This will delete all backups that are older than the number of days given in $DAYS. As you can see, the remove command and the echo command are both commented out. I advise you to check first that the echo prints the correct folders. Do that by changing the line

'#   $ECHO $FILE'
to this
'   $ECHO $FILE'
Once you are sure it will remove the correct folders, comment that line out again and uncomment the line with the $RM command.
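The dry run can also be rehearsed on a scratch folder first. In this sketch the two snapshot names are invented and one of them is aged artificially with GNU touch; nothing is deleted, the loop only prints what would be removed.

```shell
#!/bin/bash
set -e

WORK=$(mktemp -d)
OLD=$WORK/old                     # stands in for $BACKUPDIR/old
DAYS=60

# Two made-up snapshot folders; the first is made to look 90 days old
mkdir -p "$OLD/2006-01-01_00:00" "$OLD/2006-08-12_00:00"
touch -d '90 days ago' "$OLD/2006-01-01_00:00"

# -mindepth 1 keeps $OLD itself out of the result,
# -mtime +N matches folders last modified more than N days ago
EXPIRED=$(find "$OLD" -mindepth 1 -maxdepth 1 -type d -mtime +"$DAYS")

# Unquoted on purpose: one loop pass per folder (snapshot names contain no spaces)
for DIR in $EXPIRED; do
    echo "would remove: $DIR"
done
```

Only the aged folder should be listed. If the recent one shows up as well, the $DAYS value or the folder timestamps need another look before $RM is enabled.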


my_backup.sh (mysql backup shell script)

This file needs to be on the remote (production) server that you want to back up! It makes backups of your MySQL databases. The script below dumps each database into a separate file that can be restored on its own. If you want to back up all databases into one file, you can use mysqldump --all-databases; please refer to the MySQL documentation.
However, if you want to make a complete backup because you want to set up your server again, I suggest the following: on the production server, stop MySQL (/etc/init.d/mysqld stop), edit the backup_exclude file, delete the line /var/lib/mysql/ from it and save it. Then run the backup.sh script. As a result, the actual database files will be copied over. To restore, you just need to copy them back into /var/lib/mysql (or wherever your MySQL databases are stored). This should not be done while MySQL is running, because the copied files can become corrupted if the database changes while a file is being backed up. Never rely on the file backup when MySQL is running! It is best to have both the MySQL dumps provided by the script AND the actual files when you want to set up your server again.

#!/bin/bash
unset PATH

# USER VARIABLES
MYSQLUSER=root					# The mysql user
MYSQLPWD=*******************	# The mysql user password
MYSQLHOST=localhost				# This should stay localhost
MYSQLBACKUPDIR=/mysql_backup	# A temporary folder where the database dumps will stay (don't worry, they will be mirrored later)

# PATH VARIABLES
MK=/bin/mkdir;								# Location of the mk bin
RM=/bin/rm;									# Location of the rm bin
GREP=/bin/grep;								# Location of the grep bin
MYSQL=/usr/bin/mysql;						# Location of the mysql bin
MYSQLDUMP=/usr/bin/mysqldump;				# Location of the mysql_dump bin


##                                                      ##
##      --       DO NOT EDIT BELOW THIS HERE     --     ##
##                                                      ##


# CREATE MYSQL BACKUP
# Remove existing backup dir - because the previous run already copied these files to our backup server, this is safe to do!
$RM -Rf $MYSQLBACKUPDIR
# Create new backup dir
$MK $MYSQLBACKUPDIR
#Dump new files
for i in $(echo 'SHOW DATABASES;' | $MYSQL -u$MYSQLUSER -p$MYSQLPWD -h$MYSQLHOST|$GREP -v '^Database$'); do
  $MYSQLDUMP                                                    \
  -u$MYSQLUSER -p$MYSQLPWD -h$MYSQLHOST                         \
  -Q -c -C --add-drop-table --add-locks --quick --lock-tables   \
  $i > $MYSQLBACKUPDIR/$i.sql;
done;
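One possible refinement of the database loop, offered as an assumption rather than part of the original script: on MySQL 5.0 and later, SHOW DATABASES also lists virtual schemas such as information_schema (and performance_schema on newer versions), which mysqldump generally cannot dump with --lock-tables. A small filter, here called filter_databases (a made-up name), could drop them together with the header line. The sample list fed to it below is invented for the demonstration.

```shell
#!/bin/bash
# Hypothetical helper: reads database names on stdin, one per line,
# and drops the "Database" header plus MySQL's virtual schemas.
filter_databases() {
    grep -v -x -e 'Database' -e 'information_schema' -e 'performance_schema'
}

# Demonstration with a canned list instead of a live MySQL server
DATABASES=$(printf 'Database\ninformation_schema\nmysql\nshop\n' | filter_databases)
echo "$DATABASES"
```

In my_backup.sh the filter would take the place of the $GREP -v '^Database$' stage in the for loop.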

The last thing needed now is a cron job that runs the backups. You can use something like this:

cron.txt (cron control file)

# Make Backups
0 0,6,12,18 * * * sh /backup/backup.sh

The above would make a backup every 6 hours.

You can add this cron simply by issuing the following command:

crontab cron.txt

Note that this command replaces the whole crontab of the current user, so check first whether other cron jobs are already installed. If so, add them to the cron control file as well. To list the cron jobs of the current user:

crontab -l
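Merging the backup job into an existing set of cron entries can be scripted so that nothing is lost and nothing is added twice. The following is a sketch: add_cron_line is a made-up helper, and the "existing-job" line is only a stand-in for whatever crontab -l prints on your system.

```shell
#!/bin/bash
set -e

# Hypothetical helper: append a line to a cron control file unless it is already there
add_cron_line() {
    file="$1"
    line="$2"
    touch "$file"
    # -x: whole line must match, -F: treat the line as fixed text (it contains *)
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

CRONFILE=$(mktemp)
# On a real system, seed the file with:  crontab -l > "$CRONFILE" 2>/dev/null || true
printf '%s\n' '30 3 * * * /usr/local/bin/existing-job' > "$CRONFILE"   # stand-in entry

add_cron_line "$CRONFILE" '0 0,6,12,18 * * * sh /backup/backup.sh'
add_cron_line "$CRONFILE" '0 0,6,12,18 * * * sh /backup/backup.sh'     # second call changes nothing

cat "$CRONFILE"
# Install the merged file with:  crontab "$CRONFILE"
```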

Well, now enjoy the backups.


Submitted by Anonymous (not registered) on Mon, 2006-08-21 16:25.
Submitted by Anonymous (not registered) on Mon, 2006-08-14 13:48.

There's an open source backup program which does secure remote backups just as you are trying to do here.

The difference is that it only does one job, is designed for it, and does it very well indeed.

Suggest you go here and take a look: http://www.fluffy.co.uk/boxbackup/

Submitted by Anonymous (not registered) on Mon, 2006-08-14 13:12.

I'd have thought it would be best to use mysqldump or equivalent instead of copying the binary files...

 

Submitted by Gibbler (registered user) on Mon, 2006-08-14 18:04.
I use mysqldump...
Submitted by blake (registered user) on Mon, 2007-08-06 22:03.
I personally use the script found on this blog to do my MySQL backups: http://crazytoon.com/2007/01/23/mysql-backups-using-mysqldump/ It seems to work great for my databases.
Submitted by Anonymous (not registered) on Mon, 2006-08-14 08:22.
Why not rdiff-backup? http://www.nongnu.org/rdiff-backup/
Submitted by Gibbler (registered user) on Mon, 2006-08-14 13:17.
For the simple reason that the rdiff packages for Debian 3.1 Sarge and SuSE 9.1 are incompatible with each other...
Submitted by Anonymous (not registered) on Mon, 2006-08-14 08:21.

Interesting article, but isn't this exactly what the 'dirvish' script does, on top of rsync and ssh?

http://www.dirvish.org/ 

Submitted by Anonymous (not registered) on Mon, 2006-08-14 05:20.
Why don't you just install rsnapshot which does all this for you?
Submitted by Gibbler (registered user) on Mon, 2006-08-14 13:19.
rsnapshot is based on the same idea by Mike Rubel as my approach to it.. however I don't like perl, I never did... and I never will... in my opinion perl causes more problems than it does good things and so I try to avoid it wherever possible.
Submitted by Anonymous (not registered) on Mon, 2006-08-14 01:58.
You could just use RSnapshot (www.rsnapshot.org) that automates all of this for you.
Submitted by Anonymous (not registered) on Mon, 2006-08-14 01:05.

cool article!

Submitted by Anonymous (not registered) on Mon, 2006-08-21 20:17.
I back up roughly 40 servers at remote sites using rsync. At each remote site there is an rsyncd running, and I pull everything to the data center, deletes and all, every night, scheduled out of cron. I keep it simple. The rsyncd daemons run on Linux, NT, 2000, 2003. I let a backup program handle the diffs; the Backup Exec client for Linux handles it with no problem.
Submitted by Anonymous (not registered) on Mon, 2006-08-14 01:04.

It is an extremely bad idea to let your production server connect directly to your backup server without a password, as suggested by this tutorial.

The reason is simple: If someone compromises your production server then your backup server is compromised as well.

 This completely destroys one of the main advantages of having a backup. The superior solution is to allow the backup server to connect to your production server (or multiple production machines for that matter). This backup machine should be running no services and should not be accessible through the internet at all.

Submitted by lucidsystems (not registered) on Tue, 2010-07-27 05:10.

Yes this is a very important point. Further details regarding this concept are available from : http://www.lbackup.org/network_backup_strategies
 

Submitted by Gibbler (registered user) on Mon, 2006-08-14 13:22.

In the article I have shown both approaches: the backup machine connecting to the production machine and vice versa.

By putting the shell scripts on the appropriate server and having them run there through ssh, you can alter both scripts to use either method.

However, I agree that the production server should not access the backup server. I think I will alter this a bit and then possibly add the other way as an option.

Thanks for pointing that out. 

Submitted by Anonymous (not registered) on Sun, 2006-08-13 23:45.

http://www.nongnu.org/rdiff-backup/

 Much easier to set up than hand-made scripts

 

Submitted by Anonymous (not registered) on Sun, 2006-08-13 23:10.
An alternative to all these things is a program called "rdiff-backup". It also uses rdiff and can use ssh to create backups across different machines. The only downside is that you need a Python interpreter on your machine, but almost everyone has it nowadays.
Submitted by Anonymous (not registered) on Sun, 2006-08-13 22:49.

I use rdiff-backup to do the same thing more efficiently.

Submitted by Anonymous (not registered) on Sun, 2006-08-13 22:46.

Why not just use something like http://www.nongnu.org/rdiff-backup/ ?

Also, you've hit one of my pet scripting hates; what's the point of things like RM=/bin/rm ?

Just set a sensible PATH and be done with it.

 

Submitted by Anonymous (not registered) on Sun, 2006-08-13 21:02.

 Rsnapshot does everything you mentioned above and more.

http://www.rsnapshot.org/

Submitted by Anonymous (not registered) on Wed, 2012-09-19 15:59.

You can also check this free script I wrote to back up with rsync: http://blog.pointsoftware.ch/index.php/howto-local-and-remote-snapshot-backup-using-rsync-with-hard-links/

It uses file deduplication (hard links), MD5 integrity signatures, 'chattr' protection, filter rules, disk quota, and a retention policy with exponential distribution (backup rotation that keeps more recent backups than older ones).

It has already been used in disaster recovery plans to replicate datacenters, using little network bandwidth through an encrypted tunnel.

It can be used locally on each server or via the network to a central remote backup server. Windows servers can also be backed up by using a Linux box that mounts SMB shares from them.

Submitted by Anonymous (not registered) on Sun, 2006-08-13 20:18.

refer: http://duplicity.nongnu.org/

A backup tool; opensource; written in python; uses ssh/rsync and optional encryption for backup sets.

Submitted by Anonymous (not registered) on Sun, 2006-08-13 19:52.

rdiff-backup is a somewhat more advanced version of this. It does a sync as above, but after the first sync it keeps reverse diffs of changes. Basically, this means you have a local copy of the content you are backing up, and you can also roll back to any backup you've made in the past.

 http://www.nongnu.org/rdiff-backup/

Submitted by Anonymous (not registered) on Sun, 2006-08-13 19:33.

rsnapshot is a nice package that automates all this for you in a simple to use command line tool.  --Nelson