HOWTO – LOCAL AND REMOTE SNAPSHOT BACKUP USING RSYNC WITH HARD LINKS

Introduction

Your Linux servers are running smoothly? Fine.
But if something unexpectedly goes wrong, you also need an emergency plan.
And even when everything is going fine, a backup that is stored directly on the same server can still be useful:

  • for example when you need to see what changed in a specific file,
  • or when you want to list the files that were installed or modified by the installation of an application

This article is intended to show you how to set up an open-source solution, using the magic of the famous ‘rsync’ tool and some shell scripting, to deploy a backup system without investing in expensive proprietary software.
Another advantage of a shell script is that you can easily adapt it to your specific needs, for example for your DRP architecture.

The proposed shell script is derived from Mike Rubel’s handy rotating-filesystem-snapshot utility (cf. http://www.mikerubel.org/computers/rsync_snapshots).
It creates backups of your full filesystem (snapshots) with the combined advantages of full and incremental backups:

  • It uses as little disk space as an incremental backup because all unchanged files are hard linked with existing files from previous backups; only the modified files require new inodes.
  • Because hard links are transparent, all backups are directly available and always online for read access with your usual programs; there is no need to extract files from a full archive, and no complicated replay of incremental archives is necessary.

It is capable of doing local (self) backups, and it can also be run from a remote backup server to centralize all backups in a safe place and therefore avoid correlated physical risks.
‘rsync’ features tremendous optimizations of bandwidth usage and transfers only the portions of a file that were changed, thanks to its brilliant algorithm created by Andrew Tridgell (cf. http://bryanpendleton.blogspot.ch/2010/05/rsync-algorithm.html).
‘rsync’ also encrypts the network traffic by running over ‘ssh’.
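The hard-link mechanism is easy to try out yourself. Here is a minimal sketch using throwaway directories under /tmp (the paths and file names are illustrative, not the ones used by the script below):

```shell
# create a tiny source tree and take two snapshots of it
SRC=$(mktemp -d); DST=$(mktemp -d)
echo "unchanged content" >"${SRC}/keep.txt"
rsync -a "${SRC}/" "${DST}/snapshot.002"                                   # first full snapshot
rsync -a --link-dest="${DST}/snapshot.002" "${SRC}/" "${DST}/snapshot.001" # second snapshot, hard-linked
# unchanged files share one inode, i.e. the second snapshot costs almost no extra space:
stat -c %i "${DST}/snapshot.002/keep.txt" "${DST}/snapshot.001/keep.txt"
rm -rf "${SRC}" "${DST}"
```

Both ‘stat’ calls print the same inode number; only files that actually changed between the two runs would get a new inode in snapshot.001.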

The script lets you achieve:

  • local or remote backups with extremely low bandwidth requirements
  • file-level deduplication between backups using hard links (also across servers on the remote backup server)
  • a configurable bandwidth limit to moderate the network and I/O load on production servers
  • a backup retention policy:
    • per-server disk quota restrictions: for example never exceed 50 GB and always keep 100 GB of free disk
    • rotation of backups with a non-linear distribution, based on the idea that recent backups are more useful than older ones, but that sometimes you still need a very old backup
  • filter rules to include or exclude specific patterns of folders and files
  • integrity protection: the backups have a ‘chattr’ read-only protection, and an MD5 integrity signature can also be calculated incrementally
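The non-linear rotation mentioned above can be previewed with a few lines of pure bash that reproduce the thinning rule used later in the script: within each age range (b/2, b] only the first (i.e. oldest) existing snapshot is kept. This simulation just prints numbers and touches no files; it assumes a snapshot exists for every age from 1 to 512:

```shell
#!/bin/bash
# simulate one thinning pass over snapshot numbers 1..512
declare -A dead
for b in 512 256 128 64 32 16 8 4; do
  a=$((b / 2 + 1)); f=0
  for ((i = b; i >= a; i--)); do
    if [ ${f} -eq 0 ]; then f=1; else dead[${i}]=1; fi # keep the first, drop the rest
  done
done
kept=""
for ((i = 1; i <= 512; i++)); do
  [ -z "${dead[${i}]:-}" ] && kept="${kept} ${i}"
done
echo "kept:${kept}"   # prints: kept: 1 2 4 8 16 32 64 128 256 512
```

In steady state the daily +1 renaming shifts these numbers around slightly, which is why the script's help page quotes the similar series 1,2,3,4,8,16,32,64,128,256,512.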

Installation

The snapshot backups are saved into the ‘/backup’ folder.
You can also make ‘/backup’ a symbolic link pointing to another partition with more disk space, for example:

ln -sv /mnt/bigdisk /backup

Then create the folders:

mkdir -pv /backup/snapshot/{$(hostname -s),rsync,md5-log}
[ -h /backup/snapshot/localhost ] || ln -vs $(hostname -s) /backup/snapshot/localhost

Now create the shell-script ‘/backup/snapshot/rsync/rsync-snapshot.sh’ (download rsync-snapshot.sh):

#!/bin/bash
# ----------------------------------------------------------------------
# created by francois scheurer on 20070323
# derivate from mikes handy rotating-filesystem-snapshot utility
# see http://www.mikerubel.org/computers/rsync_snapshots
# ----------------------------------------------------------------------
#rsync note:
#    1) rsync -avz /src/foo  /dest      => ok, creates /dest/foo, like cp -a /src/foo /dest
#    2) rsync -avz /src/foo/ /dest/foo  => ok, creates /dest/foo, like cp -a /src/foo/. /dest/foo (or like cp -a /src/foo /dest)
#    3) rsync -avz /src/foo/ /dest/foo/ => ok, same as 2)
#    4) rsync -avz /src/foo/ /dest      => dangerous!!! overwrite dest content, like cp -a /src/foo/. /dest
#      solution: remove trailing / at /src/foo/ => 1)
#      minor problem: rsync -avz /src/foo /dest/foo => creates /dest/foo/foo, like mkdir /dest/foo && cp -a /src/foo /dest/foo
#    main options:
#      -H --hard-links
#      -x --one-file-system
#      -a equals -rlptgoD (no -H,-A,-X)
#        -r --recursive
#        -l --links
#        -p --perms
#        -t --times
#        -g --group
#        -o --owner
#        -D --devices --specials
#    useful options:
#      -S --sparse
#      -n --dry-run
#      -I --ignore-times
#      -c --checksum
#      -z --compress
#      --bwlimit=X limit I/O bandwidth to X kB/s
#    other options:
#      -v --verbose
#      -y --fuzzy
#      --stats
#      -h --human-readable
#      --progress
#      -i --itemize-changes
#    quickcheck options:
#      the default behavior is to skip files with same size & mtime on destination
#      mtime = last data write access
#      atime = last data read access (can be ignored with noatime mount option or with chattr +A)
#      ctime = last inode change (write access, change of permission or ownership)
#      note that a checksum is always done after a file synchronization/transfer
#      --modify-window=X ignore mtime differences less or equal to X sec
#      --size-only skip files with same size on destination (ignore mtime)
#      -c --checksum skip files with same MD5 checksum on destination (ignore size & mtime, all files are read once, then the list of files to be resynchronized is read a second time, there is a lot of disk IO but network traffic is minimal if many files are identical, log includes only different files)
#      -I --ignore-times never skip files (all files are resynchronized, all files are read once, there is more network traffic than with --checksum but less disk IO, log includes all files)
#      --link-dest does the quickcheck on another reference-directory and makes hardlinks if quickcheck succeeds
#        (however, if mtime is different and --perms is used, the reference file is copied in a new inode)
#    see also this link for a rsync tutorial: http://www.thegeekstuff.com/2010/09/rsync-command-examples/



# ------------- the help page ------------------------------------------
if [ "$1" == "-h" ] || [ "$1" == "--help" ]
then
  cat << "EOF"
Version 2.00 2012-08-31

USAGE: rsync-snapshot.sh HOST [--recheck]

PURPOSE: create a snapshot backup of the whole filesystem into the folder
  '/backup/snapshot/HOST/snapshot.001'.
  If HOST is 'localhost' it is replaced with the local hostname.
  If HOST is a remote host then rsync over ssh is used to transfer the files
  with a delta-transfer algorithm to transfer only minimal parts of the files
  and improve speed; rsync uses for this the previous backup as reference.
  This reference is also used to create hard links instead of files when
  possible and thus save disk space. If original and reference file have
  identical content but different timestamps or permissions then no hard link
  is created.
  A rotation of all backups renames snapshot.X into snapshot.X+1 and removes
  backups with X>512. About 10 backups with non-linear distribution are kept
  in rotation; for example with X=1,2,3,4,8,16,32,64,128,256,512.
  The snapshots folders are protected read-only against all users including
  root using 'chattr'.
  The --recheck option forces a sync of all files even if they have same mtime
  & size; it can be used to verify a backup and fix corrupted files;
  --recheck also recalculates the MD5 integrity signatures without using the
  last signature-file as precalculation.
  Some features like filter rules, MD5, chattr, bwlimit and per server retention
  policy can be configured by modifying the scripts directly.

FILES:
    /backup/snapshot/rsync/rsync-snapshot.sh  the backup script
    /backup/snapshot/rsync/rsync-list.sh      the md5 signature script
    /backup/snapshot/rsync/rsync-include.txt  the filter rules

Examples:
  (nice -5 ./rsync-snapshot.sh >log &) ; tail -f log
  cd /backup/snapshot; for i in $(ls -A); do nice -10 /backup/snapshot/rsync/rsync-snapshot.sh $i; done
EOF
  exit 1
fi




# ------------- tuning options, file locations and constants -----------
SRC="$1" #name of backup source, may be a remote or local hostname
OPT="$2" #options (--recheck)
HOST_PORT=22 #port of source of backup
SCRIPT_PATH="/backup/snapshot/rsync"
SNAPSHOT_DST="/backup/snapshot" #destination folder
NAME="snapshot" #backup name
LOG="rsync.log"
MIN_MIBSIZE=5000 # older snapshots (except snapshot.001) are removed if free disk <= MIN_MIBSIZE. the script may exit without performing a backup if free disk is still short.
OVERWRITE_LAST=0 # if free disk space is too small, then this option lets us remove snapshot.001 as well and retry once
MAX_MIBSIZE=20000 # older snapshots (except snapshot.001) are removed if their size >= MAX_MIBSIZE. the script performs a backup even if their size is too big.
#old: SPEED=5 # 1 is slow, 100 is fast, 100000 faster and 0 does not use slow-down. this allows to avoid rsync consuming too much system performance
BWLIMIT=100000 # bandwidth limit in KiB/s. 0 does not use slow-down. this allows to avoid rsync consuming too much system performance
BACKUPSERVER="rembk" # this server connects to all the others to download their filesystems and create remote snapshot backups
MD5LIST=0 #to compute a list of md5 integrity signatures of all backed-up files, needs 'rsync-list.sh'
CHATTR=1 # to use 'chattr' command and protect the backups against modification and deletion
DU=1 # to use 'du' command and calculate the size of existing backups, disable it if you have many backups and it is getting too slow (for example on BACKUPSERVER)
SOURCE="/" #source folder to backup

HOST_LOCAL="$(hostname -s)" #local hostname
#HOST_SRC="${SRC:-${HOST_LOCAL}}" #explicit source hostname, default is local hostname
if [ -z "${SRC}" ] || [ "${SRC}" == "localhost" ]
then
  HOST_SRC="${HOST_LOCAL}" #explicit source hostname, default is local hostname
else
  HOST_SRC="${SRC}" #explicit source hostname
fi

if [ "${HOST_LOCAL}" == "${BACKUPSERVER}" ] #if we are on BACKUPSERVER then do some fine tuning
then
  MD5LIST=1
  MIN_MIBSIZE=35000 #needed free space for chunk-file tape-arch.sh
  MAX_MIBSIZE=12000
  DU=0 # NB: 'du' is currently disabled on BACKUPSERVER for performance reasons
elif [ "${HOST_LOCAL}" == "${HOST_SRC}" ] #else if we are on a generic server then do some other fine tuning
then
  if [ "${HOST_SRC}" == "ZRHSV-TST01" ]; then MIN_MIBSIZE=500; CHATTR=0; DU=0; MD5LIST=0; fi
fi




# ------------- initialization -----------------------------------------
shopt -s extglob                                            #enable extended pattern matching operators

OPTION="--stats
  --recursive
  --links
  --perms
  --times
  --group
  --owner
  --devices
  --hard-links
  --numeric-ids
  --delete
  --delete-excluded
  --bwlimit=${BWLIMIT}"
#  --progress
#  --size-only
#  --stop-at
#  --time-limit
#  --sparse
if [ "${HOST_SRC}" != "${HOST_LOCAL}" ] #option for a remote server
then
  SOURCE="${HOST_SRC}:${SOURCE}"
  OPTION="${OPTION}
  --compress
  --rsh=\"ssh -p ${HOST_PORT} -i /root/.ssh/rsync_rsa -l root\"
  --rsync-path=\"/usr/bin/rsync\""
fi
if [ "${OPT}" == "--recheck" ]
then
  OPTION="${OPTION}
  --ignore-times"
elif [ -n "${OPT}" ]
then
  echo "Try rsync-snapshot.sh --help ."
  exit 2
fi




# ------------- check conditions ---------------------------------------
echo "$(date +%Y-%m-%d_%H:%M:%S) ${HOST_SRC}: === Snapshot backup is created into ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.001 ==="
STARTDATE=$(date +%s)

# make sure we're running as root
if (($(id -u) != 0))
then
  echo "Sorry, must be root. Exiting..."
  echo "$(date +%Y-%m-%d_%H:%M:%S) ${HOST_SRC}: === Snapshot failed. ==="
  exit 2
fi

# make sure we have a correct snapshot folder
if [ ! -d "${SNAPSHOT_DST}/${HOST_SRC}" ]
then
  echo "Sorry, folder ${SNAPSHOT_DST}/${HOST_SRC} is missing. Exiting..."
  echo "$(date +%Y-%m-%d_%H:%M:%S) ${HOST_SRC}: === Snapshot failed. ==="
  exit 2
fi

# make sure that no other rsync-snapshot.sh or rsync process (started by rsync-cp.sh or by a remote rsync-snapshot.sh) is already running in the background.
if [ "${HOST_LOCAL}" != "${BACKUPSERVER}" ] #because BACKUPSERVER sometimes needs to perform an rsync-cp.sh, it must disable the check of "already started".
then
  RSYNCPID=$(pgrep -f "/bin/bash .*rsync-snapshot.sh")
  if ([ -n "${RSYNCPID}" ] && [ "${RSYNCPID}" != "$$" ]) || pgrep -x "rsync"
  then
    echo "Sorry, rsync is already running in the background. Exiting..."
    echo "$(date +%Y-%m-%d_%H:%M:%S) ${HOST_SRC}: === Snapshot failed. ==="
    exit 2
  fi
fi




# ------------- remove some old backups --------------------------------
# remove certain snapshots to achieve an exponential distribution in time of the backups (1,2,4,8,...)
for b in 512 256 128 64 32 16 8 4
do
  let a=b/2+1
  let f=0 #this flag is set to 1 when we find the 1st snapshot in the range b..a
  for i in $(eval echo $(printf "{%.3d..%.3d}" "${b}" "${a}"))
  do
    if [ -d "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i}" ]
    then
      if [ "${f}" -eq 0 ]
      then
        let f=1
      else
        echo "Removing ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i} ..."
        [ "${CHATTR}" -eq 1 ] && chattr -fR -i "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i}"
        rm -rf "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i}"
      fi
    fi
  done
done

# remove additional backups if free disk space is short
remove_snapshot() {
  local MIN_MIBSIZE2=$1
  local MAX_MIBSIZE2=$2
  for i in {512..001}
  do
    if [ -d "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i}" ] || [ ${i} -eq 1 ]
    then
      let d=0 #disk space used by snapshots and free disk space are ok
      echo -n "$(date +%Y-%m-%d_%H:%M:%S) Checking free disk space... "
      FREEDISK=$(df -B M ${SNAPSHOT_DST} | tail -1 | sed -e 's/  */ /g' | cut -d" " -f4 | sed -e 's/M*//g')
      echo -n "${FREEDISK} MiB free. "
      if [ ${FREEDISK} -ge ${MIN_MIBSIZE2} ]
      then
        echo "Ok, bigger than ${MIN_MIBSIZE2} MiB."
        if [ "${DU}" -eq 0 ]
        then #avoid slow 'du'
          break
        else
          echo -n "$(date +%Y-%m-%d_%H:%M:%S) Checking disk space used by ${SNAPSHOT_DST}/${HOST_SRC} ... "
          USEDDISK=$(du -B 1048576 -s "${SNAPSHOT_DST}/${HOST_SRC}/" | cut -f1)
          echo -n "${USEDDISK} MiB used. "
          if [ ${USEDDISK} -le ${MAX_MIBSIZE2} ]
          then
            echo "Ok, smaller than ${MAX_MIBSIZE2} MiB."
            break
          else
            let d=2 #disk space used by snapshots is too big
          fi
        fi
      else
        let d=1 #free disk space is too small
      fi
      if [ ${d} -ne 0 ] #we need to remove snapshots
      then
        if [ ${i} -ne 1 ]
        then
          echo "Removing ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i} ..."
          [ "${CHATTR}" -eq 1 ] && chattr -fR -i "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i}"
          rm -rf "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i}"
        else #all snapshots except snapshot.001 are removed
          if [ ${d} -eq 1 ] #snapshot.001 causes that free space is too small
          then
            if [ "${OVERWRITE_LAST}" -eq 1 ] #last chance: remove snapshot.001 and retry once
            then
              OVERWRITE_LAST=0
              echo "Warning, free disk space will be smaller than ${MIN_MIBSIZE} MiB."
              echo "$(date +%Y-%m-%d_%H:%M:%S) OVERWRITE_LAST enabled. Removing ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.001 ..."
              rm -rf "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.001"
            else
              for j in ${LNKDST//--link-dest=/}
              do
                if [ -d "${j}" ] && [ "${CHATTR}" -eq 1 ] && [ $(lsattr -d "${j}" | cut -b5) != "i" ]
                then
                  chattr -fR +i "${j}" #undo unprotection that was needed to use hardlinks
                fi
              done
              echo "Sorry, free disk space will be smaller than ${MIN_MIBSIZE} MiB. Exiting..."
              echo "$(date +%Y-%m-%d_%H:%M:%S) ${HOST_SRC}: === Snapshot failed. ==="
              exit 2
            fi
          elif [ ${d} -eq 2 ] #snapshot.001 causes that disk space used by snapshots is too big
          then
            echo "Warning, disk space used by ${SNAPSHOT_DST}/${HOST_SRC} will be bigger than ${MAX_MIBSIZE} MiB. Continuing anyway..."
          fi
        fi
      fi
    fi
  done
}

# perform an estimation of required disk space for the new backup
while : #this loop is executed a 2nd time if OVERWRITE_LAST was ==1 and snapshot.001 got removed
do
  OOVERWRITE_LAST="${OVERWRITE_LAST}"
  echo -n "$(date +%Y-%m-%d_%H:%M:%S) Testing needed free disk space ..."
  mkdir -p "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.test-free-disk-space"
  chmod -R 775 "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.test-free-disk-space"
  cat /dev/null >"${SNAPSHOT_DST}/${HOST_SRC}/${LOG}"
  LNKDST=$(find  "${SNAPSHOT_DST}/" -maxdepth 2 -type d -name "${NAME}.001" -printf " --link-dest=%p")
  for i in ${LNKDST//--link-dest=/}
  do
    if [ -d "${i}" ] && [ "${CHATTR}" -eq 1 ] && [ $(lsattr -d "${i}" | cut -b5) == "i" ]
    then
      chattr -fR -i "${i}" #unprotect last snapshots to use hardlinks
    fi
  done
  eval rsync \
    --dry-run \
    ${OPTION} \
    --include-from="${SCRIPT_PATH}/rsync-include.txt" \
    ${LNKDST} \
    "${SOURCE}" "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.test-free-disk-space" >>"${SNAPSHOT_DST}/${HOST_SRC}/${LOG}"
  RES=$?
  if [ "${RES}" -ne 0 ] && [ "${RES}" -ne 23 ] && [ "${RES}" -ne 24 ]
  then
    echo "Sorry, error in rsync execution (value ${RES}). Exiting..."
    echo "$(date +%Y-%m-%d_%H:%M:%S) ${HOST_SRC}: === Snapshot failed. ==="
    exit 2
  fi
  let i=$(tail -100 "${SNAPSHOT_DST}/${HOST_SRC}/${LOG}" | grep 'Total transferred file size:' | cut -d " " -f5)/1048576
  echo " ${i} MiB needed."
  rm -rf "${SNAPSHOT_DST}/${HOST_SRC}/${LOG}" "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.test-free-disk-space"
  remove_snapshot $((${MIN_MIBSIZE} + ${i})) $((${MAX_MIBSIZE} - ${i}))
  if [ "${OOVERWRITE_LAST}" == "${OVERWRITE_LAST}" ] #no need to retry
  then
    break
  fi
done




# ------------- create the snapshot backup -----------------------------
# perform the filesystem backup using rsync and hard-links to the latest snapshot
# Note:
#   -rsync behaves like cp --remove-destination by default, so the destination
#    is unlinked first.  If it were not so, this would copy over the other
#    snapshot(s) too!
#   -use --link-dest to hard-link when possible with previous snapshot,
#    timestamps, permissions and ownerships are preserved
echo "$(date +%Y-%m-%d_%H:%M:%S) Creating folder ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000 ..."
mkdir -p "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000"
chmod 775 "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000"
cat /dev/null >"${SNAPSHOT_DST}/${HOST_SRC}/${LOG}"
echo -n "$(date +%Y-%m-%d_%H:%M:%S) Creating backup of ${HOST_SRC} into ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000"
if [ -n "${LNKDST}" ]
then
  echo " hardlinked with ${LNKDST//--link-dest=/} ..."
else
  echo " not hardlinked ..."
fi
eval rsync \
  -vv \
  ${OPTION} \
  --include-from="${SCRIPT_PATH}/rsync-include.txt" \
  ${LNKDST} \
  "${SOURCE}" "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000" >>"${SNAPSHOT_DST}/${HOST_SRC}/${LOG}"
RES=$?
if [ "${RES}" -ne 0 ] && [ "${RES}" -ne 23 ] && [ "${RES}" -ne 24 ]
then
  echo "Sorry, error in rsync execution (value ${RES}). Exiting..."
  echo "$(date +%Y-%m-%d_%H:%M:%S) ${HOST_SRC}: === Snapshot failed. ==="
  exit 2
fi
for i in ${LNKDST//--link-dest=/}
do
  if [ -d "${i}" ] && [ "${CHATTR}" -eq 1 ] && [ $(lsattr -d "${i}" | cut -b5) != "i" ]
  then
    chattr -fR +i "${i}" #undo unprotection that was needed to use hardlinks
  fi
done
mv "${SNAPSHOT_DST}/${HOST_SRC}/${LOG}" "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000/${LOG}"
gzip -f "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000/${LOG}"




# ------------- create the MD5 integrity signature ---------------------
# create a gziped 'find'-list of all snapshot files (including md5 signatures)
if [ "${MD5LIST}" -eq 1 ]
then
  echo "$(date +%Y-%m-%d_%H:%M:%S) Computing filelist with md5 signatures of ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000 ..."
  OWD="$(pwd)"
  cd "${SNAPSHOT_DST}"
#  NOW=$(date "+%s")
#  MYTZ=$(date "+%z")
#  let NOW${MYTZ:0:1}=3600*${MYTZ:1:2}+60*${MYTZ:3:2} # convert localtime to UTC
#  DATESTR=$(date -d "1970-01-01 $((${NOW} - 1)) sec" "+%Y-%m-%d_%H:%M:%S") # 'now - 1s' to avoid missing files
  DATESTR=$(date -d "1970-01-01 UTC $(($(date +%s) - 1)) seconds" "+%Y-%m-%d_%H:%M:%S") # 'now - 1s' to avoid missing files
  REF_LIST="$(find ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.001/ -maxdepth 1 -type f -name 'snapshot.*.list.gz' 2>/dev/null)"
  if [ -n "${REF_LIST}" ] && [ "${OPT}" != "--recheck" ]
  then
    REF_LIST2="/tmp/rsync-reflist.tmp"
    gzip -dc "${REF_LIST}" >"${REF_LIST2}"
    touch -r "${REF_LIST}" "${REF_LIST2}"
    ${SCRIPT_PATH}/rsync-list.sh "${HOST_SRC}/${NAME}.000" 0 "${REF_LIST2}" | sort -u | gzip -c >"${HOST_SRC}/${NAME}.${DATESTR}.list.gz"
    rm "${REF_LIST2}"
  else
    ${SCRIPT_PATH}/rsync-list.sh "${HOST_SRC}/${NAME}.000" 0 | sort -u | gzip -c >"${HOST_SRC}/${NAME}.${DATESTR}.list.gz"
  fi
  touch -d "${DATESTR/_/ }" "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${DATESTR}.list.gz"
  cd "${OWD}"
  [ ! -d "${SNAPSHOT_DST}/${HOST_SRC}/md5-log" ] && mkdir -p "${SNAPSHOT_DST}/${HOST_SRC}/md5-log"
  cp -al "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${DATESTR}.list.gz" "${SNAPSHOT_DST}/${HOST_SRC}/md5-log/${NAME}.${DATESTR}.list.gz"
  mv "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${DATESTR}.list.gz" "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000/${NAME}.${DATESTR}.list.gz"
  touch -d "${DATESTR/_/ }" "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000"
fi




# ------------- finish and clean up ------------------------------------
# protect the backup against modification with chattr +immutable
if [ "${CHATTR}" -eq 1 ]
then
  echo "$(date +%Y-%m-%d_%H:%M:%S) Setting recursively immutable flag of ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000 ..."
  chattr -fR +i "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.000"
fi

# rotate the backups
if [ -d "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.512" ] #remove snapshot.512
then
  echo "Removing ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.512 ..."
  [ "${CHATTR}" -eq 1 ] && chattr -fR -i "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.512"
  rm -rf "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.512"
fi
[ -h "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.last" ] && rm -f "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.last"
for i in {511..000}
do
  if [ -d "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i}" ]
  then
    let j=${i##+(0)}+1
    j=$(printf "%.3d" "${j}")
    echo "Renaming ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i} into ${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${j} ..."
    [ "${CHATTR}" -eq 1 ] && chattr -i "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i}"
    mv "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${i}" "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${j}"
    [ "${CHATTR}" -eq 1 ] && chattr +i "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.${j}"
    [ ! -h "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.last" ] && ln -s "${NAME}.${j}" "${SNAPSHOT_DST}/${HOST_SRC}/${NAME}.last"
  fi
done

# remove additional backups if free disk space is short
OVERWRITE_LAST=0 #next call of remove_snapshot() will not remove snapshot.001
remove_snapshot ${MIN_MIBSIZE} ${MAX_MIBSIZE}
echo "$(date +%Y-%m-%d_%H:%M:%S) ${HOST_SRC}: === Snapshot backup successfully done in $(($(date +%s) - ${STARTDATE})) sec. ==="
exit 0
#eof

Then create the file ‘/backup/snapshot/rsync/rsync-include.txt’ (download rsync-include) that contains the include and exclude patterns:

#created by francois scheurer on 20120828
#
#note:
#  -be careful with trailing spaces, '- * ' is different from '- *'
#  -rsync stops at the first matched rule and ignores the rest
#  -rsync descends the folder hierarchy iteratively
#  -'**' also matches zero, one or several '/'
#  -get the list of all root files/folders
#     pdsh -f 1 -w server[1-22] 'ls -la / | sed -e "s/  */ /g" | cut -d" " -f9-' | cut -d" " -f2- | sort -u
#  -include all folders with '+ */' (missing this rule implies that '- *' will override all the inclusions of any subfolders)
#  -exclude all non explicited files with '- *'
#  -exclude everything except /etc/ssh: '+ /etc/ssh/**  + */  - *'
#  -exclude content of /tmp but include foldername: '- /tmp/*  + */'
#  -exclude content and also foldername /tmp: '- /tmp/  + */'
#  -exclude content of each .ssh but include foldername: '- /**/.ssh/*  + */'
#
#include everything except /tmp/:
#- /tmp/
#same but include /tmp/ as an empty folder:
#- /tmp/*
#include only /var/www/:
#+ /var/
#+ /var/www/
#+ /var/www/**
#- *
#same but also include folder structure:
#+ /var/www/**
#+ */
#- *




#pattern list for / (include by default):
+ /

- /lost+found/*
- /*.bak*
- /*.old*
#- /backup/*
#- /boot/*
#- /etc/ssh/ssh_host*
#- /home/*
- /media/*
- /mnt/*/*
#- /opt/*
- /opt/fedora*/data/*
- /opt/fedora*/lucene/*
- /opt/fedora*/tomcat*/logs/*
- /opt/fedora*/tomcat*/temp/*
- /opt/fedora*/tomcat*/work/*
- /postgresql/*/main/pg_log/*
- /postgresql/*/main/pg_xlog/*
- /postgresql/*/main/postmaster.opts
- /postgresql/*/main/postmaster.pid
#- /postgresql/*/main/*/*
- /proc/*
- /root/old/*
#- /root/.bash_history
- /root/.mc/*
#- /root/.ssh/*openssh*
- /root/.viminfo
- /root/tmp/*
#- /srv/*
- /sys/*
- /tmp/*
#- /usr/local/franz/logstat/logstat.log
- /var/cache/*
- /var/lib/mysql/*
- /var/lib/postgresql/*/main/*/*
- /var/log/*
#- /var/spool/*
- /var/tmp/*

#pattern list for /backup/ (exclude by default):
+ /backup/
- /backup/lost+found/*
- /backup/*.bak*
- /backup/*.old*
+ /backup/snapshot/
+ /backup/snapshot/rsync/
+ /backup/snapshot/rsync/**
- /backup/snapshot/*
- /backup/*
+ /mnt/
+ /mnt/*/
+ /mnt/*/backup/
+ /mnt/*/backup/snapshot/
+ /mnt/*/backup/snapshot/rsync/
+ /mnt/*/backup/snapshot/rsync/**
- /mnt/*/backup/snapshot/
- /mnt/*/backup/

#pattern list for /boot/ (include by default):
+ /boot/
- /boot/lost+found/*
- /boot/*.bak*
- /boot/*.old*
+ /boot/**

#pattern list for /home/ (include by default):
+ /home/
- /home/lost+found/*
- /home/*.bak*
- /home/*.old*
- /home/xen/*
+ /home/**

#include folder structure by default:
#+ */
#include everything by default:
+ *
#exclude everything by default:
#- *
#eof
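Before trusting a rule file for real backups, you can check what it matches by running rsync in dry-run mode against a scratch tree. The toy rule set below (‘- /tmp/*’ plus the catch-all ‘+ *’) only illustrates the patterns explained above; it is not the full rsync-include.txt:

```shell
SRC=$(mktemp -d); DST=$(mktemp -d); RULES=$(mktemp)
mkdir -p "${SRC}/tmp" "${SRC}/etc"
echo junk >"${SRC}/tmp/junk"; echo conf >"${SRC}/etc/conf"
printf '%s\n' '- /tmp/*' '+ *' >"${RULES}"
# --dry-run lists what would be transferred without copying anything
rsync -av --dry-run --include-from="${RULES}" "${SRC}/" "${DST}/"
# etc/conf and the (empty) tmp/ folder are listed, tmp/junk is not
rm -rf "${SRC}" "${DST}" "${RULES}"
```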

And finally create the optional shell-script ‘/backup/snapshot/rsync/rsync-list.sh’ (download rsync-list.sh) that calculates the MD5 integrity signatures:

#!/bin/bash
# created by francois scheurer on 20081109
# this script is used by rsync-snapshot.sh,
# it recursively prints to stdout the filelist of folder $1 and computes md5 signatures
# it deals correctly with special filenames containing newlines or '\'
# note1: the script assumes that a file is unchanged if its size and ctime are unchanged;
#   this assumption has a very small risk of being wrong:
#   it could be wrong if two files with different contents but same filename and size are created in the same second in two directories;
#   if the first directory is then removed and the second is renamed as the first, the file is not detected as changed.
# note2: ctime checking can be replaced by mtime checking if CTIME_CHK=0;
#   this is needed by rsync-snapshot.sh (because of hard links creation that do not preserve ctime).




# ------------- the help page ---------------------------------------------
if [ "$1" == "-h" ] || [ "$1" == "--help" ]
then
  cat << "EOF"
Version 1.6 2009-06-19

USAGE: rsync-list.sh PATH/DIR CTIME_CHK [REF_LIST]

PURPOSE: recursively prints to stdout the filelist of folder PATH/DIR and computes md5 integrity signatures.
  It deals correctly with special filenames containing newlines or '\'.
  If a ref_list is provided, it is used to avoid the re-calculation of md5 on files
  with unchanged filename and ctime.
  A ref_list is a file containing the output of a previous execution of this shell-script.
  The script assumes that a file is unchanged if its size and ctime are unchanged.
  The ref_list_mtime is used to force a md5 re-calculation of all files with newer ctime:
  -if file_ctime > ref_list_mtime then re-calculate md5
  -if file_ctime = ref_file_ctime then use ref_list
  CTIME_CHK can be 1 to base the algorithm on ctime or 0 to base it on mtime.

NOTE: the script assumes that all processes avoid all file modifications in PATH/DIR during the script's execution,
  you should read the following remarks if this assumption cannot be guaranteed:
  -a recent ref_list_mtime (>= date_of_first_write_to_ref_list) causes the script
   to miss all files with: ref_list_mtime >= file_ctime > ref_file_ctime
   solution: 'touch' ref_file_mtime with date_of_first_write_to_ref_list - 1 second
  -an old ref_list_mtime (< date_of_last_write_to_ref_list) causes the script
   to double all files with: ref_list_mtime < file_ctime = ref_file_ctime
   solution: pipe the output to 'sort -u'

EXAMPLE:
  DATESTR=$( date -d "1970-01-01 UTC $(( $( date +%s ) - 1 )) seconds" "+%Y-%m-%d_%H:%M:%S" ) # 'now - 1s' to avoid missing files
  REF_LIST="/etc.2008-11-23_10:00:00.list.gz"
  REF_LIST2="/tmp/rsync-reflist.tmp"
  gzip -dc "${REF_LIST}" >"${REF_LIST2}"
  touch -r "${REF_LIST}" "${REF_LIST2}"
  ./rsync-list.sh "/etc/" 1 "${REF_LIST2}" | sort -u | gzip -c >"/etc.${DATESTR}.list.gz" # 'sort -u' to avoid doubling files
  rm "${REF_LIST2}"
  touch -d "${DATESTR/_/ }" "/etc.${DATESTR}.list.gz"
EOF
  exit 1
elif [ $# -ne 2 ] && [ $# -ne 3 ]
then
  echo "Sorry, you must provide 2 or 3 arguments. Exiting..."
  exit 2
fi




# ------------- file locations and constants ---------------------------
SRC="$1" #path of the folder to list (PATH/DIR)
CTIME_CHK=$2 #1 for ctime checking, 0 for mtime checking
if [ "$CTIME_CHK" -eq 1 ]
then
  CTIME_STAT="%z"
  CTIME_FIND="-cnewer"
else
  CTIME_STAT="%y"
  CTIME_FIND="-newer"
fi
REF="$3" #filename of optional reference list
SCRIPT_PATH="/backup/snapshot/rsync"
FINDSCRIPT="$SCRIPT_PATH/rsync-find.sh.tmp" # temporary shell-script to calculate filelist




# ------------- using reference list to reduce md5 calculation time ----
if [ -n "$REF" ] #we have a previous md5 list
then

  if ! [ -s "$REF" ] #invalid reference list
  then
     echo "Error: $REF is not a valid reference list. Exiting..."
     exit 2
  fi

  touch /tmp/testsystime.tmp
  if ! [ /tmp/testsystime.tmp -nt "$REF" ] #if system time is incorrect then exit
  then
    echo "Error: system time is older than mtime of $REF. Exiting..."
    rm /tmp/testsystime.tmp
    exit 2
  fi
  rm /tmp/testsystime.tmp

  cat "$REF" | while read -r LINE #consider all previous files that still exist now with same ctime and size and print their already calculated md5
  do
    SIZE_AND_CTIME="${LINE#* md5sum=* * * * }" #extract size and ctime from reference list
    SIZE_AND_CTIME="${SIZE_AND_CTIME% \`*}"
    LINE2="${LINE%% md5sum=*}"    #1) keep only the filename part of the line
    LINE2="${LINE2//\\n/
}"                                #2) replace '\n' with newline; the problem now is that '\\n' is replaced, too (a pattern like "${LINE2//[^\\]\\n/...}" is not a solution because it removes the previous char)
    LINE2="${LINE2//\\
/\\n}"                            #3) replace '\'+newline with '\n', fixing the problem of 2)
    LINE2="${LINE2//\\\\/\\}"     #4) replace '\\' with '\'
    if [ -a "$LINE2" ] || [ -h "$LINE2" ] #check if file still exists
    then
      SIZE_AND_CTIME2=$( stat -c"%s $CTIME_STAT" "$LINE2" )
      SIZE_AND_CTIME2="${SIZE_AND_CTIME2#* md5sum=* * * * }" #get size and ctime from current file
      SIZE_AND_CTIME2="${SIZE_AND_CTIME2% \`*}"
      if [ "$SIZE_AND_CTIME" == "$SIZE_AND_CTIME2" ] #current file unchanged (see above note), so print the already calculated md5
      then
        echo "$LINE"
      elif [ "${SIZE_AND_CTIME#* }" == "${SIZE_AND_CTIME2#* }" ] #size is different but ctime is same: update current file's ctime to force md5's recalculation (see below)
      then
        if [ "$CTIME_CHK" -eq 1 ]
        then
          chmod --reference="$LINE2" "$LINE2" #update ctime (note: system time is assumed to be correct)
        else
          touch -m "$LINE2" #update mtime (note: system time is assumed to be correct)
        fi
      fi
    fi #else the file has been either deleted or modified (different ctime) and reference list is here useless
  done
  CNEWER_REF="$CTIME_FIND $REF" #prepare 'find' -cnewer option
else
  CNEWER_REF=""
fi




# ------------- calculation of md5 sums --------------------------------
#this 1st method is not slow but fails on filenames with newlines or '\'
find "${SRC}" $CNEWER_REF ! \( -path "*
*" -o -path "*\\*" -o -path " *" -o -path "* " \) | while read LINE
do
  LINE2="$LINE"
  if ! [ -h "$LINE" ] && [ -f "$LINE" ]
  then
    RES=$( md5sum "$LINE" )
    LINE2="$LINE2 md5sum=${RES%% *}"
  else
    LINE2="$LINE2 md5sum=-"
  fi
  RES=$( echo $( stat -c"%A %U %G %s $CTIME_STAT \`%F'" "$LINE" ) )
  echo -E "$LINE2 $RES"
done
#this 2nd method is slow but works on filenames with newlines or '\'
( cat << "EOF"
#!/bin/bash
  LINE="$1"
#  LINE2="${LINE//\\/\\\\}" # replace '\' with '\\'
  LINE2="${LINE//\\/\\\\}" # replace '\' with '\\'
  LINE2="${LINE2//
/\\n}" # replace newline with '\n'
  if ! [ -h "$LINE" ] && [ -f "$LINE" ]
  then
    RES=$( md5sum "$LINE" )
    LINE2="$LINE2 md5sum=${RES%% *}"
  else
    LINE2="$LINE2 md5sum=-"
  fi
  RES=$( echo $( stat -c"%A %U %G %s $CTIME_STAT \`%F'" "$LINE" ) )
  echo -E "$LINE2 $RES"
EOF
) >"$FINDSCRIPT"
chmod +x "$FINDSCRIPT"
find "${SRC}" $CNEWER_REF \( -path "*
*" -o -path "*\\*" -o -path " *" -o -path "* " \) -print0 | xargs --replace --null "$FINDSCRIPT" "{}"
rm "$FINDSCRIPT"
#eof
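The backslash gymnastics above are easier to digest in isolation. The following standalone sketch (an assumption-free plain-bash demo, independent of the scripts) round-trips the same encoding that the reference list uses, with '\' stored as '\\' and newline stored as '\n':

```shell
#!/bin/bash
# round-trip check of the filename encoding used by the md5 reference list
orig=$'weird\\name\nwith newline'   # a filename containing '\' and a newline

# encode: first '\' -> '\\', then newline -> '\n' (as in the 2nd method above)
enc="${orig//\\/\\\\}"
enc="${enc//
/\\n}"

# decode: '\n' -> newline, then '\'+newline -> '\n', then '\\' -> '\'
# (the three steps of the reference-list loop above)
dec="${enc//\\n/
}"
dec="${dec//\\
/\\n}"
dec="${dec//\\\\/\\}"

[ "$dec" = "$orig" ] && echo "round trip ok"
```

Note the known limitation: a filename that genuinely contains a backslash immediately followed by a newline does not survive this round trip, which is one more reason the scripts treat such names separately.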

Set the ownerships and permissions:

chown -cR root:root /backup/snapshot/rsync/
chmod 700 /backup/snapshot/rsync/rsync-*.sh
chmod 600 /backup/snapshot/rsync/rsync-include.txt

Usage

When you call the script ‘rsync-snapshot.sh’ without parameters or with the hostname of the server itself (or localhost), the script performs a self-snapshot of the complete filesystem ‘/’.
You can and should use filter rules to exclude things like ‘/proc/*’ and ‘/sys/*’. For this you need to edit the configuration file ‘/backup/snapshot/rsync/rsync-include.txt’.
A description of the filter rules syntax is written as comments in the file itself.
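For illustration, the exclusions could look like the following; this is only an assumed sketch using rsync's standard filter-rule syntax, the authoritative description being in the file itself:

```
# exclude pseudo-filesystems and other volatile paths (rsync filter rules)
- /proc/*
- /sys/*
- /dev/*
- /tmp/*
```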

The snapshot backup is created in ‘/backup/snapshot/HOST/snapshot.001’, where ‘HOST’ is your server’s hostname. If the folder ‘snapshot.001’ already exists, it is rotated to ‘snapshot.002’ and so on, up to ‘snapshot.512’, after which it is removed. So if you create one backup per night, for example with a cronjob, this retention policy gives you 512 days of retention. This is useful but can require too much disk space, which is why we have included a non-linear distribution policy: in short, we keep only the oldest backup in the range 257-512, the oldest in the range 129-256, and so on. This exponential distribution of the backups in time retains more backups in the short term and fewer in the long term; it keeps only 10 or 11 backups but spans a retention period of 257 to 512 days.
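The thinning rule can be sketched as follows. This is a simplified illustration of the policy described above, not the actual code of ‘rsync-snapshot.sh’; among the snapshots in each power-of-two range (3-4, 5-8, 9-16, ...), only the oldest, i.e. highest-numbered, one survives:

```shell
#!/bin/bash
# sketch: print which snapshot numbers an exponential retention policy keeps
keep_list() {
  local survivors="" lo=0 hi=1 best n
  while [ "$lo" -lt 512 ]; do
    best=""
    for n in "$@"; do                    # oldest = highest number in the range
      if [ "$n" -gt "$lo" ] && [ "$n" -le "$hi" ]; then
        if [ -z "$best" ] || [ "$n" -gt "$best" ]; then best=$n; fi
      fi
    done
    if [ -n "$best" ]; then survivors="${survivors:+$survivors }$best"; fi
    lo=$hi
    hi=$((hi * 2))
  done
  echo "$survivors"
}

keep_list 1 2 3 4 5 6 7 8 9 10 16   # prints: 1 2 4 8 16
```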
In the following table you can see the different steps of the rotation; each column shows the current set of snapshots (limited to snapshot.1 through snapshot.16 in this example):

1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2
    3       3       3       3       3       3       3       3
4       4       4       4       4       4       4       4       4
    5               5               5               5
        6               6               6               6
            7               7               7               7
8               8               8               8               8
    9                               9
        10                              10
            11                              11
                12                              12
                    13                              13
                        14                              14
                            15                              15
16                              16                              16

To save more disk space, ‘rsync’ creates hard links for each file of ‘snapshot.001’ that already existed in ‘snapshot.002’ with identical content, timestamps and ownership.
For example, the following session creates a backup and then uses a few commands to show the disk space used by the 4 existing backups:

root@server05:~# /backup/snapshot/rsync/rsync-snapshot.sh
2012-09-11_19:07:43 server05: === Snapshot backup is created into /backup/snapshot/server05/snapshot.001 ===
2012-09-11_19:07:43 Testing needed free disk space ... 0 MiB needed.
2012-09-11_19:07:45 Checking free disk space... 485997 MiB free. Ok, bigger than 5000 MiB.
2012-09-11_19:07:45 Checking disk space used by /backup/snapshot/server05 ... 11011 MiB used. Ok, smaller than 20000 MiB.
2012-09-11_19:07:46 Creating folder /backup/snapshot/server05/snapshot.000 ...
2012-09-11_19:07:46 Creating backup of server05 into /backup/snapshot/server05/snapshot.000 hardlinked with  /backup/snapshot/server05/snapshot.001 ...
2012-09-11_19:07:52 Setting recursively immutable flag of /backup/snapshot/server05/snapshot.000 ...
Renaming /backup/snapshot/server05/snapshot.003 into /backup/snapshot/server05/snapshot.004 ...
Renaming /backup/snapshot/server05/snapshot.002 into /backup/snapshot/server05/snapshot.003 ...
Renaming /backup/snapshot/server05/snapshot.001 into /backup/snapshot/server05/snapshot.002 ...
Renaming /backup/snapshot/server05/snapshot.000 into /backup/snapshot/server05/snapshot.001 ...
2012-09-11_19:07:55 Checking free disk space... 485958 MiB free. Ok, bigger than 5000 MiB.
2012-09-11_19:07:55 Checking disk space used by /backup/snapshot/server05 ... 11050 MiB used. Ok, smaller than 20000 MiB.
2012-09-11_19:07:56 server05: === Snapshot backup successfully done in 13 sec. ===
-----------------------------
root@server05:~# du -chslB1M /backup/snapshot/localhost/snapshot.* | column -t
10901  /backup/snapshot/localhost/snapshot.001
10901  /backup/snapshot/localhost/snapshot.002
10901  /backup/snapshot/localhost/snapshot.003
10901  /backup/snapshot/localhost/snapshot.004
0      /backup/snapshot/localhost/snapshot.last
43602  total
-----------------------------
root@server05:~# du -chsB1M /backup/snapshot/localhost/snapshot.* | column -t
10898  /backup/snapshot/localhost/snapshot.001
40     /backup/snapshot/localhost/snapshot.002
45     /backup/snapshot/localhost/snapshot.003
45     /backup/snapshot/localhost/snapshot.004
0      /backup/snapshot/localhost/snapshot.last
11026  total
-----------------------------

We can see that the 4 snapshot backups use 10.9 GB each, so without hard links they would sum to 43 GB; the last command shows that, on the contrary, the real used size is only 11 GB, thanks to the hard links.
By the way, the following commands can be very useful to replace all duplicate files with hard links to the first file in each set of duplicates, even if they have different names or paths:
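This saving works because hard-linked names share a single inode, so the data blocks exist (and are counted by ‘du’) only once. A quick standalone illustration, independent of the backup script:

```shell
#!/bin/bash
# show that two hard-linked names share one inode, so the data is stored once
tmp=$(mktemp -d)
echo "some payload" > "$tmp/file-in-snapshot.002"
ln "$tmp/file-in-snapshot.002" "$tmp/file-in-snapshot.001"  # hard link, as rsync does

stat -c %h "$tmp/file-in-snapshot.001"   # prints the link count: 2
[ "$(stat -c %i "$tmp/file-in-snapshot.001")" = "$(stat -c %i "$tmp/file-in-snapshot.002")" ] \
  && echo "same inode"
rm -r "$tmp"
```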

chattr -fR -i /backup/snapshot/localhost/snapshot.*
fdupes -r1L /backup/snapshot/localhost/snapshot.*
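Note that the first command drops the immutable flag that the script sets when ‘CHATTR=1’; presumably you will want to restore the protection afterwards with:

```
chattr -fR +i /backup/snapshot/localhost/snapshot.*
```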

A good tutorial on how to use the ‘rsync’ command is available here:
http://www.thegeekstuff.com/2010/09/rsync-command-examples/
 

When called with a remote hostname as parameter, the script performs a snapshot backup via the network. This can be very useful for a DRP (Disaster Recovery Plan), in order to have a server farm replicated every night to a secondary site. In addition, you could implement a continuous replication of the databases, for example. The ‘BWLIMIT’ variable can be changed inside the shell-script to limit the network bandwidth usage and the disk I/O overhead; this helps moderate the performance impact and avoids slowing down critical production servers.
Other variables can also be modified at the beginning of the script, either as global settings or as specific tuning for some servers; a ‘BACKUPSERVER’ section is already provided for this purpose and lets you tune specific settings for the remote central backup server:

HOST_PORT=22                            #port of source of backup
SCRIPT_PATH="/backup/snapshot/rsync"
SNAPSHOT_DST="/backup/snapshot"         #destination folder
NAME="snapshot"                         #backup name
LOG="rsync.log"
MIN_MIBSIZE=5000                        #older snapshots (except snapshot.001) are removed if free disk <= MIN_MIBSIZE. the script may exit without performing a backup if free disk is still short.
OVERWRITE_LAST=0                        #if free disk space is too small, this option lets us remove snapshot.001 as well and retry once
MAX_MIBSIZE=20000                       #older snapshots (except snapshot.001) are removed if their size >= MAX_MIBSIZE. the script performs a backup even if their size is too big.
BWLIMIT=100000                          #bandwidth limit in KiB/s. 0 means no slow-down. this helps to avoid rsync consuming too much system performance
BACKUPSERVER="rembk"                    #this server connects to all others to download filesystems and create remote snapshot backups
MD5LIST=0                               #compute a list of md5 integrity signatures of all backed-up files, needs 'rsync-list.sh'
CHATTR=1                                #use the 'chattr' command to protect the backups against modification and deletion
DU=1                                    #use the 'du' command to calculate the size of existing backups, disable it if you have many backups and it gets too slow (for example on BACKUPSERVER)
SOURCE="/"                              #source folder to backup

if [ "${HOST_LOCAL}" == "${BACKUPSERVER}" ] #if we are on BACKUPSERVER then do some fine tuning
then
  MD5LIST=1
  MIN_MIBSIZE=35000 #needed free space for chunk-file tape-arch.sh
  MAX_MIBSIZE=12000
  DU=0 # NB: 'du' is currently disabled on BACKUPSERVER for performance reasons
elif [ "${HOST_LOCAL}" == "${HOST_SRC}" ] #else if we are on a generic server then do some other fine tuning
then
  if [ "${HOST_SRC}" == "ZRHSV-TST01" ]; then MIN_MIBSIZE=500; CHATTR=0; DU=0; MD5LIST=0; fi
fi
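To schedule the nightly runs mentioned above, a crontab entry on the backup server along these lines could be used (the hostname ‘server05’ and the log path are hypothetical, adapt them to your setup):

```
# /etc/cron.d/rsync-snapshot (hypothetical example)
# nightly self-backup at 02:00, remote backup of server05 at 03:00
0 2 * * * root /backup/snapshot/rsync/rsync-snapshot.sh >>/var/log/rsync-snapshot.log 2>&1
0 3 * * * root /backup/snapshot/rsync/rsync-snapshot.sh server05 >>/var/log/rsync-snapshot.log 2>&1
```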

To make the backup server able to connect via ‘ssh’ to the target servers without interactively entering a password, you should create an ‘ssh’ key with an empty passphrase ‘/root/.ssh/rsync_rsa’ and copy the public key to the target servers:

#on each targetserver:
mkdir -p ~root/.ssh/
chown root:root ~root/.ssh/
chmod 700 ~root/.ssh/
touch ~root/.ssh/authorized_keys
chown root:root ~root/.ssh/authorized_keys
chmod 600 ~root/.ssh/authorized_keys
#update manually /etc/ssh/sshd_config to have 'AllowUsers root'
service ssh reload

#on the backupserver, create the key with an empty passphrase:
ssh-keygen -f ~/.ssh/rsync_rsa
#and upload the public key to the targetserver:
MYIP=$(hostname -i) #assign here the backupserver's external IP if necessary
echo "from=\"${MYIP%% *}\",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,command=\"rsync \${SSH_ORIGINAL_COMMAND#* }\" $(ssh-keygen -yf ~/.ssh/rsync_rsa)" | ssh targetserver "cat - >>~/.ssh/authorized_keys"

Note that the ‘command=’ restriction (http://larstobi.blogspot.ch/2011/01/restrict-ssh-access-to-one-command-but.html) will not apply if ‘/etc/ssh/sshd_config’ already has a ‘ForceCommand’ directive.
This central backup server could also be used to centralize the administration of all other servers via pdsh/ssh (LINK).
 

Because the script does not freeze the filesystem during its operation, there is no guarantee that the backup will be a strict snapshot; in other words, the files will not all be copied at the exact same moment. This is usually not an issue, except for databases. In order to keep a database consistent, you should follow the instructions of http://www.postgresql.org/docs/9.1/static/continuous-archiving.html and http://www.anchor.com.au/blog/documentation/better-postgresql-backups-with-wal-archiving/.

The following example applies for PostgreSQL 9.1 on Debian:

#sudo -u postgres mkdir /var/lib/postgresql/9.1/main/wal_archive
#sudo -u postgres chmod 700 /var/lib/postgresql/9.1/main/wal_archive

#vi /etc/postgresql/9.1/main/postgresql.conf
# - Archiving -
#archive_mode = on               # allows archiving to be done
#                                # (change requires restart)
#archive_command = 'test ! -f /var/lib/postgresql/9.1/main/backup_in_progress || (test ! -f /var/lib/postgresql/9.1/main/wal_archive/%f && cp %p /var/lib/postgresql/9.1/main/wal_archive/%f)'           # command to use to archive a logfile segment
#archive_timeout = 0            # force a logfile segment switch after this
#                                # number of seconds; 0 disables



#check if postgresql is running:
if sudo -u postgres /usr/lib/postgresql/9.1/bin/pg_ctl -D /var/lib/postgresql/9.1/main/ status
then
  touch /var/lib/postgresql/9.1/main/backup_in_progress
  #freeze postgresql writing (all writes will go only to the pg_xlog WAL-files), in order to make a clean backup at filesystem level:
  sudo -u postgres psql -c "SET LOCAL synchronous_commit TO OFF; SELECT pg_start_backup('rsync-snapshot', true);"
fi

#perform the backup
/backup/snapshot/rsync/rsync-snapshot.sh

#check if postgresql is running:
if sudo -u postgres /usr/lib/postgresql/9.1/bin/pg_ctl -D /var/lib/postgresql/9.1/main/ status
then
  #unfreeze postgresql writing:
  sudo -u postgres psql -c "SET LOCAL synchronous_commit TO OFF; SELECT pg_stop_backup();"
  rm /var/lib/postgresql/9.1/main/backup_in_progress
  chattr -R -i /backup/snapshot/localhost/snapshot.001/var/lib/postgresql/9.1/main &>/dev/null
  mv /var/lib/postgresql/9.1/main/wal_archive/* /backup/snapshot/localhost/snapshot.001/var/lib/postgresql/9.1/main/pg_xlog/
  rm /backup/snapshot/localhost/snapshot.001/var/lib/postgresql/9.1/main/{backup_in_progress,backup_label}
  chattr -R +i /backup/snapshot/localhost/snapshot.001/var/lib/postgresql/9.1/main &>/dev/null
fi

#NB:
#  -If you need to re-create a standby server while transactions are waiting, make sure that the commands to run pg_start_backup() and pg_stop_backup() are run in a session with synchronous_commit = off, otherwise those requests will wait forever for the standby to appear.
#  -see also the 'pg_dump' and 'pg_basebackup' commands.

It is even possible to freeze an ext3/ext4 filesystem before backing it up, but this is quite dangerous, because all processes that try to write to it will be blocked until you unfreeze the filesystem.
You should therefore avoid using this on production servers! But for the sake of information, here are the steps to freeze the filesystem mounted on ‘/mnt/folder’ for 30 seconds on Debian:

wget ftp://ftp.kernel.org/pub/linux/utils/util-linux/v2.22/util-linux-2.22-rc2.tar.gz
tar xfz util-linux-2.22-rc2.tar.gz
cd util-linux-2.22-rc2
aptitude install ncurses-dev libncurses5-dev mkcramfs cramfsprogs zlib1g-dev libpam-dev libpam0g-dev
./configure
make
man -l sys-utils/fsfreeze.8
./fsfreeze -f /mnt/folder && sleep 30 && ./fsfreeze -u /mnt/folder

I hope this ‘rsync-snapshot.sh’ script can be useful to you! ^_^
Another script will be posted on the blog soon to show you how to archive those snapshot backups on tapes using ‘tar’ with encrypted split chunks of data.
