Backups
Backup Rule of Three: always have 3 copies:
- 1 remote system in case of disaster
- 2 local copies, on different media
A backup is sometimes called an archive. An archive is a group of files with associated metadata. It is a copy of data that can be restored sometime in the future if the data becomes corrupted. You need to consider the following:
- Backup type
- Compression methods
- Utilities that will help the most
Understanding backup types
- System image
- A clone, a copy of the OS binaries, config files, and whatever you need to boot.
- Full
- Copy of all data, ignoring its modification date. Quickly restores system data, but takes a long time to create the backup.
- Incremental
- Copy of data that has been modified since the last backup operation, by comparing timestamps. This method is quick, but might take a long time to actually restore.
- Differential
- Copy of all data that changed since last full backup. Good balance between full and incremental backup.
- Snapshot
- Hybrid approach - a full (usually read-only) copy of data is made to backup media. Then pointers (ex: hard links) are employed to create a reference table linking the backup data with the original data. During next backup, only modified files are copied to backup media, and the pointer reference table is copied and updated.
You can go back to any point in time (restore point) and restore the data from there. Very efficient and takes less space and processing power.
rsync
uses the snapshot approach. - Snapshot clone
- After a snapshot is created, it is cloned. Useful in high I/O environments. It is modifiable and mountable, so you can use it as disaster recovery.
Backup files
/var/backups
:
passwd.bak
group.bak
shadow.bak
fstab.bak
Types of backups
A differential backup takes an initial backup, and all subsequent backups are the diff between the current and initial backups. For example, backup A is the initial backup, then backup B is the diff between A and B, backup C is the diff between A and C, and so on.
An incremental backup takes an initial backup, and all subsequent backups are the diff between the previous backup. For example, backup A is the initial backup, then B is the diff between A and B, backup C is the diff between B and C, and so on.
Partition/filesystem backups
- Run
df -h
to figure out which partitions are real and which are pseudo. Pseudo partitions use 0 storage.
Object backup
Compression
Backup config files
This script backs up any configuration files provided as arguments to the for
loop. It compares the files in /var/backups/
with the files in /etc
. If the files are not identical, they are copied to /var/backups/
:
#!/bin/sh
cd /var/backups || exit 0
for FILE in passwd group shadow gshadow; do
test -f /etc/$FILE || continue # if no backup file, next loop arg
# -s suppresses output
cmp -s $FILE.bak /etc/$FILE && continue # if files match, next loop arg
cp -p /etc/$FILE $FILE.bak && chmod 600 $FILE.bak # overwrite old file w/new, root perms
done
rsync
Remote sync. Copies large files quickly over the network. It copies file updates, and files that do not exist in the destination directory.
rsync [OPTION]... SOURCE DEST
-a # archive mode, equivalent to -rlptgoD (dir tree backups)
-D # retain Device and special files
-g # retain file group
-h # human-readable numeric output
-l # copy symbolic links
-o # retain file owner
-p # retain file perms
--progress # display progress of file copy
-r # recursive
--stats # display file transfer stats
-t # retain file's modification time
-v # verbose
# copy to /home dir on remote
rsync -av Downloads/filename.ext linuxuser@ubuntu-24:/home/linuxuser
# 1. send all files in pwd to remote dir
rsync -av * linuxuser@ubuntu-24:syncdirectory
sending incremental file list
file1
file2
file3
file4
file5
sent 317 bytes received 111 bytes 856.00 bytes/sec
total size is 0 speedup is 0.00
# 2. Create new file and edit file3, rsync sends diff
rsync -av * linuxuser@ubuntu-24:syncdirectory
sending incremental file list
file3
newfile
sent 257 bytes received 54 bytes 207.33 bytes/sec
total size is 23 speedup is 0.07
tar full and incremental backups
tar
views full and incremental backups in levels:
- level 0 includes all files
- level 1 is first incremental backup
- level 2 is second incremental backup, etc…
tar [OPTIONS...] [FILENAME]...
-d # compare tar archive file members with external files
-t # display tar archive file's contents (members)
-W # verify each file as it is processed. Can't use with compression.
# 1. creates snapshot .snar file w timestamp metadata to create backups
tar -g FullArchive.snar -Jcvf Project42.txz Project4?.txt
Project42.txt
Project43.txt
Project44.txt
Project45.txt
Project46.txt
# 2. verify created
ls FullArchive.snar Project42.txz
FullArchive.snar Project42.txz
# 3. Update file
echo 'Answer to everything' >> Project42.txt
# 4. create incremental backup. Project42_Inc.txz contains only Project42.txt
# because its the only file that was modified since the previous backup.
tar -g FullArchive.snar -Jcvf Project42_Inc.txz Project4?.txt
Project42.txt
# view tarball files/members
tar -tf Project4x.tar.gz
Project42.txt
Project43.txt
Project44.txt
Project45.txt
Project46.txt
# compare archive files against current files
$ tar -df Project4x.tar.gz
Project42.txt: Mod time differs
Project42.txt: Size differs
# verify backup after archive is created. can't compress, must
# be in next step.
tar -Wcvf ProjectVerify.tar Project4?.txt
Project42.txt
...
Verify Project42.txt
...
tar restore
Basically same as compress command, but sub the -c
for -x
:
tar [OPTIONS...] [FILENAME]...
-x # extract files from tarball or archive and place in cwd
-z # decompresses with gunzip
-j # decompresses with bunzip2
-J # decompresses with unxz
# extract gzip tarball (tarball is not removed)
tar -zxvf Project4x.tar.gz
Project42.txt
Project43.txt
Project44.txt
Project45.txt
Project46.txt
dd
Schedule backups
cron
Cron schedules jobs and tasks:
cron.[hourly|daily|weekly|monthly|yearly]
: files in these directories run at times specified by dir name.cron.d
: files in this dir have time that defines when the job runs. Add files here to run at specified times.crontab
is overwritten during upgrades, so don’t update.Do not add files in
cron.d
–they are overwritten during upgrades.
ls -lF | grep cron
-rw-r--r-- 1 root root 419 Nov 18 01:39 anacrontab
drwxr-xr-x 2 root root 4096 Nov 18 01:32 cron.d/
drwxr-xr-x 2 root root 4096 Nov 18 01:29 cron.daily/
drwxr-xr-x 2 root root 4096 Aug 27 14:26 cron.hourly/
drwxr-xr-x 2 root root 4096 Nov 18 01:29 cron.monthly/
-rw-r--r-- 1 root root 1136 Aug 27 14:26 crontab
drwxr-xr-x 2 root root 4096 Nov 18 01:29 cron.weekly/
drwxr-xr-x 2 root root 4096 Aug 27 14:26 cron.yearly/
####### TIME FORMATTING
min hour dayofmonth month dayofweek command
# 10:15am each day
15 10 * * * /full/path/to/program.sh
# 4:15pm every Monday (0 - Sun, 6 - Sat)
15 16 * * 1 /full/path/to/program.sh
# 12 noon first day of each month
00 12 1 * * /full/path/to/program.sh
# list existing cron table
crontab -l
no crontab for linuxuser
####### COMMANDS
crontab -l
# Edit this file to introduce tasks to be run by cron.
...
# m h dom mon dow command
54 22 * * 1 /home/linuxuser/cron_echo.sh > cron.out
# add entry
crontab -e
(opens cron table in vi)
47 5 * * * linuxuser /home/linuxuser/scripts/upgrade.sh # run upgrade.sh as linuxuser at 5:47 AM daily
# delete user's crontab file
crontab -r
anacron
Schedule irregular jobs for a machine–such as your laptop–that doesn’t run 24/7.
- runs relative to most recent boot time, not absoulte time
- might have to install
- has priority over
cron
- saves job status info to
/var/spool/anacron/
# install, creates /etc/anacrontab
sudo apt install anacron
cat anacrontab
# /etc/anacrontab: configuration file for anacron
# See anacron(8) and anacrontab(5) for details.
SHELL=/bin/sh
HOME=/root
LOGNAME=root
# These replace cron's entries
...
# user-added entries
# interval mins-after-boot job-IDer command
1 15 daily_apt /home/linuxuser/scripts/upgrade.sh # run upgrade.sh every day (1) 15 mins after boot