Dirvish Backup
Introduction
Dirvish is an excellent, automatic, no-nonsense way to back up your drives, with no GUI needed. Using simple text-based configuration files, you set it up once and forget it.
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. -NeddySeagoon
But there are more reasons than hard drive failures, Neddy. A momentary mental blip can delete an important file, a configuration, or a custom program that has taken months to perfect. A simple copy from your nice safe backup can save a lot of heartache.
Dirvish saves images of partitions by date of backup, and it uses hard links to do so. The initial backup is a true 1:1 copy of your partition. After that, unchanged files are simply hard links to the copy that is already stored; only changed files are stored again. In my actual experience, this means you can keep about 3 weeks of daily backups of a partition on a backup drive about 2.5 times the size of that partition. Very efficient. And it is utterly transparent to the user. If you need to retrieve a file, you won't know (or care!) whether it's a newly stored file or a hard link. Directory listings look as though all of the files are truly there, for every date backed up.
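To see why unchanged files cost almost nothing, here is a small shell sketch of the hard-link idea. It does not use dirvish at all; the temporary directory, the two dated "image" directories and notes.txt are made up for the illustration.

```shell
# Throw-away directory standing in for a vault (hypothetical layout).
demo=$(mktemp -d)
mkdir "$demo/20141007" "$demo/20141008"

# Day 1: the file is really copied into the first image.
echo "important data" > "$demo/20141007/notes.txt"

# Day 2: the file has not changed, so the new image only gets a hard link.
ln "$demo/20141007/notes.txt" "$demo/20141008/notes.txt"

# The same inode number in both images means one copy of the data on disk.
inode_old=$(ls -i "$demo/20141007/notes.txt" | awk '{print $1}')
inode_new=$(ls -i "$demo/20141008/notes.txt" | awk '{print $1}')

# The second column of "ls -l" is the number of names pointing at the file.
links=$(ls -l "$demo/20141008/notes.txt" | awk '{print $2}')

echo "inode in old image: $inode_old"
echo "inode in new image: $inode_new"
echo "link count: $links"

# Deleting the old image does not harm the file in the newer image.
rm -r "$demo/20141007"
cat "$demo/20141008/notes.txt"
```

Both images show a complete file, yet the data is stored once, and it stays on disk until the last image that links to it is removed.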
Emerging the package
To install dirvish, run the following command:
root # emerge -av dirvish
Configuring
The first thing you will want to do is dedicate a partition to be your backup drive. As noted above, I suggest sizing it at about 2-3 times the total size of all the partitions you intend to back up. On this partition, make subdirectories with names that are meaningful to YOU and that identify the partition each one holds. For example, if you were backing up a Funtoo laptop that had root, home and boot partitions, you might make directories on the backup drive named laptop-root, laptop-home and laptop-boot. Each of these subdirectories will contain a dirvish subdirectory holding a configuration file that controls what does and does not get backed up. For this specific example, this is a quick and easy way to create all of these subdirectories:
root # cd /<mounted backup drive partition>
root # install -d laptop-boot/dirvish
root # install -d laptop-root/dirvish
root # install -d laptop-home/dirvish
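With more than a handful of partitions, the same layout can be created in a loop. This is only a sketch: BANK is a temporary directory standing in for your mounted backup partition, and mkdir -p is used here in place of install -d, which does the same job for this purpose.

```shell
# BANK is a stand-in; on a real system it would be the backup mount point.
BANK=$(mktemp -d)

# One vault per partition; each vault needs its own dirvish/ subdirectory.
for vault in laptop-boot laptop-root laptop-home; do
    mkdir -p "$BANK/$vault/dirvish"
done

ls "$BANK"
```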
There is no limit, so add the partitions of any and all computers you wish to back up. The fact that there is no GUI for dirvish is actually a plus: there is nothing to learn or get confused about. The backups are controlled by one text file in /etc/dirvish called master.conf (global settings) and one file in /<backup drive>/<meaningful partition name>/dirvish called dirvish.conf. Rather than boring you with a detailed explanation of /etc/dirvish/master.conf, I will list a modified version of mine here and discuss its important points afterwards.
/etc/dirvish/master.conf - My dirvish master.conf file
# /etc/dirvish/master.conf:
runall:
#    funtoo-root  16:00
#    funtoo-home  16:00
#    funtoo-boot  16:00
    laptop-boot  16:00
    laptop-root  16:00
    laptop-home  16:00
    dockstar     16:00
    pogo1        16:00
    optware      16:00
#    disslowdog   16:00
bank:
    /mnt/auto/backup
image-default: %Y%m%d
log: gzip
index: gzip
exclude:
    /etc/mtab
    /var/lib/nfs/*tab
    /var/cache/apt/archives/*.deb
    /.kde/share/cache/*
    /.firefox/default/*/Cache/*
    /usr/src/**/*.o
    lost+found/
    *.iso
    *.avi
    *.mpeg
    *.mpg
    *.mov
    *.tmp
expire-default: +14 days
#expire-rule:
#    MIN HR    DOM MON      DOW STRFTIME_FMT
#    *   *     *   *        1   +3 months
#    *   *     1-7 *        1   +1 year
#    *   *     1-7 1,4,7,10 1
#    *   10-19 *   *        *   +4 days
#    *   *     *   *        2-7 +15 days
Formatting in these config files IS IMPORTANT! Note that some items are indented. This is important! Dirvish will ignore them if you don't indent.
In the "runall:" section you list the partitions you want backed up. In dirvish these are called vaults. You can see I have some vaults commented out: one machine that is currently down (broken) and another (disslowdog) that I only back up a couple of times a week. Note the time given after each entry, 16:00. This is the image time, the timestamp placed on each backup after it is created. I find it convenient to stamp them that way.
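If you ever want to check which vaults are really active in "runall:", a few lines of shell and awk will do. This is a rough sketch, not a dirvish tool: list_vaults is a made-up helper, the sample file stands in for /etc/dirvish/master.conf, and it assumes entries are indented and that the next non-indented option ends the section.

```shell
# Sample file standing in for /etc/dirvish/master.conf (contents are illustrative).
conf=$(mktemp)
cat > "$conf" <<'EOF'
runall:
#    funtoo-root  16:00
    laptop-boot  16:00
    laptop-root  16:00
bank:
    /mnt/auto/backup
image-default: %Y%m%d
EOF

# Hypothetical helper: print the vault names that are not commented out.
list_vaults() {
    awk '
        /^runall:/                    { in_runall = 1; next }  # section starts
        /^[^ \t#]/                    { in_runall = 0 }        # next option ends it
        in_runall && /^[ \t]+[^ \t#]/ { print $1 }             # indented, active entry
    ' "$1"
}

vaults=$(list_vaults "$conf")
echo "$vaults"
```

On a real system you would call list_vaults /etc/dirvish/master.conf. An entry that has lost its indentation simply does not show up, which is a quick way to spot the formatting mistake warned about above.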
Next comes "bank:". This is the directory where your backup partition is mounted; the vaults live directly beneath it.
"image-default:" is the default timestamp format used to name each day's backup as it is placed in your "meaningful-to-you" directories. For example, here's the actual listing of my disslowdog vault (boot, home and root all on one partition in this case):
root # ls /mnt/auto/backup/disslowdog
20140917 20140922 20140924 20140928 20141005 20141008 dirvish
root # ls /mnt/auto/backup/disslowdog/20141008
index.gz log.gz summary tree
root # ls /mnt/auto/backup/disslowdog/20141008/tree
boot etc lib opt var bin home net root sbin usr
"log:" and "index:" simply tell it the compression method (if any) you want used for the log and index files stored in the backup directories.
"exclude:" lists patterns for files that should never be backed up. You'll want to exclude, for example, any *.mov, *.iso, *.avi and similar files. These types of files can be monstrously large and they don't change from day to day, so it makes more sense to back them up separately rather than let them slow down your daily backup.
"expire-default:" In the same sense that Funtoo is a rolling release, this is a rolling backup. In my case, on the 15th day the original image is purged. Dirvish takes care of this completely, and nothing has to be copied forward: a file's data stays on disk for as long as any newer image still holds a hard link to it. So in this case, after 2 weeks of backups, there's always one oldest backup falling off the edge of the earth daily, to make room for newer backups.
"expire-rule:" As you can see, I don't use this function. It allows you to make a finer-grained adjustment with regards to the expiring of older backups. Quite sophisticated really, allowing you to do things such as keeping all Friday backups for 3 months.
There are a ton of other options; see man dirvish.conf to view all of them. I suspect the above will be sufficient for the average user.
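To make the %Y%m%d image names and the rolling 14-day window a little more concrete, here is a small shell sketch. It is only an illustration: older_than is a made-up helper and the cutoff date is worked out by hand. As far as I know, dirvish-expire works from the expiry information dirvish records for each image, not from the directory names.

```shell
# Today's image name, using the same strftime codes as "image-default: %Y%m%d".
today=$(date +%Y%m%d)
echo "today's image would be named $today"

# Hypothetical helper: print the image names older than a cutoff.
#   $1      cutoff in %Y%m%d form
#   $2...   image names in %Y%m%d form
older_than() {
    cutoff=$1
    shift
    for image in "$@"; do
        # %Y%m%d names are plain numbers, so comparing them orders them by date.
        if [ "$image" -lt "$cutoff" ]; then
            echo "$image"
        fi
    done
}

# For a run on 2014-10-08 with a 14-day window, the cutoff is 2014-09-24.
expired=$(older_than 20140924 20140917 20140922 20140924 20140928 20141005 20141008)
echo "$expired"
```

With the image names from the listing above, only 20140917 and 20140922 fall outside the window.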
Almost done!
Now you must create a dirvish.conf file in each of your <meaningful-to-you>/dirvish directories (vaults), specific to the partition you are backing up. Let's look at one of my typical dirvish.conf files:
dirvish.conf - a typical dirvish.conf file
# dirvish.conf:
client: <hostname of computer this partition is on>
tree: /
xdev: 0
rsh: ssh -i /root/.ssh/id_rsa <hostname>
index: gzip
image-default: %Y%m%d
exclude: #indent!!!
    + /usr
    + /usr/src
    + /usr/src/linux*
    + /usr/src/linux*/.config
    boot
    home
    dev
    mnt
    proc
    run
    sys
    tmp
    media
    var/cache/man
    root/.ccache
    var/tmp/ccache
    var/lib/alsa
    var/lib/preload
    var/lib/upower
    var/lock/lvm
    etc/resolv.conf
    etc/NetworkManager/system-connections/Auto 114
    etc/lvm/cache/.cache
    etc/network/run/ifstate
    /.cache
    /.mozilla/firefox/*/Cache
    root/.bash_history
    root/.recently-used.xbel
    root/.local/share/Trash**
    var/tmp/**
    .trash/**
    usr/portage/**
    usr/src/**
    root/.ccache
    root/.cache
Formatting in these config files IS IMPORTANT! Note that some items are indented. This is important! Dirvish will ignore them if you don't indent.
I shall describe this config file in the same way as the master.conf example.
"client:" This is the actual hostname of the computer containing the partition. Or IP address if you prefer. ssh will use this to connect to it.
"tree:" The base directory of the partition you wish to back up. This example is the root partition of one of my computers. I always give home its own partition. If you used tree: /home/joe, that would be the base directory the backup starts from, and you could not back up the rest of /home from there. Any exclude patterns are also understood to start from that base.
"xdev:" This is a boolean (0 or 1). Set to 1, it tells dirvish not to cross mount points, so only the filesystem that holds the tree is backed up; 0 lets it cross them. I really don't think I use it anywhere.
"rsh:" This is the definition of the ssh command you would use to "ssh in" to that machine. Self explanatory I think.
"index:" and "image-default:" Already explained in the master.conf file, just an opportunity to fine tune them individually for each partition if desired.
"exclude:" Also previously described and self-explanatory. However, there are some things to note. Here you have the opportunity for fine-grained control. Obviously we are not going to back up dev, sys, and similar directories, but you probably don't want to put such things in your master.conf file; they are more appropriate here at the partition level.
Notice also that the first few excludes have a + before them. This makes them includes rather than excludes. Let's take the case of + /usr/src. Here I want to back up the directory structure of /usr/src, so I make it an include, and the directories are created on the backup. But I really don't want anything in those directories, only the structure. So further down the exclude list (3rd from the bottom) I have usr/src/**. This is an exclude, since there is no + preceding it. The double stars match anything, meaning nothing inside /usr/src gets backed up other than the paths the + lines name explicitly.
Once you've created a dirvish.conf file for each partition you want to back up, you are done! Forever! OK, almost. You may wish to fine tune your configuration for the first couple of weeks, weeding out things like browser caches (careful though, sometimes they are needed if a complete reinstall is necessary). Anything largish that doesn't change is a good candidate for weeding, as are caches in general. Conversely, make sure you aren't excluding something that may be important.
You must create the initial backups by hand, it's painless though. Using our previous example of laptop-boot, laptop-root and laptop-home you do it like this:
root # dirvish --init --vault laptop-boot
root # dirvish --init --vault laptop-root
root # dirvish --init --vault laptop-home
From then on, there is a convenient script in /etc/dirvish called dirvish-cronjob. It's just what the name sounds like, call it from your cron daemon whenever you want your partitions backed up. For example you could put this in your fcrontab:
%daily 15 03 /etc/dirvish/dirvish-cronjob
And voila! While you sleep your partitions are backed up. Assuming you are asleep at 3:15 in the morning.
That should get you started. There is much good info in man dirvish.conf and also at some of the links at http://www.dirvish.org/. Here we've only briefly touched on the configuration.
Security
You may wish to consider the implications of allowing root ssh logins. Most experts recommend turning off this ability, i.e. setting PermitRootLogin no in /etc/ssh/sshd_config. This creates a problem for dirvish, because it DOES log in as root through ssh. However, there is a secure and (somewhat) easy way around this issue: set PermitRootLogin forced-commands-only in /etc/ssh/sshd_config. Root can then log in only with a key, and only to run the ONE command attached to that key on the client computer (the one being backed up). This gives you the best of both worlds: only the ONE command with the ONE key can execute as root, and any other attempted root login will fail.
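Before going further, it can help to confirm what a client is currently set to. The sketch below is one rough way to read the setting from the file. permit_root_login is a made-up helper and the sample file is only for illustration; it ignores Match blocks and included files, so treat the answer as a hint, not the final word.

```shell
# Hypothetical helper: print the PermitRootLogin value found in an sshd_config-style file.
# sshd uses the first value it finds for a keyword, so stop at the first match.
permit_root_login() {
    awk 'tolower($1) == "permitrootlogin" { print $2; exit }' "$1"
}

# Illustrative file; on a real client you would pass /etc/ssh/sshd_config.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# PermitRootLogin yes
PermitRootLogin forced-commands-only
PasswordAuthentication no
EOF

mode=$(permit_root_login "$sample")
echo "PermitRootLogin is: $mode"
```

Commented-out lines are skipped because their first field is "#" and not the keyword.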
You are going to intentionally cause dirvish to fail to log in in order to find out the command needed. No need to comprehend, just follow the steps ;)
1. Go ahead and set PermitRootLogin forced-commands-only in /etc/ssh/sshd_config and restart sshd (/etc/init.d/sshd restart) on the client (computer you wish to log into).
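2. Now create a temporary ssh script on the SERVER to find out the correct rsync command to be run on the client. Dirvish only calls this script if the vault's rsh: setting points at it, so temporarily set rsh: /tmp/ssh in that vault's configuration and restore the real ssh line afterwards.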
root # echo -e '#!/bin/sh\necho $@ > /tmp/rsync' > /tmp/ssh
root # chmod +x /tmp/ssh
3. Run Dirvish so we can find the command (it will surely fail in a few seconds) and then print out the results and copy to your clipboard.
root # dirvish --vault <client vault> --init
root # cat /tmp/rsync
You will see something similar to this:
client rsync --server --sender -vlHogDtprx --numeric-ids . /
4. Now edit /root/.ssh/authorized_keys on the CLIENT machine. Paste this in, replacing the rsync command with the one YOU got. Leave off the first word of what you captured; it is the client's hostname, not part of the command.
/root/.ssh/authorized_keys - a forced command line entry
command="<YOUR rsync command>",from="<IP Address or hostname of server>",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa <YOUR big long rsa key from the server>
For further clarity, it should look like this:
/root/.ssh/authorized_keys - an almost real entry with a throw-away key
command="rsync --server --sender -vlHogDtpre.iLsf --numeric-ids . /",from="mysuperdeluxecomputer",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCgFtKBHW5+RZm3e2yj6LnbWC/ZEnFouvDXhbEDnnKO2mdVmJjldut7InPwhfjB8aJszLdCT7V1/fypp4us8HXiOYYU/PWmUbGslNTbuVqDZ9o1xlpjWaanmOllgi0vCPe6ZiifE70HPygvHlAEO8M+sMUmIWmVdOin9u8owExoAYD/SK+xLJR2sTunrUZ1kg0phWHVoCTBnzncIW0lBQZARuU2DD/YdIZOIvYnKjLCYz2Qs/7bN/lw8yMw2nuwquVoJqGSXoX6lzrTwCgMXK93CUEvdIbWtbs6qp5q0Ige2d/1d0WB4MB9vFZMXvAVjfHDnMulKPBt9JqtVXV1Uicf root@mysuperdeluxecomputer
And that's it. You should test it with dirvish --init --vault <client vault>. You'll have to repeat these steps for every machine that has PermitRootLogin set to forced-commands-only.
If this is the only root login you use, you are good to go; just use your (for example) /root/.ssh/id_rsa.pub key. If you do other root logins from that machine, you must create an individual key for each task. Also, if you create a specially named key for this task, be sure to change the rsh: variable in the <vault>/dirvish/dirvish.conf file to reflect the proper key name.
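Since the authorized_keys line is long and easy to mistype, here is a sketch of how it could be assembled from the pieces gathered in the steps above. forced_command_entry is a made-up helper, and the server name and key in the example call are placeholders, not real values.

```shell
# Hypothetical helper: build a forced-command authorized_keys line.
#   $1  text captured in /tmp/rsync (its first word is the client hostname)
#   $2  hostname or IP address of the backup server
#   $3  the server's public key line, e.g. the contents of id_rsa.pub
forced_command_entry() {
    # Drop the leading hostname; what remains is the command rsync runs on the client.
    rsync_cmd=${1#* }
    printf 'command="%s",from="%s",no-port-forwarding,no-X11-forwarding,no-agent-forwarding %s\n' \
        "$rsync_cmd" "$2" "$3"
}

# Placeholder values for illustration only.
entry=$(forced_command_entry \
    'client rsync --server --sender -vlHogDtprx --numeric-ids . /' \
    'backupserver' \
    'ssh-rsa AAAA...example root@backupserver')
echo "$entry"
```

The output is a single line that would be appended to /root/.ssh/authorized_keys on the client.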
Credit for the info in the security section: [1] -- step 5
Dirvish-Cronjob Modifications
While the vanilla dirvish-cronjob script works, it's just a bash script and intentionally made to be modified. So don't be afraid to customize it to your liking. I present mine here, I've been tweaking on it for a few years and it works well for me.
#!/bin/bash
#
# daily cron job for the dirvish package
#
#BANK="/mnt/auto/backup"
# this line saves having to list the location of the backup mount twice, now only in /etc/dirvish/master.conf
# note: {n;n;p;} prints the second line after the line matching "bank";
# adjust the number of n's to match where the path sits in your master.conf
BANK=$(sed -n '/bank/{n;n;p;}' /etc/dirvish/master.conf | sed -e 's/^[ \t]*//')
START="$(date +%Y%m%d)"

if [ ! -x /usr/sbin/dirvish-expire ]; then exit 0; fi
if [ ! -s /etc/dirvish/master.conf ]; then exit 0; fi

# check that backup drive is ready
if [ ! -d "$BANK" ]
then
    su sputnik -c 'ssh phoenix echo "Backup directory is missing"|festival --tts'
#   exit 1 #hey man, where's the email if no backup BANK? try this:
    exit $?
fi

# The meat & taters
/usr/sbin/dirvish-expire --quiet
/usr/sbin/dirvish-runall --quiet

# check for bum backups due to rsync
# this positively notifies me if there is a problem with any backups
# the standard dirvish email isn't so clear about it
# nothing worse than thinking all's well and it's not, just when you need it
VAULT_RSYNC_ERROR=""
for VAULT in $(ls -x "$BANK")
do
    if [ -e "$BANK/$VAULT/$START/rsync_error" ]; then
        VAULT_RSYNC_ERROR="${VAULT_RSYNC_ERROR}${VAULT}\n"
    fi
done

if [ -n "$VAULT_RSYNC_ERROR" ]; then
    # we gots to send somebody email!
    echo -e "The following dirvish vaults had rsync problems.\nNo backups of these vaults were made.\n\
$VAULT_RSYNC_ERROR on $START" 2>&1 | sed '1!b;s/^/To: someemailaddy@somewhere.com\nSubject: rsync errors-some backups were faulty\n\n/' | sendmail -t
fi

# keep local backup copies of vault configurations, a real bummer to lose those
for VAULT in $(ls -x "$BANK")
do
    if [ -e "$BANK/$VAULT/dirvish/default.conf" ] && [ -e "/etc/dirvish/config_backups/$VAULT/dirvish" ]; then
        cp "$BANK/$VAULT/dirvish/default.conf" "/etc/dirvish/config_backups/$VAULT/dirvish"
    fi
done

exit $?
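The rsync_error check is the part of this script most worth trying out before you rely on it. Here is a stand-alone sketch of the same idea that runs against a throw-away directory. failed_vaults is a made-up helper, the vault names are invented, and it uses a shell glob where the script above uses ls.

```shell
# Hypothetical helper: print the vaults whose image for a given day holds an rsync_error file.
#   $1  bank directory
#   $2  image name, e.g. 20141008
failed_vaults() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue        # empty bank: the glob matched nothing
        vault=$(basename "$dir")
        if [ -e "$dir$2/rsync_error" ]; then
            echo "$vault"
        fi
    done
}

# Throw-away bank with one good vault and one failed vault.
bank=$(mktemp -d)
mkdir -p "$bank/laptop-root/20141008" "$bank/laptop-home/20141008"
touch "$bank/laptop-home/20141008/rsync_error"

failed=$(failed_vaults "$bank" 20141008)
echo "$failed"
```

Only laptop-home is reported, because it is the only vault whose image directory contains an rsync_error file.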