| ... | ... | @@ -1,13 +1,15 @@ | 
| 1 | 1 |  {% if site.google_analytics_tracking_id %} | 
| 2 | - <script type="text/javascript"> | |
| 3 | - var _gaq = _gaq || []; | |
| 4 | -    _gaq.push(['_setAccount', '{{ site.google_analytics_tracking_id }}']); | |
| 5 | - _gaq.push(['_trackPageview']); | |
| 2 | +<script type="text/javascript"> | |
| 6 | 3 |  | 
| 7 | -    (function() { | |
| 8 | -      var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; | |
| 9 | -      ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; | |
| 10 | -      var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); | |
| 11 | - })(); | |
| 12 | - </script> | |
| 4 | + var _gaq = _gaq || []; | |
| 5 | +  _gaq.push(['_setAccount', '{{ site.google_analytics_tracking_id }}']); | |
| 6 | + _gaq.push(['_trackPageview']); | |
| 7 | + | |
| 8 | +  (function() { | |
| 9 | +    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; | |
| 10 | +    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; | |
| 11 | +    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); | |
| 12 | + })(); | |
| 13 | + | |
| 14 | +</script> | |
| 13 | 15 |  {% endif %} | 
| 14 | 16 | new file mode 100644 | 
| ... | ... | @@ -0,0 +1,135 @@ | 
| 0 | +--- | |
| 1 | +layout: post | |
| 2 | +title: "Encrypted remote backup with rsync and dm-crypt: Part 1/2" | |
| 3 | +date: 2013-02-20 22:41 | |
| 4 | +comments: true | |
| 5 | +categories: [server, shell] | |
| 6 | +cover: /images/cover/avatar.png | |
| 7 | +keywords: rsync, backup, timemachine, linux, ssh, hardlinks, encrypt | |
| 8 | +description: Backing up data to server | |
| 9 | +--- | |
| 10 | + | |
| 11 | +# Choose the right tool | |
| 12 | + | |
| 13 | +Rsync is an excellent tool for backups. It can transfer data to a remote | |
| 14 | +machine securely over SSH, and its ```--link-dest``` option hard-links unchanged | |
| 15 | +files to a previous backup, so they are not duplicated in the filesystem (see [hard links](http://en.wikipedia.org/wiki/Hard_link)). | |
| 16 | +By the way, the proprietary [Apple Time Machine](http://www.apple.com/osx/apps/#time-machine) works the same way. | |
| 17 | + | |
| 18 | +# Start writing a script | |
| 19 | + | |
| 20 | +Of course, we will use several other useful options besides ```--link-dest```. Here's a simplified version of my | |
| 21 | +backup script: | |
| 22 | + | |
| 23 | +{% codeblock lang:bash %}{% raw %} | |
| 24 | +#!/usr/bin/env bash | |
| 25 | +date=`date "+%Y-%m-%d_%H:%M"` | |
| 26 | + | |
| 27 | +rsync -av \ | |
| 28 | + --delete \ | |
| 29 | + --delete-excluded \ | |
| 30 | + --compress-level=9 \ | |
| 31 | + --numeric-ids \ | |
| 32 | + --rsync-path="sudo rsync" \ | |
| 33 | + --exclude-from=/root/.rsync/home-cinan \ | |
| 34 | + --link-dest=/mnt/current-backup/home \ | |
| 35 | +  /home/cinan -e ssh \ | |
| 36 | + sync-user@machine:incomplete_backup-$date/home/ | |
| 37 | + | |
| 38 | +ssh sync-user@machine "mv incomplete_backup-$date backup-$date && rm -rf current-backup && ln -s backup-$date current-backup" | |
| 39 | +{% endraw%}{% endcodeblock %} | |
| 40 | + | |
| 41 | +<!-- more --> | |
| 42 | + | |
| 43 | +Explanation: | |
| 44 | + | |
| 45 | +- ```-a``` the typical option for backups: archive mode (recursive copy, copy symlinks | |
| 46 | +  as symlinks, preserve ownership). Preserving ownership works only if rsync | |
| 47 | +  runs on our backup machine with root rights -- see the ```--rsync-path``` option. | |
| 48 | +- ```-v``` be verbose, but not too much | |
| 49 | +- ```--delete``` delete extraneous files from destination directories | |
| 50 | +- ```--delete-excluded``` also delete excluded files from destination directories | |
| 51 | +- ```--compress-level=9``` highest compression level (CPU is fast, network ain't) | |
| 52 | +- ```--numeric-ids``` transfer numeric uid/gid values instead of mapping them by user/group name | |
| 53 | +- ```--rsync-path=<some-path>``` the command used to start rsync on the backup machine; | |
| 54 | +  to preserve ownership it has to run as root there, hence ```sudo rsync```. | |
| 55 | +- ```--exclude-from=<some-path>``` exclude some directories and files from backup | |
| 56 | +- ```--link-dest=<some-path>``` magic. Hardlink unchanged files to files in | |
| 57 | + \<some-path\> directory | |
| 58 | +- ```/home/cinan``` directory to backup | |
| 59 | +- ```-e ssh``` specify the remote shell to use | |
| 60 | +- ```<path>``` directory where the backup will be saved | |
| 61 | +- ```mv ... && rm ... && ln``` mark the newest backup as complete. Delete the old | |
| 62 | +  current-backup link (don't worry, you won't lose your data, it's just a symlink) | |
| 63 | +  and symlink the newest backup to the current-backup directory (useful for future | |
| 64 | +  backups, which will use this directory in ```--link-dest```). | |
| 65 | + | |
| 66 | +*At the end of this article I'll show a more complete script.* | |
| 67 | + | |
| 68 | +Keep an eye on slashes! There's a huge difference between /home/cinan and | |
| 69 | +/home/cinan/ (source directory). Without the final slash, rsync will copy the | |
| 70 | +directory in its entirety. With the trailing slash, it will copy the contents of | |
| 71 | +the directory but won't recreate the directory. | |
| 72 | + | |
| 73 | +# A little bit of security | |
| 74 | + | |
| 75 | +Before running the script, spend a little more time on security. The | |
| 76 | +```--rsync-path``` option can be quite dangerous, because it runs rsync as root on the backup machine. Set up rrsync on the server side first. | |
| 77 | +Basically, it allows sync-user to run rsync only inside a defined, chroot-like | |
| 78 | +environment. Read more about rrsync [here](http://www.v13.gr/blog/?p=216). | |
| 79 | + | |
| 80 | +Now you can run the script. The first run can take a long time, but subsequent backups are | |
| 81 | +incremental, so rsync will transmit only changed files. | |
| 82 | + | |
| 83 | +# Impact of hard links | |
| 84 | + | |
| 85 | +Every run of the backup process creates a new directory called backup-$date. | |
| 86 | +Thanks to that, it's really easy to get files from 22nd Oct 2013 or 29th Nov 2013. | |
| 87 | +It is also a space-efficient solution because of hard links. If the file ```dir/a``` | |
| 88 | +exists unchanged in the backup from 1st Jan 2013 and in the one from 10th Feb 2013, its data | |
| 89 | +is stored on the HDD only once. However, deleting a file from the January | |
| 90 | +backup directory does not delete the same file from the February backup directory -- | |
| 91 | +the February copy is preserved. | |
| 92 | + | |
| 93 | +Look at the directory sizes. The first backup is the biggest; the other directories take up only the size of the changes. | |
| 94 | +{% img center /images/backup-2.png Size of my backups %} | |
| 95 | + | |
| 96 | +# More complete script | |
| 97 | + | |
| 98 | +It isn't very robust, but it works and I'm happy with it. | |
| 99 | + | |
| 100 | +{% codeblock lang:bash %}{% raw %} | |
| 101 | +cmd="rsync -av \ | |
| 102 | + --delete \ | |
| 103 | + --delete-excluded \ | |
| 104 | + --compress-level=9 \ | |
| 105 | + --numeric-ids \ | |
| 106 | + --rsync-path=\"sudo rsync\" \ | |
| 107 | + --exclude-from=/root/.rsync/EXCLUDE_FROM \ | |
| 108 | + --link-dest=~/current-backup/TO \ | |
| 109 | + FROM -e ssh backup@cinan.remote:incomplete_backup-$date/TO/" | |
| 110 | + | |
| 111 | +#tuples EXCLUDE_FROM, FROM, TO | |
| 112 | +paths=( "home-cinan-exclude" "\/home\/cinan" "home" | |
| 113 | + "root-exclude" "\/root\/" "root" | |
| 114 | + "empty" "\/var\/spool" "var" | |
| 115 | + "empty" "\/var\/lib\/pacman" "var\/lib" | |
| 116 | + "empty" "\/boot\/" "boot" | |
| 117 | +) | |
| 118 | +let "paths_peak=${#paths[@]} / 3 - 1" | |
| 119 | + | |
| 120 | +for i in `seq 0 $paths_peak`; do | |
| 121 | +  EXCLUDE_FROM=${paths[$i*3+0]} | |
| 122 | +  FROM=${paths[$i*3+1]} | |
| 123 | +  TO=${paths[$i*3+2]} | |
| 124 | + | |
| 125 | + ssh backup@cinan.remote "mkdir -p incomplete_backup-$date/$TO" | |
| 126 | + eval `echo $cmd | sed -e 's/EXCLUDE_FROM/'"$EXCLUDE_FROM"'/;s/FROM/'"$FROM"'/;s/TO/'"$TO"'/g'` 2>> /tmp/system_backup_errors | |
| 127 | +done | |
| 128 | + | |
| 129 | +ssh backup@cinan.remote "mv incomplete_backup-$date backup-$date && rm -rf current-backup && ln -s backup-$date current-backup" | |
| 130 | +{% endraw %}{% endcodeblock %} | |
| 131 | + | |
| 132 | +# I want my data encrypted | |
| 133 | + | |
| 134 | +Check out part 2/2. | |
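
The claim in "Impact of hard links" (deleting a file from the January backup leaves the February copy intact) can be checked without rsync at all. The following is only a minimal sketch: the scratch directory and the `backup-jan`/`backup-feb` names are made up for illustration, and a plain `ln` stands in for what `--link-dest` does for unchanged files.

```shell
#!/usr/bin/env bash
set -eu

# Scratch directory standing in for the backup disk (hypothetical layout)
demo=$(mktemp -d)
mkdir -p "$demo/backup-jan/dir" "$demo/backup-feb/dir"

# The "January" backup holds the real file
echo "important data" > "$demo/backup-jan/dir/a"

# Hard-link the unchanged file into "February", as --link-dest would
ln "$demo/backup-jan/dir/a" "$demo/backup-feb/dir/a"

# The second column of `ls -l` is the number of hard links to the inode
links_before=$(ls -l "$demo/backup-feb/dir/a" | awk '{print $2}')

# Remove the January name; the data stays reachable through the February name
rm "$demo/backup-jan/dir/a"
links_after=$(ls -l "$demo/backup-feb/dir/a" | awk '{print $2}')
content_after=$(cat "$demo/backup-feb/dir/a")

echo "hard links before delete: $links_before"
echo "hard links after delete:  $links_after"
echo "February copy contains:   $content_after"

# Clean up the scratch directory
rm -rf "$demo"
```

The link count drops from 2 to 1 and the February file still reads back its content, which is why old backup directories can be removed one by one without damaging newer ones.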