... | ... |
@@ -1,13 +1,15 @@ |
1 | 1 |
{% if site.google_analytics_tracking_id %} |
2 |
- <script type="text/javascript"> |
|
3 |
- var _gaq = _gaq || []; |
|
4 |
- _gaq.push(['_setAccount', '{{ site.google_analytics_tracking_id }}']); |
|
5 |
- _gaq.push(['_trackPageview']); |
|
2 |
+<script type="text/javascript"> |
|
6 | 3 |
|
7 |
- (function() { |
|
8 |
- var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; |
|
9 |
- ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; |
|
10 |
- var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); |
|
11 |
- })(); |
|
12 |
- </script> |
|
4 |
+ var _gaq = _gaq || []; |
|
5 |
+ _gaq.push(['_setAccount', '{{ site.google_analytics_tracking_id }}']); |
|
6 |
+ _gaq.push(['_trackPageview']); |
|
7 |
+ |
|
8 |
+ (function() { |
|
9 |
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; |
|
10 |
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; |
|
11 |
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); |
|
12 |
+ })(); |
|
13 |
+ |
|
14 |
+</script> |
|
13 | 15 |
{% endif %} |
14 | 16 |
new file mode 100644 |
... | ... |
@@ -0,0 +1,135 @@ |
0 |
+--- |
|
1 |
+layout: post |
|
2 |
+title: "Encrypted remote backup with rsync and dm-crypt: Part 1/2" |
|
3 |
+date: 2013-02-20 22:41 |
|
4 |
+comments: true |
|
5 |
+categories: [server, shell] |
|
6 |
+cover: /images/cover/avatar.png |
|
7 |
+keywords: rsync, backup, timemachine, linux, ssh, hardlinks, encrypt |
|
8 |
+description: Backing up data to server |
|
9 |
+--- |
|
10 |
+ |
|
11 |
+# Choose the right tool |
|
12 |
+ |
|
13 |
+Rsync is an excellent tool for backups. It can transfer data to a remote machine |
|
14 |
+securely over SSH, and its ```--link-dest``` option hard-links unchanged files to a |
|
15 |
+previous backup, so identical files are stored only once (see [hard links](http://en.wikipedia.org/wiki/Hard_link)). |
|
16 |
+By the way, the proprietary [Apple Time Machine](http://www.apple.com/osx/apps/#time-machine) works in a similar way. |
|
17 |
+ |
|
18 |
+# Start writing a script |
|
19 |
+ |
|
20 |
+We will use several other useful options besides ```--link-dest```. Here's a simplified version of my |
|
21 |
+backup script: |
|
22 |
+ |
|
23 |
+{% codeblock lang:bash %}{% raw %} |
|
24 |
+#!/usr/bin/env bash |
|
25 |
+date=`date "+%Y-%m-%d_%H:%M"` |
|
26 |
+ |
|
27 |
+rsync -av \ |
|
28 |
+ --delete \ |
|
29 |
+ --delete-excluded \ |
|
30 |
+ --compress-level=9 \ |
|
31 |
+ --numeric-ids \ |
|
32 |
+ --rsync-path="sudo rsync" \ |
|
33 |
+ --exclude-from=/root/.rsync/home-cinan \ |
|
34 |
+ --link-dest=/mnt/current-backup/home \ |
|
35 |
+ /home/cinan -e ssh \ |
|
36 |
+ sync-user@machine:incomplete_backup-$date/home/ |
|
37 |
+ |
|
38 |
+ssh sync-user@machine "mv incomplete_backup-$date backup-$date && rm -rf current-backup && ln -s backup-$date current-backup" |
|
39 |
+{% endraw %}{% endcodeblock %} |
|
40 |
+ |
|
41 |
+<!-- more --> |
|
42 |
+ |
|
43 |
+Explanation: |
|
44 |
+ |
|
45 |
+- ```-a``` typical option for backup, archive mode (recursive copy, copy symlinks |
|
46 |
+ as symlinks, preserve ownership). Preserving ownership works only if rsync |
|
47 |
+ is run on our backup machine with root privileges -- see the ```--rsync-path``` option. |
|
48 |
+- ```-v``` be verbose, but not too much |
|
49 |
+- ```--delete``` delete extraneous files from destination directories |
|
50 |
+- ```--delete-excluded``` also delete excluded files from destination directories |
|
51 |
+- ```--compress-level=9``` highest compression level (CPU is fast, network ain't) |
|
52 |
+- ```--numeric-ids``` don't map uid/gid to users/groups |
|
53 |
+- ```--rsync-path=<some-path>``` command used to start rsync on the backup machine. To |
|
54 |
+ preserve ownership, rsync has to run as root there. |
|
55 |
+- ```--exclude-from=<some-path>``` exclude some directories and files from backup |
|
56 |
+- ```--link-dest=<some-path>``` magic. Hardlink unchanged files to files in |
|
57 |
+ \<some-path\> directory |
|
58 |
+- ```/home/cinan``` directory to back up |
|
59 |
+- ```-e ssh``` specify the remote shell to use |
|
60 |
+- ```<path>``` directory where the backup will be saved |
|
61 |
+- ```mv ... && rm ... && ln``` mark the newest backup as complete, delete the old |
|
62 |
+ ```current-backup``` link (don't worry, you won't lose any data, it's just a symlink) |
|
63 |
+ and symlink the newest backup as ```current-backup``` (future backups will use |
|
64 |
+ this directory in ```--link-dest```). |
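
The rotation step at the end of the list can be tried locally in a scratch directory. This is only a minimal sketch: the temporary directory and the fixed `date` value are assumptions for the demo, and it uses `rm -f` because `current-backup` is just a symlink.

```shell
# Scratch directory so nothing real is touched (illustrative only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

date="2013-02-20_22:41"          # fixed value for the demo
mkdir -p "incomplete_backup-$date/home"

# Mark the backup as complete and repoint the current-backup symlink.
mv "incomplete_backup-$date" "backup-$date" \
  && rm -f current-backup \
  && ln -s "backup-$date" current-backup

# Show where the symlink points now.
readlink current-backup
```

If the `mv` fails, the `&&` chain stops, so `current-backup` keeps pointing at the last complete backup.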
|
65 |
+ |
|
66 |
+*The complete script is at the bottom of this article.* |
|
67 |
+ |
|
68 |
+Keep an eye on slashes! There's a huge difference between /home/cinan and |
|
69 |
+/home/cinan/ (source directory). Without the final slash, rsync will copy the |
|
70 |
+directory in its entirety. With the trailing slash, it will copy the contents of |
|
71 |
+the directory but won't recreate the directory. |
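
The difference is easy to see with two local copies. A small sketch, assuming rsync is installed; all paths under the temporary directory are throwaway names made up for the demo.

```shell
# Requires rsync; skip the demo quietly if it is not available.
command -v rsync >/dev/null || { echo "rsync not installed"; exit 0; }

workdir=$(mktemp -d)
mkdir -p "$workdir/src/cinan" "$workdir/without" "$workdir/with"
echo hello > "$workdir/src/cinan/file.txt"

# No trailing slash: the directory itself is recreated in the destination.
rsync -a "$workdir/src/cinan"  "$workdir/without/"

# Trailing slash: only the contents of the directory are copied.
rsync -a "$workdir/src/cinan/" "$workdir/with/"

ls "$workdir/without"   # lists the directory: cinan
ls "$workdir/with"      # lists the file: file.txt
```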
|
72 |
+ |
|
73 |
+# A little bit of security |
|
74 |
+ |
|
75 |
+Before running the script, spend a little more time on security. The |
|
76 |
+```--rsync-path``` option can be quite dangerous, since it runs rsync as root on the server. Set up rrsync there first. |
|
77 |
+Basically, it allows sync-user to run rsync only inside a defined directory, something |
|
78 |
+like a chroot environment. Read more about rrsync [here](http://www.v13.gr/blog/?p=216). |
|
79 |
+ |
|
80 |
+Now you can run the script. The first run can take a long time, but future backups are |
|
81 |
+incremental, so rsync will transmit only changed files. |
|
82 |
+ |
|
83 |
+# Impact of hard links |
|
84 |
+ |
|
85 |
+Every run of the backup process creates a new directory called backup-$date. |
|
86 |
+Thanks to that, it's really easy to get files as they were on 22nd Oct 2013 or 29th Nov 2013. |
|
87 |
+It is also a space-efficient solution because of hard links. If file ```dir/a``` |
|
88 |
+exists unchanged in the backups from 1st Jan 2013 and 10th Feb 2013, the data of the file |
|
89 |
+is stored on disk only once. However, deleting the file from the January |
|
90 |
+backup directory does not delete it from the February backup directory -- |
|
91 |
+the file in the February backup is preserved. |
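
This behaviour can be checked with plain `ln`, without rsync. A small sketch; the directory names `january` and `february` are made up for the demo and only stand in for two backup directories.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir january february

echo "important data" > january/a
ln january/a february/a     # second name for the same data, no copy is made

# Both names refer to the same inode.
[ january/a -ef february/a ] && echo "same file"

rm january/a                # remove the January name only
cat february/a              # the data is still reachable
```

The data is freed only when the last hard link to it is removed, which is why old backups can be deleted without damaging newer ones.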
|
92 |
+ |
|
93 |
+Look at the directory sizes. The first backup is the biggest; the other directories take up only the space of the files that changed. |
|
94 |
+{% img center /images/backup-2.png Size of my backups %} |
|
95 |
+ |
|
96 |
+# More complete script |
|
97 |
+ |
|
98 |
+It isn't very robust, but it works and I'm happy with it. |
|
99 |
+ |
|
100 |
+{% codeblock lang:bash %}{% raw %} |
|
101 |
+#!/usr/bin/env bash |
+date=`date "+%Y-%m-%d_%H:%M"` |
+ |
+cmd="rsync -av \ |
|
102 |
+ --delete \ |
|
103 |
+ --delete-excluded \ |
|
104 |
+ --compress-level=9 \ |
|
105 |
+ --numeric-ids \ |
|
106 |
+ --rsync-path=\"sudo rsync\" \ |
|
107 |
+ --exclude-from=/root/.rsync/EXCLUDE_FROM \ |
|
108 |
+ --link-dest=~/current-backup/TO \ |
|
109 |
+ FROM -e ssh backup@cinan.remote:incomplete_backup-$date/TO/" |
|
110 |
+ |
|
111 |
+# tuples: EXCLUDE_FROM, FROM, TO (slashes are escaped because the values are substituted with sed below) |
|
112 |
+paths=( "home-cinan-exclude" "\/home\/cinan" "home" |
|
113 |
+ "root-exclude" "\/root\/" "root" |
|
114 |
+ "empty" "\/var\/spool" "var" |
|
115 |
+ "empty" "\/var\/lib\/pacman" "var\/lib" |
|
116 |
+ "empty" "\/boot\/" "boot" |
|
117 |
+) |
|
118 |
+let "paths_peak=${#paths[@]} / 3 - 1" |
|
119 |
+ |
|
120 |
+for i in `seq 0 $paths_peak`; do |
|
121 |
+ EXCLUDE_FROM=${paths[$i*3+0]} |
|
122 |
+ FROM=${paths[$i*3+1]} |
|
123 |
+ TO=${paths[$i*3+2]} |
|
124 |
+ |
|
125 |
+ ssh backup@cinan.remote "mkdir -p incomplete_backup-$date/$TO" |
|
126 |
+ eval `echo $cmd | sed -e 's/EXCLUDE_FROM/'"$EXCLUDE_FROM"'/;s/FROM/'"$FROM"'/;s/TO/'"$TO"'/g'` 2>> /tmp/system_backup_errors |
|
127 |
+done |
|
128 |
+ |
|
129 |
+ssh backup@cinan.remote "mv incomplete_backup-$date backup-$date && rm -rf current-backup && ln -s backup-$date current-backup" |
|
130 |
+{% endraw %}{% endcodeblock %} |
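
For comparison, here is a hypothetical variation of the loop that walks the same (EXCLUDE_FROM, FROM, TO) triples with plain bash array indexing instead of `sed` and `eval`, so the paths need no escaped slashes. It is only a sketch, not the script above: the rsync options are shortened, `dest` is a placeholder host, and the commands are collected and printed rather than executed.

```shell
#!/usr/bin/env bash
# Triples: EXCLUDE_FROM, FROM, TO (a shortened, illustrative list).
paths=( "home-cinan-exclude" "/home/cinan" "home"
        "root-exclude"       "/root/"      "root" )

commands=()
# Step through the flat array three elements at a time.
for (( i = 0; i < ${#paths[@]}; i += 3 )); do
    exclude_from=${paths[i]}
    from=${paths[i+1]}
    to=${paths[i+2]}
    # "dest" is a placeholder host; the command is only recorded, not run.
    commands+=( "rsync -av --exclude-from=/root/.rsync/$exclude_from $from dest:$to/" )
done

printf '%s\n' "${commands[@]}"
```

Building the command from variables directly avoids the problem that a `sed` replacement such as `s/FROM/.../` can also match text inside an already substituted path.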
|
131 |
+ |
|
132 |
+# I want my data encrypted |
|
133 |
+ |
|
134 |
+Check out part 2/2. |