This is part two of a series about encrypted file storage/archive systems. My plan is to try out duplicity, git using transparent encryption, s3-based storage systems, git-annex and encfs+sshfs as alternatives to Dropbox/Wuala/Spideroak. The conclusion will be a blog post containing a comparison a.k.a. “executive summary” of my findings. Stay tuned.
Duplicity is a command-line tool similar to rsync: you give it two locations and it synchronizes the first location to the second. Duplicity adds features on top of rsync; the ones most interesting to me are incremental, encrypted backups to remote locations. This form of storage prevents a hoster from gaining any information about my stored data or its metadata (like filenames, etc.).
Duplicity supports multiple storage backends; the most interesting for me were Amazon S3 and SSH/SFTP. All my examples will use the SFTP backend as I tend to have SSH servers lying around.
Using Duplicity
The best way of evaluating a tool is using it:
Setup (on Debian/Ubuntu)
First of all, install the needed software packages. (python-paramiko was needed on Ubuntu 13.04 as the Ubuntu package seems to have broken dependencies.)
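A minimal sketch of the install step, assuming the Debian/Ubuntu package names duplicity and python-paramiko from that era (newer releases may name the Paramiko package differently). The function name install_duplicity is my own; wrapping the command in a function means nothing runs until you call it.

```shell
# Hypothetical helper: installs duplicity plus the Paramiko SSH library.
# The package names are an assumption based on Debian/Ubuntu around 13.04.
install_duplicity() {
    sudo apt-get install duplicity python-paramiko
}
```

Call install_duplicity once on the machine you want to back up.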
Use SSH key-based authentication for communicating with the backup storage server (which will be named backupserver in my examples).
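One way to set this up, sketched with the standard OpenSSH tools. The user name andy is an assumption taken from the remote path used later in this post, and the key file location and the helper name setup_ssh_key are mine.

```shell
# Assumed values: "andy" comes from the remote path used later in this post,
# "backupserver" is the example host name.
REMOTE_USER="andy"
REMOTE_HOST="backupserver"
KEY_FILE="$HOME/.ssh/id_rsa"

# Hypothetical helper: create a key pair if none exists yet, then install
# the public key on the backup server.
setup_ssh_key() {
    # ssh-keygen asks for a passphrase interactively.
    [ -f "$KEY_FILE" ] || ssh-keygen -t rsa -b 4096 -f "$KEY_FILE"
    # Appends the public key to authorized_keys on the server.
    ssh-copy-id -i "$KEY_FILE.pub" "$REMOTE_USER@$REMOTE_HOST"
}
```

After running setup_ssh_key, a plain ssh andy@backupserver should log in without asking for the account password.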
The created backup files will be signed and encrypted with a GPG key, so if you don’t have one, create one:
You can use an existing key (that you are using for email communication) or create a wholly new one which does not have any linkage to your email address. Just make sure that you use a large enough key size; for RSA, 2048 bits is generally considered the minimum and 4096 bits is a common choice.
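Key creation and lookup can be sketched with plain GnuPG commands. The helper names are hypothetical; on GnuPG 2.1 and later, --gen-key uses defaults without asking for a key size, and --full-generate-key brings the full dialog back.

```shell
# Hypothetical helpers around standard GnuPG commands.

# Interactive key generation: gpg prompts for the identity and a passphrase.
create_backup_key() {
    gpg --gen-key
}

# List the public keys so you can read off the key id to pass to duplicity.
show_key_ids() {
    gpg --list-keys
}
```

Run create_backup_key once, then show_key_ids to find the id of the new key.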
Do the Backup
With all that information we can now create the initial backup (I’m using 12345678 as my key id).
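A sketch of such a backup run, assuming the classic duplicity command-line syntax from around that time (newer releases may expect a different form). The user name andy, the sftp:// URL form, and the helper name run_backup are assumptions; the key id and the directories are the ones used in this post.

```shell
# Key id and directories as used in this post. The user name "andy" is an
# assumption; the double slash after the host marks an absolute remote path.
KEY_ID="12345678"
SOURCE_DIR="stuff-to-backup"
TARGET_URL="sftp://andy@backupserver//home/andy/remote-backup"

# Hypothetical helper: without an explicit action, duplicity makes a full
# backup on the first run and an incremental one on later runs.
run_backup() {
    duplicity --encrypt-key "$KEY_ID" --sign-key "$KEY_ID" \
        "$SOURCE_DIR" "$TARGET_URL"
}
```

Calling run_backup again later is all that is needed for the follow-up backups.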
This will back up the directory “stuff-to-backup” onto the storage server in a directory at “/home/andy/remote-backup”. The initial backup will take longer as all data is transferred; subsequent backups will be incremental (i.e. only changed data is transferred).
Query Backup Information
Sometimes you want to check the contents of your remote backups; duplicity can show them without restoring anything.
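Two duplicity actions are useful here, sketched below under the same assumptions as before (user name andy, sftp:// URL form, helper names of my own choosing).

```shell
# Same assumed target as in the backup step.
TARGET_URL="sftp://andy@backupserver//home/andy/remote-backup"

# Hypothetical helper: list the files contained in the most recent backup.
list_backup_files() {
    duplicity list-current-files "$TARGET_URL"
}

# Hypothetical helper: show the full and incremental backup sets on the server.
show_backup_sets() {
    duplicity collection-status "$TARGET_URL"
}
```

list_backup_files answers "is my file in there?", while show_backup_sets answers "which backups exist and when were they made?".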
Restore Files
Time to get our files back.
You can use the -t parameter to restore older versions of files, e.g. -t 3D restores the backup as it was three days ago.
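A restore could look like the following sketch. The target directory restored-stuff and the helper names are hypothetical, and the remote URL carries the same assumptions as in the backup step. Duplicity refuses to restore into an existing directory unless told otherwise, so pick a fresh one.

```shell
# Same assumed remote location as before; RESTORE_DIR is a hypothetical,
# not yet existing local directory.
TARGET_URL="sftp://andy@backupserver//home/andy/remote-backup"
RESTORE_DIR="restored-stuff"

# Hypothetical helper: restore the most recent backup.
restore_latest() {
    duplicity restore "$TARGET_URL" "$RESTORE_DIR"
}

# Hypothetical helper: restore the state as of a given time, e.g. "3D".
restore_as_of() {
    duplicity restore -t "$1" "$TARGET_URL" "$RESTORE_DIR"
}
```

For example, restore_as_of 3D fetches the files as they were three days ago; gpg will ask for the passphrase of your private key.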
Maintenance for your Backup
Backups will grow over time and your online storage might be limited. Thus you might need periodic maintenance to keep your online storage needs low.
Duplicity does incremental backups, so later backups depend on earlier ones. The remove commands make sure that no backup set still needed for restoring a later backup is removed.
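The maintenance actions can be sketched as follows. The helper names and the example arguments are my own, and the remote URL is the same assumed one as before. Without --force, duplicity only lists what it would delete.

```shell
# Same assumed remote location as in the backup step.
TARGET_URL="sftp://andy@backupserver//home/andy/remote-backup"

# Hypothetical helper: delete backup sets older than the given age, e.g. "6M".
remove_older_than() {
    duplicity remove-older-than "$1" --force "$TARGET_URL"
}

# Hypothetical helper: keep only the last N full backups and their increments.
keep_last_full() {
    duplicity remove-all-but-n-full "$1" --force "$TARGET_URL"
}

# Hypothetical helper: remove leftovers from interrupted or failed runs.
cleanup_leftovers() {
    duplicity cleanup --force "$TARGET_URL"
}
```

A periodic job might call remove_older_than 6M followed by cleanup_leftovers; the six months are only an example value.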
Common Key Management Operations
All backups will be signed with your private GPG key and encrypted to your public key, and only the private key can decrypt them. You will lose all data if you lose this key, so take good care of it!
Let’s export the key into a text file (see this page for further information). Keep those files in a secure place.
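A sketch of the export using standard GnuPG options. The output file names and the helper name export_backup_keys are hypothetical; the key id is the one used throughout this post.

```shell
# Key id as used in this post; the file names below are hypothetical.
KEY_ID="12345678"

# Hypothetical helper: write ASCII-armored copies of both key halves into
# the current directory.
export_backup_keys() {
    gpg --export -a "$KEY_ID" > backup-key-public.asc
    gpg --export-secret-keys -a "$KEY_ID" > backup-key-private.asc
    # The private key file must not be readable by other users.
    chmod 600 backup-key-private.asc
}
```

Move the two .asc files to offline storage afterwards; gpg --import reads them back in on a new machine.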
Summary
If you know rsync, duplicity is very easy to use. Its easy operation and setup make it perfect for creating secure backups on the fly, but it is not well suited for synchronizing team data (as there is no conflict management, the later update overwrites conflicting states).
If this is too command-line-y for you, there’s also Deja-Dup, a graphical user interface for duplicity.