Mounting Amazon Cloud Drive and Uploading Encrypted Data

12 Aug

I have about 5tb worth of files I need to back up online. With 5tb of data, solutions like Dropbox, Box.com, Google Drive, Microsoft OneDrive, etc. aren’t viable. Technically, with Google Drive you could pay $10/tb, so $50/month would get me 5tb of storage. That’s still a lot just for backups.

There are “commercial” or “enterprise” solutions like Amazon S3 or Glacier:

– Standard S3 storage that’d cost $148 a month
– S3 Standard – Infrequent Access storage that’d cost $62.50 a month (wouldn’t work too well for me since I’d be accessing it frequently)
– Glacier storage that’d cost $35 a month (but I’d have difficulty accessing my data quickly/on-demand)

I know, I’m cheap, but then I discovered Amazon Cloud Drive, $60/year for “unlimited” storage. That’s essentially $5 a month, way cheaper than any of the above solutions, and you can easily access your files whenever you want.
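To put those numbers side by side, here is a quick sketch that annualizes the monthly prices quoted above. The prices are just the ones from this post (for roughly 5tb); the function name annual_cost is my own, and nothing here talks to any pricing API.

```shell
#!/bin/sh
# Monthly prices (USD) for ~5tb as quoted in this post.
# Amazon Cloud Drive is $60/year, entered here as $5/month.

annual_cost() {
    # $1 = monthly price in USD; prints the yearly total with two decimals
    awk -v m="$1" 'BEGIN { printf "%.2f\n", m * 12 }'
}

for entry in \
    "Google Drive:50" \
    "S3 Standard:148" \
    "S3 Infrequent Access:62.50" \
    "Glacier:35" \
    "Amazon Cloud Drive:5"
do
    name=${entry%%:*}      # text before the colon
    monthly=${entry##*:}   # text after the colon
    printf '%-22s $%s/year\n' "$name" "$(annual_cost "$monthly")"
done
```

Even Glacier, the cheapest of the S3 family here, works out to $420 a year against $60 for Cloud Drive.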

There are a few caveats though. Cloud Drive does not offer standard protocols like WebDAV/FTP/SFTP; it uses a proprietary, web-driven protocol. Luckily the development community has come up with solutions, reverse engineering the Amazon Cloud Drive protocol and making it possible to access your files from other tools. A friend of mine uses NetDrive on Windows to mount his Cloud Drive as a drive letter. But I’m a Linux guy, so in steps acd_cli (aka acdcli).

ACDCLI

Technically it’s acd_cli.py, a Python library/class that lets you interact with Amazon Cloud Drive through the “acdcli” script. Essentially you can do things like “acdcli ls” (to list files), “acdcli upload” (to upload a file), “acdcli download” (to download a file) and “acdcli rm” (to delete a file).

More important is the “acdcli mount” command, which lets you mount your Amazon Cloud Drive to a path (like /amazon_cloud_drive). This uses FUSE (Filesystem in Userspace): it looks like a regular Linux mount, but the filesystem logic runs in a user-space program rather than in the kernel itself. This roughly translates to it not being as reliable/stable/fast as a regular mount, but it works. You won’t be able to use it like a “normal” mount/filesystem, so don’t go expecting file/group ownership, permissions, symbolic links, etc. It’s limited; mostly it’s good for basic file storage/retrieval.

Read more about it at https://github.com/yadayada/acd_cli

It’s fine though, all I intend to store on my Amazon Cloud Drive is photos, videos, backup files, etc. I don’t plan on putting anything advanced on it or interacting with it in any advanced capacity.

With acdcli you will first need to authenticate it to your Amazon account. You could run “acdcli init”, but that assumes it can launch a browser and point you at a specific URL. That URL turns out not to be so “lynx” friendly, so I ended up copying/pasting it into a real browser on my computer. Essentially you authorize acdcli to access your Amazon account and you get an “oauth_data” file. Copy that file to your Linux server into ~/.cache/acd_cli/oauth_data

Then run “acdcli sync”, which syncs acdcli with Amazon Cloud Drive. acdcli keeps a local listing of all the files that are on your cloud drive. I’m not sure why, maybe it’s a speed thing, so that it doesn’t have to talk to Amazon Cloud Drive every single time you do an “ls” or something. But just do it.

Then run “mkdir ~/amazon_cloud_drive”  followed by “acdcli mount -ao ~/amazon_cloud_drive”  and bingo. Now if you look in ~/amazon_cloud_drive/  you’ll see your files. Easy right?
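Here is a rough sketch of the steps above (drop oauth_data into place, sync, mount) as one script. The names OAUTH_FILE, MOUNT_POINT, DRY_RUN and the run helper are my own inventions for illustration, not part of acdcli. It defaults to a dry run that only prints the commands, so nothing touches your account until you set DRY_RUN=0.

```shell
#!/bin/sh
# Assumptions: oauth_data was downloaded to $OAUTH_FILE (hypothetical path),
# and acdcli is already installed and on the PATH.
DRY_RUN=${DRY_RUN:-1}
OAUTH_FILE=${OAUTH_FILE:-"$HOME/oauth_data"}
MOUNT_POINT=${MOUNT_POINT:-"$HOME/amazon_cloud_drive"}

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_cloud_drive() {
    run mkdir -p "$HOME/.cache/acd_cli" &&
    run cp "$OAUTH_FILE" "$HOME/.cache/acd_cli/oauth_data" &&
    run acdcli sync &&                    # refresh the local file listing
    run mkdir -p "$MOUNT_POINT" &&
    run acdcli mount -ao "$MOUNT_POINT"   # -ao as used in this post
}

setup_cloud_drive
```

Run it once as-is to check the printed commands, then again with DRY_RUN=0 to actually do it.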

ENCFS

The issue I had with Amazon Cloud Drive came up when I read through their privacy policy, or terms of service, or whatever they call it. In the middle of it, hidden in legalese, is wording that led me to believe Amazon support engineers could look into your data. While I don’t have anything to hide, I also don’t want to expose all my personal files to just anybody. I’d rather Amazon not be able to see my data, so… encryption. I discovered “encfs” thanks to another guide/article. encfs is also a FUSE type of mount, but it encrypts/decrypts on the fly from one directory to another. You point encfs at a “source” directory, where the encrypted files will be stored, and at a “destination” directory where you work with those files like normal.

First just run “encfs” to set up a key and a passphrase for the key. You could technically skip the passphrase since we won’t be uploading that key to the cloud drive, but I prefer to passphrase protect it just in case. Once you have the key, which is an xml file called encfs.xml, you can do something like “export ENCFS6_CONFIG=$HOME/encfs.xml” or whatever (don’t wrap a ~ in quotes there, or the shell won’t expand it). Once you do that you can work with encfs more easily.

Now run “mkdir ~/acd” followed by “encfs ~/amazon_cloud_drive ~/acd”, enter your passphrase, and done! Now you can place whatever files you want into ~/acd/ and they’ll be stored on your Amazon Cloud Drive encrypted.
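One shell detail is worth a tiny demo, because it’s easy to trip over when setting ENCFS6_CONFIG: a tilde inside quotes is not expanded, so the variable would hold a literal “~”. This sketch just prints both forms; the variable names are mine, and the actual encfs commands are left commented out so it has no side effects.

```shell
#!/bin/sh
# A quoted tilde stays a literal "~" character.
literal='~/encfs.xml'
# $HOME inside double quotes expands to an absolute path.
expanded="$HOME/encfs.xml"

echo "quoted tilde: $literal"
echo "using \$HOME:  $expanded"

# With the expanded form, the mount steps from this post would be:
#   export ENCFS6_CONFIG="$HOME/encfs.xml"
#   mkdir -p "$HOME/acd"
#   encfs "$HOME/amazon_cloud_drive" "$HOME/acd"
```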

EXAMPLE

Below you’ll see that “/amazon_cloud_drive” contains the Documents/Pictures/Videos folders Amazon automatically created for us, plus 2 folders with strange characters in their names; even the folder names and filenames are encrypted. When I look in /acd, I see that those 2 encrypted folders are backups/ and photos/. The bottom line is, it works. Amazon has no idea what those files or folders are, so they’re safe from prying eyes. Again, I have nothing to hide, it’s the principle of it all that gets to me. They don’t need to see photos of my children, or backups of a computer I had 3 years ago.

# acdcli mount /amazon_cloud_drive
# encfs /amazon_cloud_drive /acd
Password: ***********************

# ls /amazon_cloud_drive
Documents/
Pictures/
SoDvpAGY4PDKDLszKyLiG-3A/
Videos/
WkL1wU6ZrKhDuGnQWakyZE2q/

# ls /amazon_cloud_drive/SoDvpAGY4PDKDLszKyLiG-3A
08dKwABwrt8jqeJgHys/
NvZd KedFXEq5dj9HcP2f6zaD4nXI/
0d8tnRvFHXMegGZsgyTWxODy

# ls /acd
backups/
photos/

# ls /acd/photos
Family2010/
Family2011/
myavatar.jpg

SPEED

So what kind of speed am I getting? Well… right now I have 2 servers pushing my files to the Cloud Drive. One server is on a 1gbps connection (it’s in Germany and tends to have speed and packet loss issues) and it’s consistently uploading at around 30mbps. My other server is uploading to the Cloud Drive at the same time over a 100mbps connection (but in the U.S.) and it’s able to consistently upload at around 45mbps. So it’s pretty good in general. It is taking several weeks to upload 5tb, but I don’t mind. I’m using rsync to sync the data, and afterwards I’ll have a cronjob run rsync once a day to sync only the changes.
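For a sanity check on those numbers, here is a back-of-the-envelope estimate of how long 5tb takes at a given sustained rate. It assumes decimal units (1tb = 10^12 bytes, 1mbps = 10^6 bits per second), a rate that never dips, and, for the combined figure, that the two servers split the data with no overlap. The function name upload_days is my own.

```shell
#!/bin/sh
upload_days() {
    # $1 = data size in tb (10^12 bytes), $2 = sustained rate in mbps
    # bits to send / bits per second / seconds per day
    awk -v tb="$1" -v mbps="$2" \
        'BEGIN { printf "%.1f\n", (tb * 8e12) / (mbps * 1e6) / 86400 }'
}

echo "5tb at 30mbps: $(upload_days 5 30) days"
echo "5tb at 45mbps: $(upload_days 5 45) days"
echo "5tb at 75mbps: $(upload_days 5 75) days (both servers combined)"
```

That works out to roughly 15 days at 30mbps, 10 days at 45mbps and about 6 days combined, which suggests that a multi-week upload is losing a fair amount of time to stalls and restarts rather than raw bandwidth.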

For download speed, I’m getting close to double the upload speed: 60mbps from Germany and 90mbps from the U.S. server. That 90mbps may be able to go faster, but remember, I’m only on a 100/100 connection. So… not bad! Technically it’s fast enough to stream video off of. Theoretically you could put Plex on your server, point it to the ~/acd/ folder where your personal family videos and photos are, and then access those photos/videos from anywhere.

CAVEATS

So it’s not all sunshine and rainbows. More than once my uploads have frozen/stopped. I’ve had to ctrl+c the rsync process, “umount -l” the /acd and /amazon_cloud_drive folders and kill any leftover rsync/acdcli processes. Kind of a pain. I’ve then had to run “acdcli sync”, mount the folders up again and restart the rsync process.
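Since I end up repeating that recovery dance, here is a sketch of it as a script. It is only an outline of the steps described above: CLOUD_DIR, PLAIN_DIR, DRY_RUN and the run helper are names I made up, and by default it only prints what it would do. Restarting rsync is left out because the source and destination paths depend on your setup.

```shell
#!/bin/sh
DRY_RUN=${DRY_RUN:-1}
CLOUD_DIR=${CLOUD_DIR:-/amazon_cloud_drive}   # acdcli mount point
PLAIN_DIR=${PLAIN_DIR:-/acd}                  # encfs (decrypted) view

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remount_cloud_drive() {
    run umount -l "$PLAIN_DIR"    # encfs layer first
    run umount -l "$CLOUD_DIR"    # then the acdcli mount beneath it
    run pkill rsync               # leftover rsync processes, if any
    run acdcli sync &&
    run acdcli mount "$CLOUD_DIR" &&
    run encfs "$CLOUD_DIR" "$PLAIN_DIR"   # prompts for the passphrase
}

remount_cloud_drive
```

The unmount order matters: /acd sits on top of /amazon_cloud_drive, so it has to come off first and go back on last.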

Also, one time I was getting error codes from Amazon’s servers when I ran the “acdcli sync” command. Some searching seemed to indicate that it’s a known issue and that something on Amazon’s side may be blocking your authentication. The solution was simply to delete the oauth_data file, re-authorize acdcli using your browser and put the new oauth_data file into place. I’ve had to do that once, but ever since then things do seem to be working a bit better. It could be just because I’m uploading so much data, relentlessly pushing file after file after file. Maybe it’ll be more stable after all my files are up on the Amazon Cloud Drive. I imagine I’ll still need to update the oauth_data file once every month or two.

I really don’t know how Amazon is going to react to 5tb of storage. However, I’ve read in forums about several other users who’ve stored 5tb, 10tb, 50tb and even 100tb of data on their cloud drive and haven’t gotten any complaints from Amazon. That’s insane, 100tb? Even I’d feel guilty about that. That 100tb figure came from a user who was complaining that acdcli was limited to 100tb of storage; it turns out he’d had close to 100tb of files stored on his cloud drive for many months. Wow. So hopefully my measly 5tb won’t raise an eyebrow.

But my general advice to anybody attempting the above setup: back up everything. Use Amazon Cloud Drive as a backup location, and don’t keep your “only copy” of a file on Amazon Cloud Drive. According to their AUP (Acceptable Use Policy), they do reserve the right to cancel the account of any abusers. Remember, there’s no such thing as truly “unlimited”.

Also, the fact that the files are encrypted may throw up some red flags; everyone’s natural reaction is “what are you hiding?”. I suggest using a combination: anything non-important or generic, store in your “amazon_cloud_drive/” folder unencrypted. Anything you prefer to keep private, for whatever reason, store in “acd/”, which is protected by encfs. Also, DO NOT LOSE your encfs xml file, which contains the key to encrypt/decrypt your data. If you lose it, your data is as good as useless.

Deon's Playground

Placing whatever interests me and more