Commercial cloud storage services such as Amazon S3 and Google Cloud Storage offer highly available, scalable, virtually unlimited object storage at affordable prices. To accelerate adoption of their offerings, these providers cultivate healthy developer ecosystems around their products through clear APIs and SDKs. Cloud-backed file systems are one typical product of these active developer communities, and several open-source implementations already exist.
S3QL is one of the most popular open-source cloud-backed file systems. It is a FUSE-based file system that supports several commercial and open-source cloud storage back-ends, such as Amazon S3, Google Cloud Storage, Rackspace CloudFiles, and OpenStack. As a fully functional file system, S3QL offers many powerful features: support for files up to 2 TB, compression, UNIX attributes, encryption, copy-on-write snapshots, immutable trees, deduplication, and support for soft and hard links. Any data written to an S3QL file system is first compressed and encrypted locally before being transferred to the cloud back-end. When you attempt to read content from an S3QL file system, the corresponding objects are downloaded from the cloud (if they are not in the local cache) and decrypted and decompressed on the fly.
To be clear, S3QL does have its limitations. For example, you cannot mount the same S3QL file system on several different computers at the same time; only one computer can access it at a time. In addition, ACLs (Access Control Lists) are not supported.
In this tutorial, I will describe how to configure an encrypted S3QL file system on top of Amazon S3. As an example use case, I will also demonstrate how to run the rsync backup tool on top of a mounted S3QL file system.
Preparation
To follow this tutorial, you first need to create an Amazon AWS account (sign-up is free, but a valid credit card is required).
Then create an AWS access key (an access key ID and a secret access key); S3QL uses this information to access your AWS account.
After that, go to AWS S3 via the AWS Management Console and create a new, empty bucket for S3QL.
For best performance, choose a region that is geographically closest to you.
Install S3QL on Linux
S3QL is available as a pre-built package on most Linux distributions.
For Debian, Ubuntu or Linux Mint:
$ sudo apt-get install s3ql
For Fedora or CentOS:
$ sudo yum install s3ql
For Arch Linux, use the AUR.
Configure S3QL for the First Time
Create an authinfo2 file in the ~/.s3ql directory; this is S3QL's default configuration file. The file must include your AWS access key information, the S3 bucket name, and an encryption passphrase. The passphrase is used to encrypt a randomly generated master key, and the master key is what actually encrypts the S3QL file system data.
$ mkdir ~/.s3ql
$ vi ~/.s3ql/authinfo2
storage-url: s3://[bucket-name]
The specified AWS S3 bucket must have been created beforehand via the AWS Management Console.
For security reasons, make the authinfo2 file accessible only to you.
$ chmod 600 ~/.s3ql/authinfo2
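Putting the pieces together, a complete ~/.s3ql/authinfo2 could look like the following sketch. All bracketed values are placeholders; replace them with your own bucket name, AWS access key ID, secret access key, and encryption passphrase (the fs-passphrase line can also be added later, after the file system has been created):

```
[s3]
storage-url: s3://[bucket-name]
backend-login: [access-key-id]
backend-password: [secret-access-key]
fs-passphrase: [encryption-passphrase]
```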
Create an S3QL File System
Now you are ready to create an S3QL file system on AWS S3.
Use the mkfs.s3ql tool to create a new S3QL file system. The bucket name in this command should match the one specified in the authinfo2 file. The "--ssl" parameter forces the use of SSL when connecting to the back-end storage server. By default, mkfs.s3ql enables compression and encryption on the S3QL file system.
$ mkfs.s3ql s3://[bucket-name] --ssl
You will be asked to enter an encryption passphrase. Remember it, or store it in ~/.s3ql/authinfo2 via the "fs-passphrase" option.
If the file system is created successfully, you will see a confirmation message.
Mount an S3QL File System
Once an S3QL file system is created, the next step is to mount it.
First, create a local mount point, then use the mount.s3ql command to mount the S3QL file system.
$ mkdir ~/mnt_s3ql
$ mount.s3ql s3://[bucket-name] ~/mnt_s3ql
Mounting an S3QL file system does not require a privileged user; just make sure you have write permission to the mount point.
Optionally, you can use the "--compress" parameter to specify a compression algorithm (such as lzma, bzip2 or zlib). If you do not specify one, lzma is used by default. Note that a custom compression algorithm applies only to newly created data objects; it does not affect existing data objects.
$ mount.s3ql --compress bzip2 s3://[bucket-name] ~/mnt_s3ql
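Since compression happens locally before upload, you can estimate the savings for your own data without mounting anything. The sketch below uses gzip, whose DEFLATE algorithm is what the zlib option is based on; the sample file is a temporary file created only for this demonstration:

```shell
# Create a temporary file with repetitive (highly compressible) sample data.
sample=$(mktemp)
yes "the quick brown fox jumps over the lazy dog" | head -n 10000 > "$sample"

# Compare the raw size with the gzip-compressed size.
orig_size=$(wc -c < "$sample")
gz_size=$(gzip -c "$sample" | wc -c)
echo "original: $orig_size bytes, compressed: $gz_size bytes"

rm -f "$sample"
```

Highly repetitive data like this shrinks dramatically; already-compressed content (JPEG images, video) will not shrink much and mostly costs CPU time.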
For performance reasons, an S3QL file system maintains a local file cache, which contains recently accessed (parts of) files. You can customize the file cache size with the "--cachesize" and "--max-cache-entries" options.
If you want users other than yourself to access the mounted S3QL file system, use the "--allow-other" option.
If you want to export the mounted S3QL file system to other machines over NFS, use the "--nfs" option.
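When exporting over NFS, the mount point also needs an entry in /etc/exports on the server. Because FUSE file systems do not have a stable device number, an explicit fsid must be assigned. The path and network below are illustrative placeholders, not values from this tutorial:

```
/home/user/mnt_s3ql 192.168.1.0/24(rw,fsid=1,no_subtree_check)
```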
After running mount.s3ql, check that the S3QL file system was successfully mounted:
$ df ~/mnt_s3ql
$ mount | grep s3ql
Unmount an S3QL File System
To safely unmount an S3QL file system (which may contain uncommitted data), use the umount.s3ql command. It waits until all data (including the local file system cache) has been successfully transferred to the back-end server. Depending on how much data is waiting to be written, this process can take some time.
$ umount.s3ql ~/mnt_s3ql
View S3QL File System Statistics and Repair an S3QL File System
To view S3QL file system statistics, use the s3qlstat command. It shows information such as the total data size, metadata size, deduplication rate and compression rate.
$ s3qlstat ~/mnt_s3ql
You can check and repair an S3QL file system with the fsck.s3ql command. Similar to the fsck command, the file system to be checked must be unmounted first.
$ fsck.s3ql s3://[bucket-name]
S3QL Use Case: Rsync Backup
Let me conclude this tutorial with one popular use case: local file system backup. For this, I recommend the rsync incremental backup tool, especially because S3QL provides a wrapper script for rsync (/usr/lib/s3ql/pcp.py). The script lets you recursively copy a directory tree to an S3QL destination using multiple rsync processes.
$ /usr/lib/s3ql/pcp.py -h
The following command backs up everything in ~/Documents to an S3QL file system over four concurrent rsync connections.
$ /usr/lib/s3ql/pcp.py -a --quiet --processes=4 ~/Documents ~/mnt_s3ql
The files will first be copied to the local file cache, and then gradually synchronized to the back-end server in the background.
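The idea behind pcp.py, splitting one copy job across several worker processes, can be sketched with plain shell tools. This demo uses cp and xargs -P on a throwaway directory tree instead of rsync and S3QL, so it illustrates only the parallel fan-out, not incremental transfer:

```shell
# Build a small throwaway source tree for the demonstration.
SRC=$(mktemp -d)
DEST=$(mktemp -d)
mkdir -p "$SRC/a" "$SRC/b" "$SRC/c"
echo one   > "$SRC/a/file1"
echo two   > "$SRC/b/file2"
echo three > "$SRC/c/file3"

# Copy each top-level entry with up to 4 concurrent cp processes,
# analogous to pcp.py fanning a tree out over multiple rsync workers.
find "$SRC" -mindepth 1 -maxdepth 1 -print0 \
    | xargs -0 -P 4 -I{} cp -r {} "$DEST/"
```

Each top-level directory becomes an independent copy job, which is exactly how parallelism helps with a high-latency back-end like S3.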