Swift and Object Storage
Swift is OpenStack’s Object Storage project.
At a very high level, I like to present Object Storage as a filesystem accessible through a set of APIs, often directly by HTTP.
Object Storage backends are usually built from the ground up to be resilient, fault tolerant and highly available, and to provide mechanisms that ensure data redundancy and security.
Object Storage is the secret sauce that hides behind interfaces such as Dropbox, Google Drive or Microsoft OneDrive.
Ceph is another open source project with Object Storage at its core. Ceph natively provides a way to create, mount and format block devices out of the box; Swift, however, does not.
This is great and I’m a fan of Ceph myself, but what if there was a way to mount a Swift object store as a filesystem?
Let’s take a closer look at S3QL which allows you to do just that.
If you don’t have an object storage service, I’ve written a basic guide on finding the right provider.
What’s S3QL?
Straight from S3QL’s repository:
S3QL is a file system that stores all its data online using storage services like Google Storage, Amazon S3, or OpenStack. S3QL effectively provides a hard disk of dynamic, infinite capacity that can be accessed from any computer with internet access running Linux, FreeBSD or OS-X.
S3QL is a standard conforming, full featured UNIX file system that is conceptually indistinguishable from any local file system. Furthermore, S3QL has additional features like compression, encryption, data de-duplication, immutable trees and snapshotting which make it especially suitable for online backup and archival.
Sounds pretty neat, right? Let’s see if we can get it installed and test it out.
Installing S3QL
Thankfully, S3QL is packaged for most distributions, because installing it by hand is a bit more involved.
I’ll be testing this under Ubuntu 14.04 (Trusty) so it’s really as simple as adding the PPA and installing S3QL:
add-apt-repository ppa:nikratio/s3ql
apt-get update && apt-get -y install s3ql
Setting it up
Authentication
First, create the file in which you’ll store authentication details:
mkdir ~/.s3ql && install -b -m 600 /dev/null ~/.s3ql/authinfo2
Open up ~/.s3ql/authinfo2 with your favorite editor and fill in your Swift authentication information in the following format:
[swift]
backend-login: tenant:username
backend-password: password
storage-url: swiftks://keystone.example.org/<region>:<container>
I did not need to specify that Keystone is served over HTTPS (port 443), nor the /v2.0 or /v2.0/tokens portion of the authentication URL.
The <region> parameter depends on your provider and where your data is located.
The <container> is the bucket in which the filesystem will reside. It needs to be created first; s3ql won’t do it for you.
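If you would rather script this step than edit the file by hand, here is a minimal sketch. The write_authinfo helper is a name I made up for illustration, the credentials are the same placeholders as above, and the demo writes to a scratch directory instead of touching your real ~/.s3ql:

```shell
#!/bin/sh
# write_authinfo DIR: hypothetical helper that writes DIR/authinfo2
# with owner-only permissions. All values are placeholders.
write_authinfo() {
    dir="$1"
    mkdir -p "$dir"
    # The subshell keeps the tightened umask from leaking into the caller.
    (
        umask 077
        cat > "$dir/authinfo2" <<'EOF'
[swift]
backend-login: tenant:username
backend-password: password
storage-url: swiftks://keystone.example.org/<region>:<container>
EOF
    )
    # Enforce 600 even if the file already existed with looser permissions.
    chmod 600 "$dir/authinfo2"
}

# Demo against a scratch directory; pass "$HOME/.s3ql" for a real setup.
scratch="$(mktemp -d)"
write_authinfo "$scratch"
ls -l "$scratch/authinfo2"
```

The quoted 'EOF' stops the shell from expanding anything inside the heredoc, which matters if your password contains a $ or a backtick.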
Creating the Swift container for s3ql
To create an empty container with swiftclient, you can use swift post <container>:
# Create an empty container called "s3ql" if it doesn't exist
$ swift post s3ql
$ swift stat s3ql
Account: AUTH_e8217e83ef32427bb4e4d217f1390ab4
Container: s3ql
Objects: 0
Bytes: 0
Read ACL:
Write ACL:
Sync To:
Sync Key:
Accept-Ranges: bytes
Server: nginx
Connection: keep-alive
X-Timestamp: 1412040138.20837
X-Trans-Id: tx3e99ab36fdf44fa3b197f-00542a07bc
Content-Type: text/html; charset=UTF-8
Initializing the filesystem
To initialize the filesystem, use mkfs.s3ql with your storage URL as the argument:
$ mkfs.s3ql swiftks://keystone.example.org/region:s3ql
Before using S3QL, make sure to read the user's guide, especially
the 'Important Rules to Avoid Loosing Data' section.
Enter encryption password:
Confirm encryption password:
Generating random encryption key...
Creating metadata tables...
Dumping metadata...
..objects..
..blocks..
..inodes..
..inode_blocks..
..symlink_targets..
..names..
..contents..
..ext_attributes..
Compressing and uploading metadata...
Wrote 155 bytes of compressed metadata.
We can tell that s3ql created some objects:
$ swift list s3ql
s3ql_metadata
s3ql_passphrase
s3ql_passphrase_bak1
s3ql_passphrase_bak2
s3ql_passphrase_bak3
s3ql_seq_no_1
$ swift stat s3ql
Account: AUTH_e8217e83ef32427bb4e4d217f1390ab4
Container: s3ql
Objects: 6
Bytes: 791
Read ACL:
Write ACL:
Sync To:
Sync Key:
Accept-Ranges: bytes
Server: nginx
Connection: keep-alive
X-Timestamp: 1412040138.20837
X-Trans-Id: txddafca8cbda244f5b5f1e-00542a0944
Content-Type: text/plain; charset=utf-8
Mounting the filesystem
s3ql provides the mount.s3ql utility, which is pretty straightforward:
$ mkdir /mnt/s3ql
$ mount.s3ql swiftks://keystone.example.org/region:s3ql /mnt/s3ql/
Using 4 upload threads.
Autodetected 4052 file descriptors available for cache entries
Enter file system encryption passphrase:
Using cached metadata.
Setting cache size to 37584 MB
Mounting filesystem...
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 50G 1.1G 46G 3% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
udev 991M 12K 991M 1% /dev
tmpfs 201M 352K 200M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 1001M 0 1001M 0% /run/shm
none 100M 0 100M 0% /run/user
swiftks://keystone.example.org/region:s3ql 1.0T 0 1.0T 0% /mnt/s3ql
$ ls -al /mnt/s3ql/
total 0
drwx------ 1 root root 0 Sep 29 21:32 lost+found
Cool.
Trying it out
$ dd if=/dev/zero of=/mnt/s3ql/bigfile.bin bs=1024k count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 80.1757 s, 13.4 MB/s
So, pretty much as fast as my upload speed will go?
What does it look like on Swift’s side?
$ swift list s3ql --lh
1.7K 2014-09-30 01:47:25 s3ql_data_1
792 2014-09-30 01:48:45 s3ql_data_2
155 2014-09-30 01:32:34 s3ql_metadata
132 2014-09-30 01:32:34 s3ql_passphrase
132 2014-09-30 01:32:34 s3ql_passphrase_bak1
132 2014-09-30 01:32:34 s3ql_passphrase_bak2
132 2014-09-30 01:32:34 s3ql_passphrase_bak3
108 2014-09-30 01:32:35 s3ql_seq_no_1
108 2014-09-30 01:39:42 s3ql_seq_no_2
3.3K
Oh, what’s this? Where’s my 1GB file? Nothing is being uploaded in the background.
Even after unmounting and re-mounting the filesystem, there is still nothing in Swift. But my file is there, so it has to be stored somewhere, right?
$ ls -alh /mnt/s3ql/
total 1.0G
-rw-r--r-- 1 root root 1.0G Sep 29 21:48 bigfile.bin
drwx------ 1 root root 0 Sep 29 21:32 lost+found
Pleasant surprises
That’s right, s3ql provides data compression and de-duplication out of the box. I have a 1GB file but it’s all zeroes so it compresses really well.
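To get a feel for why a gigabyte of zeroes ends up as a couple of tiny objects, here is a rough local illustration. It is only a sketch: gzip and sha256sum stand in for whatever S3QL does internally, and the 10 MiB block size is my assumption, picked to mirror the roughly 10M data objects that show up in the Swift listing further down.

```shell
#!/bin/sh
# Illustration only: gzip and sha256sum are stand-ins, not S3QL internals.
work="$(mktemp -d)"

# 40 MiB of zeroes, cut into 10 MiB blocks (block size is an assumption).
head -c 41943040 /dev/zero > "$work/zeroes.bin"
split -b 10485760 "$work/zeroes.bin" "$work/block_"

# De-duplication: identical blocks hash to the same value,
# so only one copy of them would need to be stored.
total_blocks="$(ls "$work"/block_* | wc -l)"
unique_blocks="$(sha256sum "$work"/block_* | awk '{print $1}' | sort -u | wc -l)"

# Compression: a single block of zeroes shrinks to a tiny fraction of its size.
compressed_bytes="$(gzip -c "$work/block_aa" | wc -c)"

echo "blocks: $total_blocks, unique: $unique_blocks"
echo "one 10 MiB block compressed: $compressed_bytes bytes"
rm -rf "$work"
```

Four blocks go in, one unique block comes out, and that one block compresses down to a few kilobytes. Real data like an ISO image does not behave nearly as nicely.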
Let’s download an ISO file and see what happens:
$ wget http://releases.ubuntu.com/14.04.1/ubuntu-14.04.1-server-amd64.iso -O /mnt/s3ql/ubuntu-14.04.1-server-amd64.iso
Once the download completed, I could see the used space slowly climb until it accounted for most of the ISO:
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 50G 1.7G 46G 4% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
udev 991M 12K 991M 1% /dev
tmpfs 201M 352K 200M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 1001M 0 1001M 0% /run/shm
none 100M 0 100M 0% /run/user
swiftks://keystone.example.org/region:s3ql 1.0T 194M 1.0T 1% /mnt/s3ql
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 50G 1.7G 46G 4% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
udev 991M 12K 991M 1% /dev
tmpfs 201M 352K 200M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 1001M 0 1001M 0% /run/shm
none 100M 0 100M 0% /run/user
swiftks://keystone.example.org/region:s3ql 1.0T 586M 1.0T 1% /mnt/s3ql
And then querying Swift I could finally see some objects:
$ swift stat s3ql
Account: AUTH_e8217e83ef32427bb4e4d217f1390ab4
Container: s3ql
Objects: 72
Bytes: 582715279
Read ACL:
Write ACL:
Sync To:
Sync Key:
Accept-Ranges: bytes
Server: nginx
Connection: keep-alive
X-Timestamp: 1412040138.20837
X-Trans-Id: tx26300dd0f2bf411da158a-00542a10bb
Content-Type: text/plain; charset=utf-8
$ swift list --lh s3ql
1.7K 2014-09-30 01:47:25 s3ql_data_1
10.0M 2014-09-30 02:01:09 s3ql_data_10
9.3M 2014-09-30 02:01:23 s3ql_data_11
10.0M 2014-09-30 02:01:29 s3ql_data_12
[...]
9.9M 2014-09-30 02:01:06 s3ql_data_8
9.3M 2014-09-30 02:01:00 s3ql_data_9
375 2014-09-30 01:54:18 s3ql_metadata
155 2014-09-30 01:54:17 s3ql_metadata_bak_0
155 2014-09-30 01:54:17 s3ql_metadata_bak_0_tmp$oentuhuo23986konteuh1062$
375 2014-09-30 01:54:15 s3ql_metadata_new
375 2014-09-30 01:54:17 s3ql_metadata_tmp$oentuhuo23986konteuh1062$
132 2014-09-30 01:32:34 s3ql_passphrase
132 2014-09-30 01:32:34 s3ql_passphrase_bak1
132 2014-09-30 01:32:34 s3ql_passphrase_bak2
132 2014-09-30 01:32:34 s3ql_passphrase_bak3
108 2014-09-30 01:32:35 s3ql_seq_no_1
108 2014-09-30 01:39:42 s3ql_seq_no_2
108 2014-09-30 01:54:59 s3ql_seq_no_3
555M
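As a quick sanity check, the 555M total from swift list and the Bytes value from swift stat are the same number in different units. A small sketch of the conversion, with the byte count copied from the output above:

```shell
#!/bin/sh
# Byte count copied from the "swift stat s3ql" output above.
container_bytes=582715279
# Integer division by 1024*1024 converts bytes to whole MiB.
container_mib=$((container_bytes / 1048576))
echo "${container_mib} MiB"
```

This prints 555 MiB, which lines up with the total reported by swift list --lh.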
In conclusion
It works, and it works pretty well from what I’ve seen so far.
I’ve recently written a post on how you could encrypt your backups and send them to a Swift object storage with duplicity.
s3ql also encrypts your data with a passphrase, preventing third parties from peeking at your data.
I still think duplicity is great since it really keeps track of your backups and is very efficient in doing so.
I don’t have a great use case for s3ql in mind right now, but I’m sure users will find some as object storage becomes cheaper and cheaper.