GlusterFS


There is a lot of hype about “software defined storage/servers/datacenters” these days from a variety of vendors. A plethora of established and upstart storage vendors have recently started selling their own visions of a solution, and looking at all the different options has really given me a headache.

So what exactly is software defined storage?

Well, I think it depends on who you ask. What it means to me, though, is a solution that grows horizontally via independent storage devices that act as one. There are lots of interesting players out there right now that provide access to the storage via FC, NFS, iSCSI, and special clients. One such solution, available as both an open source and a commercial software-only option, is GlusterFS, also known as Red Hat Storage Server in the commercial space.

GlusterFS is of particular interest to me for a variety of reasons. This isn’t to say that GlusterFS is inherently better than other options out there, and I hope to share some of my experience reviewing other storage products in the same category. First of all, GlusterFS is a software-only solution. Red Hat, which bought Gluster Inc. several years ago, continues to offer an open source version as well as its own implementation, Red Hat Storage Server. What’s the difference? Using the open source version, you miss out on Red Hat’s implementation of the XFS filesystem for >16TB volume support (on a per storage node basis), an easy to use management GUI, and paid support.

GlusterFS is a file based NAS solution. Servers running the GlusterFS server software provide “bricks” that make up the storage capacity. Data is written from a server running the GlusterFS client and distributed among the bricks based on how the volume was configured. GlusterFS servers don’t talk directly with each other to store data. Instead, the client handles all the logic. So as long as your client is up, you don’t have to worry about controller failure in the same way you would with a traditional FC or NAS array. All nodes in a Gluster cluster are completely independent controllers. Typically you’d use 1 or more replicas to provide redundancy across controllers and to provide more read copies for faster access. You can also stripe data across nodes, but as of right now it doesn’t appear that you can use both striping and replicas at the same time. When I get the opportunity to build a larger test cluster in the near future, I hope to play with that combination to see what actually works.
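To make the "client handles all the logic" idea concrete, here is a toy sketch of client-side placement. This is not GlusterFS's actual algorithm (the real client uses its own hashing scheme and volume layout); the function names, brick names, and the simple modulo hash are all made up for illustration. It only shows the general shape: the client hashes the file name, picks a group of bricks, and writes every copy itself.

```python
import hashlib


def build_replica_sets(bricks, copies=2):
    """Group bricks into sets of `copies` bricks that mirror each other.

    Hypothetical helper: a stand-in for how a replicated volume pairs bricks.
    """
    if copies < 1 or len(bricks) % copies != 0:
        raise ValueError("brick count must be a multiple of the copy count")
    return [bricks[i:i + copies] for i in range(0, len(bricks), copies)]


def place_file(filename, bricks, copies=2):
    """Return the bricks a client would write `filename` to.

    Assumption: plain modulo hashing on the file name, which is much
    simpler than what GlusterFS really does.
    """
    replica_sets = build_replica_sets(bricks, copies)
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(replica_sets)
    return replica_sets[index]


# Hypothetical four-node layout with one brick per node.
bricks = ["node1:/brick", "node2:/brick", "node3:/brick", "node4:/brick"]
print(place_file("video-0001.mp4", bricks))
```

The point to notice is that no server appears in the decision: any client with the same brick list computes the same placement, which is why there is no central controller to fail.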

GlusterFS and Red Hat Storage Server are marketed as cost effective alternatives to the traditional storage market. My own calculations show that you can scale up for about $1/GB using commodity servers. In my calculations I used a popular 3rd party server vendor’s 2U server with 24 1TB 7200RPM 2.5″ SAS drives and dual 10GbE NICs, and accounted for 1 replica (effectively mirroring each file between two nodes). On paper at least, this storage node could handle 1800-2400 IOPS, not counting local RAID overhead. A very real scenario of a 1PB usable filesystem would cost you around $1.2M and would consist of 94 nodes, with a back end capability of upwards of 150k-200k IOPS after RAID.
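For anyone who wants to check or adapt the numbers, here is a rough sketch of the sizing arithmetic. The drive count, drive size, copy count, and total cost come from the paragraph above. The rest are my assumptions: RAID 6 on each node (2 parity drives), 1PB counted as 1024TB, and 75-100 IOPS per 7200RPM drive (which is what 1800-2400 IOPS across 24 drives implies). Those assumptions happen to reproduce the 94 node figure.

```python
import math

DRIVES_PER_NODE = 24
DRIVE_TB = 1
PARITY_DRIVES = 2            # assumption: RAID 6 on each node
COPIES = 2                   # "1 replica" = each file stored on two nodes
TARGET_USABLE_TB = 1024      # assumption: 1PB counted as 1024TB
TOTAL_COST_USD = 1_200_000   # figure from the post
IOPS_PER_DRIVE = (75, 100)   # assumption implied by 1800-2400 IOPS / 24 drives


def nodes_needed(target_tb=TARGET_USABLE_TB, copies=COPIES):
    """Nodes required to hold `copies` full copies of the target capacity."""
    usable_per_node_tb = (DRIVES_PER_NODE - PARITY_DRIVES) * DRIVE_TB
    return math.ceil(target_tb * copies / usable_per_node_tb)


def raw_iops_range(nodes):
    """Aggregate drive IOPS across all nodes, before any RAID write penalty."""
    low, high = IOPS_PER_DRIVE
    return nodes * DRIVES_PER_NODE * low, nodes * DRIVES_PER_NODE * high


nodes = nodes_needed()
cost_per_gb = TOTAL_COST_USD / (TARGET_USABLE_TB * 1024)
print(f"nodes: {nodes}")
print(f"implied cost per node: ${TOTAL_COST_USD / nodes:,.0f}")
print(f"cost per usable GB: ${cost_per_gb:.2f}")
print(f"raw aggregate IOPS: {raw_iops_range(nodes)}")
```

Changing `COPIES` or `PARITY_DRIVES` shows quickly how sensitive the node count is to the redundancy scheme.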

There are several downsides to consider when using GlusterFS or Red Hat Storage.

  1. That $1.2M doesn’t include connectivity to the servers. 10GbE ports are less expensive than 8Gb FC ports though, so this may be a wash if you are building from scratch.
  2. Your storage is only as fast as your client and its network interface.  Since you’re shifting your controller functionality onto the client, you’ll use more bandwidth to get to the Gluster nodes when using 1 or more replicas.
  3. Generally speaking, performance is only going to be seen at scale, not on an individual client. You can improve on this by using faster drives such as 10k RPM SAS or SSD, though the cost goes up with them, or by switching to distributed stripes, in which case you lose data availability in the event of a node failure.
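The bandwidth point in item 2 is easy to put a number on. Here is a back-of-the-envelope sketch, assuming the client sends every copy itself over a single NIC and ignoring protocol overhead. The function name is mine, and real throughput will be lower.

```python
def effective_write_mb_s(nic_gbit=10.0, copies=2):
    """Best-case client write rate when each write goes out once per copy."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    nic_mb_s = nic_gbit * 1000 / 8   # 10GbE is roughly 1250 MB/s on the wire
    return nic_mb_s / copies


print(effective_write_mb_s(10, 1))
print(effective_write_mb_s(10, 2))
```

With 1 replica (two copies) a single 10GbE client tops out at about half its line rate for writes, which is why the client's network interface matters as much as the storage nodes.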

Overall though, I think GlusterFS is still a very valid solution that should be considered for certain applications:

  1. Distributed storage in a cloud computing environment.
  2. HPC clusters where you can stripe working data across nodes without long term retention requirements.
  3. Streaming media repository.
  4. Storage of nearline data for backups.

http://www.gluster.org/

http://www.redhat.com/products/storage-server/