Thursday, April 21, 2011

Overview of GoGrid and Rackspace Load Balancing Services

Overview

Load balancing is a technique for spreading workload across several computers. These days one of its most popular applications is load balancing for Web sites, that is, distributing HTTP/HTTPS traffic across several Web servers (such as Apache httpd).

A lot of web deployments are moving to the cloud, and load balancers are moving with them. I'll give an overview of the load balancing services provided by Rackspace and GoGrid.

GoGrid

The Load Balancer service is part of GoGrid's Cloud offering and has been there since version 1.0 of the API, so since about 2008.

GoGrid uses F5 hardware Load Balancers, as stated in the documentation.

Rackspace

Rackspace's Load Balancer service is a separate, standalone service, as opposed to GoGrid's, which is part of the main cloud offering.

The service is relatively young: the private beta was announced in November 2010 and the final release happened in April 2011, just a few days ago at the time of writing. I have been using it since the first private beta, and it was already quite stable by the beginning of 2011.

Rackspace's offering is based on Zeus software.

API

Just as a theatre begins with the cloakroom, a service begins with its API. Let's look at what the API allows us to do with Load Balancers.

GoGrid

As mentioned above, the Load Balancer API is just a subset of the GoGrid API, with all its pros and cons. It supports CRUD (create, read, update, delete) operations on load balancers.

Actually, it's not very close to the REST concept, because:

* It doesn't use any HTTP method but GET. So, e.g., to add a new balancer you make a GET request to a URL like 'loadbalancer/add' instead of a POST to 'loadbalancer', and to delete one you call GET on 'loadbalancer/delete' instead of DELETE, etc.
* It doesn't have a concept of element URIs, only collections. So, to get details on a balancer you request 'loadbalancers/get?id=bal_id' instead of 'loadbalancers/bal_id/'.
* It passes options as GET arguments instead of a serialised object in the request body.
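
To make the difference concrete, here is a minimal sketch that only builds the two kinds of request and never sends them. The base URL is a placeholder and the function names are made up for illustration; the paths follow the examples in the list above and are not taken from either provider's documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder base URL, used only for illustration.
BASE = "https://api.example.com"


def get_only_delete(balancer_id):
    """GET-only style: the action is in the path, options in the query string."""
    query = urlencode({"id": balancer_id})
    return Request(f"{BASE}/loadbalancer/delete?{query}", method="GET")


def rest_delete(balancer_id):
    """REST style: the element has its own URI, the HTTP method names the action."""
    return Request(f"{BASE}/loadbalancers/{balancer_id}", method="DELETE")


# Nothing is sent over the network; we only inspect the request objects.
for req in (get_only_delete("bal_id"), rest_delete("bal_id")):
    print(req.get_method(), req.full_url)
```

With the GET-only style any HTTP client that can fetch a URL is enough, which is the convenience mentioned in this section; the REST style needs a client that can set the method and, for create or update, a request body.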

It's neither good nor bad that it doesn't follow REST closely. REST is not a standard, after all, and not having to bother with serialisation or with HTTP methods other than GET might actually be a benefit for some. I will provide some analysis of API usability from a programmer's point of view later in the post.

So, what is it all about? You create a balancer and specify the IPs and ports of the nodes that will share the load. An important thing to know is that you have to use IP addresses assigned to your GoGrid account; otherwise the request fails with a not very descriptive internal error.

Besides specifying the IP list, you can tweak some load balancer options, such as type and persistence. Currently GoGrid supports two types of load balancing:

* Round Robin (default)
* Least Connect

And for persistence, the options are:

* No persistence (default)
* SSL Sticky
* By source address
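
As a rough illustration of what the balancing types and source-address persistence mean, here is a minimal in-memory sketch. It is not how GoGrid or its F5 hardware implements them; the class and function names are invented for this example.

```python
import itertools
import zlib


class RoundRobin:
    """Hand out nodes in a fixed rotation."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def pick(self):
        return next(self._cycle)


class LeastConnect:
    """Hand out the node that currently has the fewest open connections."""

    def __init__(self, nodes):
        self.active = {node: 0 for node in nodes}

    def pick(self):
        node = min(self.active, key=self.active.get)
        self.active[node] += 1
        return node

    def release(self, node):
        # Call this when a connection to `node` is closed.
        self.active[node] -= 1


def pick_by_source(nodes, client_ip):
    """Source-address persistence: the same client IP always maps to the same node."""
    return nodes[zlib.crc32(client_ip.encode()) % len(nodes)]
```

Round robin ignores how busy a node is, least connect tracks it, and persistence by source address trades even distribution for keeping one client on one node.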

Once a balancer has been created, you can change only the list of IPs and ports it balances load for. One caveat is that you have to pass the complete list of IPs every time; you cannot incrementally add or remove them one by one, so be careful not to run into a race condition.

With GoGrid you can have up to 6 load balancers per account and up to 3 load balancers per data center.

Rackspace

The Rackspace Load Balancer API is, obviously, centered around CRUD operations on load balancer objects as well, though it supports far more than that.

It seems to follow REST quite precisely: it uses collection and element URIs, correct HTTP semantics and content types.

When creating a balancer, you can specify not only IPs that belong to your Rackspace account, but basically any IP you want. I think such flexibility makes a lot of sense if you're running a hybrid setup and, say, have some of the nodes in your own data center and some of them running in the cloud.

Additionally, you can tweak quite a lot of options. Let's look at the most important ones: balancing type and persistence. For balancing type, it provides a wide range of options:

* Round Robin
* Least Connect
* Random
* Weighted Least Connect
* Weighted Round Robin

For the last two options, the node weight (which you can assign to each node) is taken into account. The type has to be specified at creation time, so there is no default value.
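
To show how weights could enter the picture, here is a small sketch of the two weighted types. It is only an illustration with integer weights and made-up function names, not a description of how Rackspace or Zeus actually schedules requests.

```python
import itertools


def weighted_round_robin(weights):
    """Cycle through nodes, repeating each one as many times as its weight."""
    expanded = [node for node, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def weighted_least_connect(active, weights):
    """Pick the node with the fewest open connections relative to its weight."""
    return min(active, key=lambda node: active[node] / weights[node])


picker = weighted_round_robin({"10.0.0.1": 3, "10.0.0.2": 1})
first_eight = [next(picker) for _ in range(8)]
```

In `first_eight` the node with weight 3 shows up three times as often as the node with weight 1, which is the whole point of assigning weights.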

As for session persistence, only HTTP cookie persistence is supported.

Programming Specifics

There is one specific thing about building something on top of a load balancing API. Unlike cloud servers, where you work mostly with atomic objects (like the server itself), a load balancer has a collection of nodes you want to balance between. And here two race conditions are possible.

The first one is caused by the fact that a load balancer needs to be reconfigured after you add or remove a node, and if you try to modify the node list while re-configuration is in progress, you'll get an error. Moreover, GoGrid doesn't have a special state that says the balancer is being re-configured at the moment. Rackspace has such a state, but a race condition is still possible. Imagine the following scenario:

* Balancer B is in 're-configuring' state
* Apps A1 and A2 want to add a new node to B
* Apps A1 and A2 see that B is immutable and start polling its status until it becomes 'Ready'
* B becomes 'Ready'; A1's request gets in first and B goes back to 're-configuring'
* A2's request fails
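
One way to cope with the scenario above is to stop trusting the status check and simply retry the write when it is rejected. The sketch below uses an in-memory stand-in for the balancer; `BalancerBusy`, `FakeBalancer` and `add_node_with_retry` are hypothetical names, not part of any provider's API or of libcloud.

```python
import time


class BalancerBusy(Exception):
    """Hypothetical error raised while the balancer is being re-configured."""


class FakeBalancer:
    """In-memory stand-in that rejects the first `busy_for` modification attempts."""

    def __init__(self, busy_for):
        self.busy_for = busy_for
        self.nodes = []

    def add_node(self, node):
        if self.busy_for > 0:
            self.busy_for -= 1
            raise BalancerBusy()
        self.nodes.append(node)


def add_node_with_retry(balancer, node, attempts=5, delay=0.0):
    """Try to add a node, retrying when the balancer is busy.

    Returns the number of attempts it took.
    """
    for attempt in range(1, attempts + 1):
        try:
            balancer.add_node(node)
            return attempt
        except BalancerBusy:
            time.sleep(delay)  # back off before the next attempt
    raise RuntimeError("balancer stayed busy after %d attempts" % attempts)
```

In this model, A2 losing the race to A1 is just one more rejected attempt rather than a failure the caller has to handle separately.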

This is not a very serious problem, but it makes coding a little more complicated.

The second possible race condition is specific to GoGrid's implementation, because of its model of keeping the IP list as a whole, without support for adding or removing individual IPs. Imagine a slightly modified version of the previous scenario:

* Balancer B is in 're-configuring' state
* App A1 wants to add IP I1 to B, so the IP list would be [B.ips] + [I1]
* App A2 wants to add IP I2 to B, so the IP list would be [B.ips] + [I2]
* Apps A1 and A2 each retry their request until it succeeds
* App A1 gets in first; B.ips becomes [B.old_ips] + [I1]
* App A2 still fails because B turns 're-configuring' again
* App A2 finally succeeds with its request and updates the list of IPs to [B.old_ips] + [I2]
* As a result, I1, which should have been added, is actually missing
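
The lost update in this scenario can be reproduced with a tiny in-memory model. The sketch below shows the unsafe pattern (writing a list computed from a stale snapshot) next to a safer one (re-reading the list and writing it back under a lock). All names are hypothetical, and the lock only helps when every writer lives in the same process; apps on different machines would need some shared lock, which is an assumption outside the scope of this sketch.

```python
import threading


class FakeBalancer:
    """In-memory stand-in for a balancer whose IP list can only be replaced whole."""

    def __init__(self, ips):
        self._ips = list(ips)

    def get_ips(self):
        return list(self._ips)

    def set_ips(self, ips):
        self._ips = list(ips)


def stale_update(balancer, snapshot, new_ip):
    """Unsafe: writes a list built from a snapshot that may be out of date."""
    balancer.set_ips(snapshot + [new_ip])


_lock = threading.Lock()


def safe_add(balancer, new_ip):
    """Safer: read the current list and write it back while holding one lock."""
    with _lock:
        balancer.set_ips(balancer.get_ips() + [new_ip])


# Both apps take their snapshot before either of them writes.
b = FakeBalancer(["I0"])
snap_a1, snap_a2 = b.get_ips(), b.get_ips()
stale_update(b, snap_a1, "I1")
stale_update(b, snap_a2, "I2")
lost = b.get_ips()  # I1 has been overwritten

b2 = FakeBalancer(["I0"])
safe_add(b2, "I1")
safe_add(b2, "I2")
kept = b2.get_ips()  # both additions survive
```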

Again, it's not that the problem is unsolvable, but it takes quite an effort to solve it.

And finally some numbers. I took the Rackspace and GoGrid Load Balancer drivers from libcloud trunk, which implement the same common interface, and gathered these metrics:

* Driver's lines of code (loc)
* Unit-tests' lines of code (test loc)
* Lines of fixtures (i.e. content of responses)



You know that there are lies, damned lies, and statistics, so it's up to you to analyse these numbers. :-)

5 comments:

  1. Hi Roman,
    Great post. I agree that having to manually serialize the requests against the load balancer api is less than ideal. We're working on improving this aspect of the service so that we queue up requests and process them serially.

    Josh Odom
    Product Line Leader, Platform
    Rackspace Hosting

  2. Josh, glad to hear you're working on that! I think it will make life easier for people who're using service actively and issue a lot of requests in short periods of time.
