We recently started using Amazon's Elastic Load Balancing (ELB) service for our webapp (which is hosted in AWS). We're pretty happy with it so far, although we initially tried using Cherokee as our web server, and that didn't seem to work quite right with ELB: every few hours or so, we'd get a blank page back from the load balancer. I suspect it's a variant of a similar issue between ELB and Apache that Amazon fixed back in May 2010. We switched over to Apache for our web server, and that seemed to fix it, so we didn't dig any further.
One other little disappointing thing about ELB is that it can't serve "naked" domains (ie example.com instead of www.example.com), so we still have a single EC2 instance with a fixed IP address to which we point the DNS record for our naked domain name (and which our web server redirects to our "www" subdomain). Shlomo Swidler has a really nice blog post on why this is, and generally how ELB works (and how to test it).
The AWS docs already have a pretty quick, concise set of steps for setting up an ELB for your web servers, so here's a little bit of a higher-level overview of what you have to do:
Only two things are set in stone when you create the balancer: its name, which is the ID you use to reference it and also becomes the first segment of its DNS name; and the region it's in. However, when you create the balancer, you also have to specify at least one port to listen on and at least one availability zone among which to balance (although you can change both any time you want).
A load balancer can balance among all the availability zones of the region in which you create it. You do have to specifically enable each zone, however. If you don't register any of your app's instances from one particular availability zone with the balancer, the balancer will just ignore that zone (ie it won't try to balance to empty zones). You'll want to make sure you have an equivalent number of instances in each zone that you have any instances in, though, since the balancer doesn't adjust its balancing policy to account for the number of instances you have registered in a zone: it will just assume that it can balance to each zone equally.
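For illustration, here's roughly what creating a balancer and enabling a second zone looks like with the AWS CLI's classic-ELB commands (the name my-web-lb, the zone names, and the ports are placeholders I've made up for the example):

```shell
# Create the balancer. The name and region are fixed for its lifetime;
# the listener and availability zone are required at creation but can
# be changed later.
aws elb create-load-balancer \
  --load-balancer-name my-web-lb \
  --listeners "Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=80" \
  --availability-zones us-east-1a

# Later, enable a second zone to balance across (register a matching
# number of instances there, since ELB balances across zones equally).
aws elb enable-availability-zones-for-load-balancer \
  --load-balancer-name my-web-lb \
  --availability-zones us-east-1b
```

The create call prints the balancer's assigned DNS name, which you'll need later when pointing traffic at it.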
If you're serving straight HTTP, you'll want to configure the balancer's listener to use the HTTP protocol (the other option is TCP, for everything else). You can configure the balancer to listen on one port and route to another port, if you want (like to listen on port 80, but route to port 8080 on your app instances).
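A port-mapping listener like that might be added as follows, again using the AWS CLI's classic-ELB commands with a placeholder balancer name:

```shell
# Add a listener that accepts HTTP on port 80 and routes to port 8080
# on the registered instances.
aws elb create-load-balancer-listeners \
  --load-balancer-name my-web-lb \
  --listeners "Protocol=HTTP,LoadBalancerPort=80,InstanceProtocol=HTTP,InstancePort=8080"
```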
Once you've created the balancer, the next step is to configure it with the URL in your app that the balancer should ping for each instance, to check whether the instance is still healthy. You can actually specify a different port for the ping URL, so you could theoretically have the ELB ping a completely different application than the one it's balancing (it just has to be some process listening on your instance). But you probably just want it to be some URL in your app that serves a stripped-down (or even empty) HTML page.
Also note that, as long as you specify the HTTP protocol for the ping URL, it must return an HTTP response status of 200 for the balancer to consider the instance OK. Anything else, including a 30x redirect to some other page, and the balancer will consider the instance unhealthy.
When you configure the ping URL, you also configure the interval between pings (in seconds), and the max number of seconds in which your server must respond before the balancer times out the ping. You must also set a threshold for the number of good pings before the instance is added to the group balanced by the ELB, and the number of bad pings before it is removed; AWS requires these threshold values to be at least 2. A pretty standard setting is an interval of 30 seconds, a timeout of 10 seconds, and healthy and unhealthy thresholds of 2; so if an instance starts going sideways, the ELB will continue to balance to it for at most a minute and a half before it recognizes that it's gone bad.
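Those standard settings would look something like this with the AWS CLI's classic-ELB commands (the balancer name and the /ping path are placeholders; use whatever lightweight URL your app serves):

```shell
# Ping /ping on port 80 every 30 seconds; time out after 10 seconds.
# Two consecutive results flip an instance between healthy and unhealthy.
aws elb configure-health-check \
  --load-balancer-name my-web-lb \
  --health-check "Target=HTTP:80/ping,Interval=30,Timeout=10,HealthyThreshold=2,UnhealthyThreshold=2"
```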
3 Configure Sticky Sessions (or Not)
If you need the balancer to do "sticky" sessions (balance to the same instance over the life of your application's session, for example because you have per-session state that's not shared among instances), you need to configure the ELB to use sticky sessions keyed off your application's session cookie. For example, if you're using a standard Java servlet engine (like Tomcat or Jetty), you'd configure the ELB to use the JSESSIONID cookie to route all requests for the same session to the same instance. Note that with the ELB API, this is a two-step process: the first step is to create a stickiness "policy", and the second step is to apply it (individually to each listener that needs to use the policy).
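The two steps can be sketched with the AWS CLI's classic-ELB commands (the balancer and policy names are placeholders):

```shell
# Step 1: create a stickiness policy keyed off the app's JSESSIONID cookie.
aws elb create-app-cookie-stickiness-policy \
  --load-balancer-name my-web-lb \
  --policy-name jsessionid-sticky \
  --cookie-name JSESSIONID

# Step 2: apply the policy to the listener on port 80.
aws elb set-load-balancer-policies-of-listener \
  --load-balancer-name my-web-lb \
  --load-balancer-port 80 \
  --policy-names jsessionid-sticky
```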
If you don't need sticky sessions, then you don't need to do anything special (ELB will do round-robin by default).
The last step to get going is simply to add your app instances to the load balancer. The balancer will balance only among instances in the availability zones you specified when you created the balancer (unless/until you separately enable/disable those zones). Note that you should add an equivalent number of instances to each zone to which you've added any instances, as the ELB will try to balance equally among all zones in which you have healthy instances.
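Registering instances is a single call; with the AWS CLI's classic-ELB commands it looks roughly like this (the balancer name and instance IDs are placeholders):

```shell
# Register two instances (ideally one per enabled zone, in equal numbers).
aws elb register-instances-with-load-balancer \
  --load-balancer-name my-web-lb \
  --instances i-11111111 i-22222222
```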
The real last step, of course, is to point actual live traffic to the ELB. AWS gives you a unique (and unpredictable) DNS name for the balancer (the DescribeLoadBalancers API response includes it). The DNS name never changes over the life of the balancer, though. So to direct your traffic through the ELB, you create/update CNAME records for all the subdomains you want the balancer to serve (ie www.example.org, etc), pointing each record to the balancer's AWS DNS name. As the DNS update propagates across the tubes, your users will start using the ELB.
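If you've lost track of the balancer's DNS name, you can look it up any time; with the AWS CLI's classic-ELB commands (balancer name again a placeholder):

```shell
# Print just the balancer's DNS name; point your CNAME records at this value.
aws elb describe-load-balancers \
  --load-balancer-names my-web-lb \
  --query 'LoadBalancerDescriptions[0].DNSName' \
  --output text
```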