Building a highly available load balancing solution with HAProxy

When you start scaling your environment you will most likely need a load balancer. But then the load balancer itself becomes a single point of failure, which is exactly the kind of thing you always want to avoid.

How do we get around that? Simply by scaling the load balancing layer as well. Most of the time in a production environment you will see load balancers deployed in pairs for redundancy. This is possible with HAProxy too, using a piece of software called keepalived.

Keepalived is not specific to HAProxy but it does the job for us, since it makes it possible to share an IP address between our two load balancers. It does this using VRRP: the node with the highest priority in its keepalived configuration takes ownership of the IP address, so you end up with an active/passive architecture.

If you took the time to read the article by Luca Dell’Oca that I linked in the previous HAProxy post, you already know how to build this.

First install keepalived and edit the config file:

yum install keepalived
vi /etc/keepalived/keepalived.conf

This is my config file, which you’ll notice is pretty much the same as Luca’s:

global_defs {
   notification_email {
     failover@myvirtualife.net
     sysadmin@myvirtualife.net
   }
   notification_email_from loadbalancer@myvirtualife.net
   smtp_server 192.168.100.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}

vrrp_script chk_haproxy {
   script "killall -0 haproxy"
   interval 1                     # check every second
   weight 2                       # add 2 points of prio if OK
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 12345678
    }

    virtual_ipaddress {
        172.16.110.5
    }

    track_script {
        chk_haproxy
    }
}

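After configuring keepalived, let’s make a few more changes (the first line adds a kernel setting to /etc/sysctl.conf) and then check that the shared IP is active:

echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.conf     # let HAProxy bind to the shared IP even when this node does not hold it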
sysctl -p
service keepalived start
chkconfig keepalived on
ip addr sh eth0

In the output of the last command:

172.16.110.2 is the IP address of this load balancer.
172.16.110.5 is the shared IP address managed by keepalived.

Now you have to set up another HAProxy VM and configure it in the same way; just remember that in its keepalived config file ‘priority’ must be set to ‘100’.
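
As a rough sketch of that one change, assuming you copy the first node's keepalived.conf over to the second VM, a small helper can rewrite the priority line for you. The name make_backup_conf is mine and purely illustrative, it is not part of keepalived:

```shell
# make_backup_conf: hypothetical helper, not part of keepalived.
# Reads a keepalived.conf on stdin and prints a copy in which the
# VRRP priority is lowered from 101 to 100 for the second node.
make_backup_conf() {
    sed 's/^\([[:space:]]*priority[[:space:]][[:space:]]*\)101[[:space:]]*$/\1100/'
}
```

For example: make_backup_conf < keepalived.conf.node1 > /etc/keepalived/keepalived.conf (the input file name is just an example). Everything else in the file is left untouched.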

To test that it works, just hard power down the VM that holds the shared IP and check that communication still works.
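
To see which node currently holds the shared IP during such a test, something like the following can help. It is a minimal sketch: holds_vip is a hypothetical helper name, and it simply looks for the address in the output of the ip command used earlier:

```shell
# holds_vip: hypothetical helper. Reads "ip addr show" output on stdin
# and succeeds if the given IPv4 address is configured on the interface.
holds_vip() {
    # -F matches the address literally, so dots are not regex wildcards
    grep -qF "inet $1/"
}
```

Run it on each load balancer as: ip addr sh eth0 | holds_vip 172.16.110.5 && echo "this node is active". After powering down the active VM, the other node should start reporting itself as active.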

You obviously also have to install and configure HAProxy on both VMs, and remember to keep the two configurations aligned if you make any changes.

Most of the time I disable iptables, but Luca does a better job than me and shows you how to configure iptables to get along happily with both keepalived and HAProxy, so if you intend to leave iptables on go check his post too.


Balancing multiple Horizon Workspace gateway-va with HAProxy

When working with Horizon Workspace, the first component you will scale to multiple instances is probably the gateway-va: it is the access point for all users, so you want to make sure it’s always available for connections.

In this case you need a load balancer to distribute users across all the gateway-va instances in your environment; I wrote about commercial and open source load balancers, and about how to build one with HAProxy, in this post.

I’m going to show you how I configure it with Horizon Workspace. Remember that I learned about HAProxy only relatively recently, from Luca Dell’Oca, so my configuration is just the way I do it and not necessarily the best; use the comments if you want to contribute.

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------

global
log 127.0.0.1 local2 info
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
option accept-invalid-http-request
retries 3
timeout http-request 60s
timeout queue 30m
timeout connect 1800s
timeout client 30m
timeout server 30m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
listen stats :9000
stats realm Haproxy\ Statistics
stats uri /stats

#---------------------------------------------------------------------
# Redirect to secured
#---------------------------------------------------------------------
frontend unsecured
bind :80
redirect scheme https if !{ ssl_fc }

#---------------------------------------------------------------------
# frontend secured
#---------------------------------------------------------------------
frontend front
bind :443 ssl crt /etc/haproxy/reverseproxy.pem
mode http

acl workspace hdr_beg(host) -i workspace.myvirtualife.net
use_backend workspace if workspace

#---------------------------------------------------------------------
# balancing between the various backends
#---------------------------------------------------------------------
backend workspace
mode http
server workspace1 192.168.110.10:443 weight 1 check port 443 inter 2000 rise 2 fall 5 ssl
server workspace2 192.168.110.11:443 weight 1 check port 443 inter 2000 rise 2 fall 5 ssl

Try adding a gateway-va and experiment with HAProxy as its load balancer. You can use this article if you want to know how to do it.

There are a few more things worth noting:

  • timeouts are really long here, otherwise users would experience disconnects, because this is the kind of web app you keep open for a long time;
  • on port 9000 of the HAProxy host you will find statistics, for example “lb.yourcompany.yourdomain:9000/stats”, which give you numbers about the state of connections and backends, problems, etc.;
  • “log 127.0.0.1 local2 info” is necessary if you want logging enabled, which is so important when troubleshooting problems; the HAProxy documentation has a lot on how to read the logs.
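
The statistics page can also be fetched as CSV by appending ";csv" to the stats URI, which is handy for a quick check from the shell. The following is a minimal sketch: backend_status is a hypothetical helper name, and it assumes the standard CSV layout where the proxy name, server name and status are fields 1, 2 and 18:

```shell
# backend_status: hypothetical helper. Reads HAProxy stats CSV on stdin
# and prints "proxy/server status" for each real server row, skipping
# the "#" header line and the FRONTEND/BACKEND summary rows.
backend_status() {
    awk -F, '!/^#/ && $2 != "FRONTEND" && $2 != "BACKEND" && NF > 17 {
        print $1 "/" $2, $18
    }'
}
```

With the configuration above you could run: curl -s "http://lb.yourcompany.yourdomain:9000/stats;csv" | backend_status and expect one line per gateway-va defined in the workspace backend.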

If you intend to use an SSL certificate like in my configuration, know that it has to be a single file chaining the certificate and the private key, like this:

-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----
-----BEGIN RSA PRIVATE KEY-----
-----END RSA PRIVATE KEY-----
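
If your certificate and key live in two separate files, they just need to be concatenated in that order. A minimal sketch follows; build_pem is a hypothetical helper name and the input file names in the usage line are only examples:

```shell
# build_pem: hypothetical helper. Concatenates a certificate ($1) and
# its private key ($2), in that order, into one output file ($3).
build_pem() {
    # the subshell keeps the restrictive umask local, so a newly
    # created output file is readable by its owner only
    ( umask 077; cat "$1" "$2" > "$3" )
}
```

For example: build_pem workspace.crt workspace.key /etc/haproxy/reverseproxy.pem, which produces the file referenced by the "bind :443 ssl crt" line in my configuration.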

To make logging work and write to a separate file instead of putting everything in “/var/log/messages”, edit your “/etc/rsyslog.conf” file and make sure these lines are present:

# Provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514

# HAProxy
local2.* /var/log/haproxy.log
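
Before restarting rsyslog it can be worth checking that all three settings are really in place. This is only a sketch, and rsyslog_ready is a hypothetical helper name of mine:

```shell
# rsyslog_ready: hypothetical helper. Succeeds if the given rsyslog
# config file loads the UDP input module, listens on UDP port 514 and
# has a rule for the local2 facility (commented-out lines do not count).
rsyslog_ready() {
    grep -Eq '^[[:space:]]*\$ModLoad[[:space:]]+imudp' "$1" &&
    grep -Eq '^[[:space:]]*\$UDPServerRun[[:space:]]+514' "$1" &&
    grep -Eq '^[[:space:]]*local2\.\*[[:space:]]+' "$1"
}
```

Usage: rsyslog_ready /etc/rsyslog.conf && service rsyslog restart. Afterwards, logger -p local2.info "test" should make a line show up in /var/log/haproxy.log.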

How to build a load balancer with HAProxy

If you’ve been reading my previous articles you must have noticed that in Horizon Workspace there is often the hidden assumption that you need and/or you already have in place a load balancer.

Load balancers are usually sold as hardware appliances that are put in front of your workloads to distribute load across multiple backend machines delivering the same services. The reason you want to do that is to keep your service performant and available as it grows.

Horizon Workspace is no different, and since it’s pretty easy to have multiple gateway-va for redundancy and scalability, you are going to need a load balancer.

I don’t want to get much into details about how many vendors are out there and what is good and bad about them, nor what I see in production environments; what I am going to say is that:

  • load balancers can be an expensive combination of hardware and software;
  • nowadays they do a whole bunch of things besides just load balancing connections, like SSL offloading, caching, content inspection, etc.;
  • since virtualization has become so mainstream we now have load balancer solutions done entirely in software, delivered as virtual appliances.

Some time ago I just happened to bump into a nice blog post by Luca Dell’Oca about a piece of software called HAProxy.

HAProxy is an open source piece of software that does HTTP/TCP load balancing with a lot of nice features, including for example SSL offloading; HAProxy also seems to be used in production in very large environments with no problems at all. Check their website for reference.

At the time I was looking for a way to load balance a VMware View environment, and after reading Luca’s post about how to do it with HAProxy I became a real fanboy. If a customer has no load balancing solution, or needs to load balance only a small subset of services, I always go with HAProxy now because I found it to be very reliable and it delivers great performance while consuming very few resources. What more can you ask for?

The documentation is pretty broad and precise, which is always good when it comes to learning your way through things.

Enough with evangelizing HAProxy, I will just get down to business and show you how I build my load balancers.

First let’s lay out some goals and assumptions:

  • I like to use CentOS to do this but it’s not mandatory
  • I’m a big fan of RPMs but I prefer to build HAProxy from source code
  • in this post I will provide a basic installation just to get started
  • in future posts I will publish the specific configs I use for Horizon Workspace and show how to deploy more than one HAProxy virtual appliance for redundancy
  • by no means is this the best way to do it, it’s just what I do
  • by no means am I discouraging you from buying commercial load balancers; always remember you are the only support for solutions you build!

I download a CentOS minimal install ISO; it’s good for this task and it’s a small download. Pick x86 or x64, whichever you like. Just install it as you normally would, connect it to the internet and install VMware Tools as well.

For this tutorial I used the latest CentOS which at the time of writing is 6.4.

After getting a ‘root’ prompt this is what I do:

yum install wget openssl-devel pcre-devel make gcc -y     # this installs prerequisites
wget http://haproxy.1wt.eu/download/1.5/src/devel/haproxy-1.5-dev19.tar.gz     # download the package
tar xzvf haproxy-1.5-dev19.tar.gz     # extracting
cd haproxy-1.5-dev19     # enter the extracted directory
make TARGET=linux2628 CPU=i686 USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1     # i compile it with compression and ssl support; use CPU=x86_64 for CentOS x64
make install     # install
cp /usr/local/sbin/haproxy* /usr/sbin/     # copy binaries to /usr/sbin
cp /root/haproxy-1.5-dev19/examples/haproxy.init /etc/init.d/haproxy     # copy init script in /etc/init.d
chmod 755 /etc/init.d/haproxy     # setting permission on init script
mkdir /etc/haproxy     # creating directory where the config file must reside
cp /root/haproxy-1.5-dev19/examples/examples.cfg /etc/haproxy/haproxy.cfg     # copy example config file
mkdir /var/lib/haproxy     # create directory for stats file
touch /var/lib/haproxy/stats     # creating stats file
useradd haproxy     # i like to make haproxy run with a specific user
service haproxy check     # checking configuration file is valid
service haproxy start     # starting haproxy to verify it is working
chkconfig haproxy on     # setting haproxy to start with VM

The main reason why I like to build HAProxy myself is that when I was learning about it I had trouble getting SSL offloading to work, even though I was sure I was configuring it right. It turns out most RPMs out there are built without SSL support, so I started building it myself. This way I can always use the latest version, and even if the current latest is a development version I can tell you it’s pretty stable.
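
A quick way to tell whether a given binary has SSL support is to look at what it reports about its own build. The following is a sketch under an assumption: that "haproxy -vv" prints a "Built with OpenSSL" line when SSL is compiled in, as the 1.5 builds I have seen do. The helper name has_openssl is hypothetical:

```shell
# has_openssl: hypothetical helper. Reads "haproxy -vv" output on stdin
# and succeeds only if it contains a "Built with OpenSSL" line
# (assumed wording, check the output of your own build).
has_openssl() {
    grep -q "Built with OpenSSL"
}
```

Usage: haproxy -vv | has_openssl && echo "SSL support compiled in". If nothing is printed for an RPM-installed binary, that would explain SSL offloading not working.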

Don’t forget to disable all unneeded services/daemons; most of them are not needed to run a load balancer.

If you intend to leave the firewall on, go check Luca’s post which will give you a good insight about how to configure iptables to work with HAProxy.

Don’t bother disabling SELinux, it seems to get along with HAProxy pretty well.

Have fun with your new shiny (and free) load balancer.

Understanding Horizon Workspace components and installation prerequisites

In the last post I described in detail how to prepare a vPostgres DB to host the Horizon Workspace external database.

During the installation process, as we will see, you can choose to use an internal database or an external one, but keep in mind that the internal database is meant only for testing purposes. If you are installing Horizon Workspace in a production environment you must have a VM with vPostgres installed, as this is the only supported configuration, so you can understand why the first post was needed.

So now we are ready to install Horizon Workspace… well, not quite yet. It is very important to understand that installing this product involves a number of preparation steps that need to be taken before actually getting our hands dirty and having fun. Some of those steps are technical prerequisites to fulfil, and some are just decisions to make, keeping in mind that some settings chosen during the installation cannot be changed afterwards without redeploying the entire solution. This is something you definitely don’t want to find out after you’ve performed all the installation and configuration tasks and then have to start over again.

In this post we are going through all the prerequisites so with that out of the way we will be able to easily proceed with the deployment phase, but first let’s talk about the Horizon Workspace virtual appliances and their respective functions. The following is taken from the official documentation.

  • VMware Horizon Workspace Configurator Virtual Appliance (configurator-va): You start configuring Horizon Workspace with this virtual appliance, using both the Configurator virtual appliance interface and the Configurator Web interface. The configurations you make with the Configurator are distributed to the other virtual appliances in the vApp. Note: The configurator-va is the only component that cannot scale to multiple instances.
  • VMware Horizon Workspace Manager Virtual Appliance (service-va): Horizon Workspace Manager handles ThinApp package synchronization and gives you access to the Administrator Web interface, from which you can manage users, groups, and resources.
  • VMware Horizon Workspace Connector Virtual Appliance (connector-va): Horizon Workspace Connector provides the following services: user authentication (identity provider), directory synchronization, ThinApp-catalog loading, and View pool synchronization.
  • VMware Horizon Workspace Data Virtual Appliance (data-va): Horizon Workspace Data Virtual Appliance controls the file storage and sharing service, stores users’ data (files), and synchronizes users’ data across multiple devices.
  • VMware Horizon Workspace Gateway Virtual Appliance (gateway-va): Horizon Workspace Gateway Virtual Appliance is the single endpoint for all end-user communication. User requests come to the gateway-va virtual machine, which then routes the request to the appropriate virtual appliance.

System and Network Configuration Requirements
The preparation part is the longest and most important when deploying a distributed service such as Horizon Workspace; for this reason VMware prepared a detailed checklist to fill in before starting the installation process. The following is a list of all the things you will have to decide and write down:

  • Horizon Workspace Fully Qualified Domain Name (FQDN)
  • Network Information for Configurator (configurator-va)
  • Network Information for Manager (service-va)
  • Network Information for Connector (connector-va)
  • Network Information for Data (data-va)
  • Network Information for Gateway (gateway-va)
  • Network Information for IP Pools
  • Active Directory Domain Controller
  • SMTP Server
  • vCenter Credentials
  • SSL Certificate (Optional)
  • Horizon Workspace License Key
  • Microsoft Windows Preview
  • External Database

Before getting into details let’s take a high level look at the architecture of Horizon Workspace as it’s meant to be in a production environment:

Image

This picture (which is taken straight from the public documentation of the product) shows that every connection from users accessing the Horizon Workspace portal has to go through the Horizon gateway VM(s). The “(s)” shows that you can have one or multiple Horizon gateways, in which case you will also need some sort of load balancing mechanism in front of the gateways. The Horizon gateway virtual appliance runs nginx as a web server that basically proxies every connection to the desired service, so users actually need connectivity only to the gateway virtual appliances.

IMPORTANT: Placing the gateway VA in a separate network such as a DMZ network is not a supported configuration.

The following picture gives a better understanding of the network configuration requirements:

Image

As you can see, all communication goes into the gateway VA and out to the other virtual appliances, which are the ones actually providing the services. Users connect exclusively over HTTPS, and the same is true for most of the communication between virtual appliances, so we will need to work a bit on SSL certificates at some point. It’s not mandatory in the setup phase though, as you can see from the above list where it is marked as optional.

Horizon Workspace FQDN
Choosing the FQDN is a tricky one because once you input it during the setup you can’t go back and change it, so it definitely deserves some thinking or you might find yourself redeploying from scratch. Most companies choose the same FQDN for both internal and external connections, which makes it perfectly transparent for users to reach Workspace no matter where they are located; obviously the FQDN will resolve to a public IP for external users and to a private IP for internal users, hence the need for two sets of load balancers as you can see in the first picture.

Network configuration for virtual appliances
Just write down TCP/IP configurations that you intend to assign to the five virtual appliances, including DNS configuration. I encourage you to use consecutive addresses for simplicity.

IP Pools
Honestly this is a little obscure to me. IP Pools are sets of IP addresses that you define and assign to a network in vCenter so that they can be used when you deploy a vApp. The funny thing is that those addresses must not be the ones you will use for setting up the virtual appliances. Even funnier, if you deploy the vApp from the Web Client you don’t even have to create an IP Pool. I have no problem admitting my ignorance here about the usefulness and meaning of this step.

Active Directory Domain Controller
Self-explanatory. Since Horizon Workspace integrates with your Active Directory you will need to have the IP address, basic parameters and credentials handy during the setup. Just keep in mind that your users in AD need to have first name, last name and email address filled in before you import them into Horizon Workspace, or the import will fail.

SMTP Server
This is used when users share documents. Note that you must specify a working SMTP server, since a check is performed during the setup and you won’t be able to proceed otherwise.

vCenter credentials
If you are deploying Horizon Workspace I’m pretty sure you have these. 🙂

SSL Certificate (optional)
I like to deal with this after the initial deployment and this is another tricky one, so during the setup we will use default self-signed certificates for simplicity.

Horizon Workspace Product Key
Yes, you need one. 🙂
For a proof-of-concept you can request a trial key that will work for 100 users.

Microsoft Windows Preview
When using Microsoft documents in the Horizon Workspace web portal you can get a preview without having Microsoft Office installed. The previews can be generated with a LibreOffice add-on that runs directly on the data-va, or on a Microsoft server with Microsoft Office installed; the former is a free option and it’s usually good enough, the latter will grant you a higher level of compatibility but you will have to pay for Microsoft licenses.

External Database
If you read my last post you should know about this already.

Now that you have everything handy you are ready to install Horizon Workspace.
