Balancing multiple Horizon Workspace gateway-va with HAProxy

When working with Horizon Workspace, the first component you will scale to multiple instances is probably the gateway-va, since it is the access point for all users and you want it to always be available for connections.

In this case you need a load balancer to distribute users across all the gateway-va instances in your environment; I wrote about commercial and open source load balancers, and about how to build one with HAProxy, in this post.

I’m going to show you how I configure it with Horizon Workspace, but keep in mind that I learned about HAProxy only relatively recently from Luca Dell’Oca, so my configuration is just the way I do it and not necessarily the best; use the comments if you want to contribute.

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------

global
    log         127.0.0.1 local2 info
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option                  forwardfor except 127.0.0.0/8
    option                  redispatch
    option                  accept-invalid-http-request
    retries                 3
    timeout http-request    60s
    timeout queue           30m
    timeout connect         1800s
    timeout client          30m
    timeout server          30m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

listen stats :9000
    stats realm Haproxy\ Statistics
    stats uri /stats

#---------------------------------------------------------------------
# Redirect to secured
#---------------------------------------------------------------------
frontend unsecured
    bind :80
    redirect scheme https if !{ ssl_fc }

#---------------------------------------------------------------------
# frontend secured
#---------------------------------------------------------------------
frontend front
    bind :443 ssl crt /etc/haproxy/reverseproxy.pem
    mode http

    acl workspace hdr_beg(host) -i workspace.myvirtualife.net
    use_backend workspace if workspace

#---------------------------------------------------------------------
# balancing between the various backends
#---------------------------------------------------------------------
backend workspace
    mode http
    server workspace1 192.168.110.10:443 weight 1 check port 443 inter 2000 rise 2 fall 5 ssl
    server workspace2 192.168.110.11:443 weight 1 check port 443 inter 2000 rise 2 fall 5 ssl
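Whenever you touch the configuration, it is worth validating it before reloading HAProxy. Assuming the config lives in /etc/haproxy/haproxy.cfg as in my setup:

haproxy -c -f /etc/haproxy/haproxy.cfg     # parse the configuration and report errors without starting anything
service haproxy reload     # graceful reload through the init script; existing connections are preserved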

Try adding a gateway-va and experiment with HAProxy as the load balancer. You can use this article if you want to know how to do it.

There are a few more things worth noting:

  • the timeouts are really long here, otherwise users would experience disconnects, because this is the kind of web app that stays open for quite a long time;
  • on port 9000 of the HAProxy host you will find the statistics page, for example at “lb.yourcompany.yourdomain:9000/stats”, which shows the state of connections, the state of the backends, problems, etc. (see the quick check below);
  • “log 127.0.0.1 local2 info” is necessary if you want logging enabled, which is really important when troubleshooting problems; there is a lot on how to read the logs in the HAProxy documentation.
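As a quick check that the stats page is up (the hostname here is only an example), you can fetch it with curl from any machine that can reach the HAProxy host:

curl http://lb.yourcompany.yourdomain:9000/stats     # dumps the HTML stats page; append ";csv" to the URI for a machine-readable version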

If you intend to use an SSL certificate like in my configuration, know that the .pem file has to be a chain of certificate and private key, like this:

-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----
-----BEGIN RSA PRIVATE KEY-----
-----END RSA PRIVATE KEY-----
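For example, a minimal sketch of how I would build that file (the file names are placeholders for your own certificate, any intermediate CA certificates and the private key):

cat workspace.crt intermediate-ca.crt workspace.key > /etc/haproxy/reverseproxy.pem     # certificate(s) first, private key last
chmod 600 /etc/haproxy/reverseproxy.pem     # the file contains the private key, keep it readable by root only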

To make logging work and write to a separate file instead of putting everything in “/var/log/messages”, edit your “/etc/rsyslog.conf” file and make sure these lines are present:

# Provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514

# HAProxy
local2.* /var/log/haproxy.log
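After editing the file, restart rsyslog and tail the new log to verify it is being written (paths as configured above):

service rsyslog restart     # reload the syslog configuration
tail -f /var/log/haproxy.log     # you should see one line per request as soon as users start connecting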

How to build a load balancer with HAProxy

If you’ve been reading my previous articles, you may have noticed that with Horizon Workspace there is often the hidden assumption that you need, and/or already have in place, a load balancer.

Load balancers are usually hardware appliances placed in front of your workloads to distribute load across multiple backend machines delivering the same service. The reason you want to do that is to provide performance and availability for your service as it grows.

Horizon Workspace is no different, and since it’s pretty easy to run multiple gateway-va for redundancy and scalability, you are going to need a load balancer.

I don’t want to get into much detail about how many vendors are out there, what is good and bad about them, or what I see in production environments; what I will say is that:

  • load balancers can be an expensive combination of hardware and software;
  • nowadays they do a whole bunch of things besides just load balancing connections, like SSL offloading, caching, content inspection, etc.;
  • since virtualization has become so mainstream, we now have load balancing solutions entirely in software, delivered as virtual appliances.

Some time ago I just happened to bump into a nice blog post by Luca Dell’Oca about a piece of software called HAProxy.

HAProxy is open source software that does HTTP/TCP load balancing with a lot of nice features, including SSL offloading; HAProxy also seems to be used in production in very large environments without problems. Check their website for references.

At the time I was looking for a way to load balance a VMware View environment, and after reading Luca’s post about how to do it with HAProxy I became a real fanboy. If a customer has no load balancing solution, or needs to load balance only a small subset of services, I always go with HAProxy now, because I found it to be very reliable and it delivers great performance while consuming very little resources. What more can you ask for?

The documentation is pretty broad and precise, which is always good when it comes to learning your way around things.

Enough evangelizing HAProxy; let’s get down to business and I will show you how I build my load balancers.

First let’s clear out some goals and assumptions:

  • I like to use CentOS for this, but it’s not mandatory
  • I’m a big fan of RPMs, but I prefer to build HAProxy from source code
  • in this post I will provide a basic installation just to get started
  • in future posts I will publish the specific configs I use for Horizon Workspace and show how to deploy more than one HAProxy virtual appliance for redundancy
  • by no means is this the best way to do it, it’s just what I do
  • by no means am I discouraging you from buying commercial load balancers; always remember that you are the only support for the solutions you build!

What I do is download a CentOS minimal install ISO; it’s good for this task and it’s a small download. Pick x86 or x64, whatever. Just install it as you normally would, connect it to the internet and install VMware Tools as well.

For this tutorial I used the latest CentOS which at the time of writing is 6.4.

After getting a ‘root’ prompt this is what I do:

yum install wget openssl-devel pcre-devel make gcc -y     # this installs prerequisites
wget http://haproxy.1wt.eu/download/1.5/src/devel/haproxy-1.5-dev19.tar.gz     # download the package
tar xzvf haproxy-1.5-dev19.tar.gz     # extracting
cd haproxy-1.5-dev19     # enter the extracted directory
make TARGET=linux2628 CPU=i686 USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1     # I compile it with compression and SSL support; use CPU=x86_64 for CentOS x64
make install     # install
cp /usr/local/sbin/haproxy* /usr/sbin/     # copy binaries to /usr/sbin
cp /root/haproxy-1.5-dev19/examples/haproxy.init /etc/init.d/haproxy     # copy init script in /etc/init.d
chmod 755 /etc/init.d/haproxy     # setting permission on init script
mkdir /etc/haproxy     # creating directory where the config file must reside
cp /root/haproxy-1.5-dev19/examples/examples.cfg /etc/haproxy/haproxy.cfg     # copy example config file
mkdir /var/lib/haproxy     # create directory for stats file
touch /var/lib/haproxy/stats     # creating stats file
useradd haproxy     # I like to run haproxy as a dedicated user
service haproxy check     # checking configuration file is valid
service haproxy start     # starting haproxy to verify it is working
chkconfig haproxy on     # setting haproxy to start with VM
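To double check that the binary you just built really includes SSL and compression support, you can ask HAProxy itself:

haproxy -vv | head -n 2     # shows the exact version you compiled
haproxy -vv | grep -iE 'openssl|zlib'     # should report that it was built with OpenSSL and zlib support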

The main reason why I like to build HAProxy myself is that when I was learning about it I had trouble making SSL offloading work, even though I was sure I was configuring it right. It turns out most RPMs out there are built without SSL support, so I started building it myself. This way I can always use the latest version, and even though the current latest is a development version, I can tell you it’s pretty stable.

Don’t forget to disable all unneeded services/daemons; most of them are not needed to run a load balancer.
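A minimal sketch of how I go about it (the service names are only examples, check what is actually installed on your VM before turning anything off):

chkconfig --list | grep ':on'     # see what starts at boot
chkconfig postfix off && service postfix stop     # example: a local mail daemon is not needed on a load balancer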

If you intend to leave the firewall on, go check Luca’s post which will give you a good insight about how to configure iptables to work with HAProxy.
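As a rough reference, these are the ports used by the HAProxy configuration in this series; adjust the rules to your own policy (Luca’s post has the full details):

iptables -I INPUT -p tcp --dport 80 -j ACCEPT     # HTTP frontend, only used to redirect to HTTPS
iptables -I INPUT -p tcp --dport 443 -j ACCEPT     # HTTPS frontend
iptables -I INPUT -p tcp --dport 9000 -j ACCEPT     # HAProxy statistics page
service iptables save     # persist the rules across reboots on CentOS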

Don’t bother disabling SELinux; it seems to get along with HAProxy just fine.

Have fun with your new shiny (and free) load balancer.

Configuring redundancy for Horizon Workspace Virtual Machines aka How To Scale Horizon Workspace

Horizon Workspace can scale to many thousands of users, but obviously you are going to need more than the default setup of 5 virtual machines if you want to get there.

As an example, let’s take VMware’s own internal implementation for 13,000+ users, so we can see how Horizon Workspace scales:

  • 1x Configurator VA: 2 vCPU, 2 GB memory
  • 6x Connector VA: 2 vCPU, 4 GB memory
  • 4x Gateway VA: 2 vCPU, 8 GB memory
  • 2x Service VA: 2 vCPU, 6 GB memory (1 for HA)
  • 11x Data VA: 6 vCPU, 32 GB memory
  • 2x Postgres Server: 4 vCPU, 4 GB memory (1 for replication)
  • 3x MS Office Preview Server: 4 vCPU, 4 GB memory

VMware Architectural Diagram

As you can see, most components can scale to many units, except the configurator-va. The configurator-va is the single point of administration when it comes to configuring your Horizon Workspace environment, and it cannot be made redundant.

Note: If you intend to increase the capacity of your Horizon Workspace virtual machines don’t forget to adjust the java heap size for improved performance.

In order to add a new virtual machine of any type, you must log in to the configurator-va virtual machine as root user and run the following command:

hznAdminTool addvm --type="VMType" --ip="new VM IP address"


This command can be executed only after the Horizon Workspace setup has been fully completed and you have tested that the solution is working.

The new virtual machines will have to follow the same requirements regarding IP addresses as the base virtual machines. For an overview of these requirements check “How to install Horizon Workspace using an external database”.

For Connector and Data virtual machines, this command creates the new virtual machine by cloning a base snapshot of the original virtual machine of the same type. The base snapshot is captured for all virtual machines during the initial deployment. The command fails if the base snapshot does not exist.

For service and gateway virtual machines, this command creates the new virtual machine by cloning the current virtual machine snapshot.

Let’s dig into the details of having multiple instances of each type of virtual machine.

Note: The following commands, unless specified otherwise, must be executed on the configurator-va.

Multiple gateway-va
Companies can deploy multiple gateway-va in order to distribute load across more than one virtual machine, thus providing both redundancy and scalability for this role. This is usually the first role you want to make redundant, since it’s the entry point for all users.

The specific command to add a gateway-va is as follows:

hznAdminTool addvm --type=GATEWAY --ip="new VM IP address"

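Once the new gateway-va is up, remember to add it to your load balancer as well. With the HAProxy configuration shown earlier it would look something like this (the IP address is just an example):

# add to the "backend workspace" section of /etc/haproxy/haproxy.cfg:
#   server workspace3 192.168.110.12:443 weight 1 check port 443 inter 2000 rise 2 fall 5 ssl
haproxy -c -f /etc/haproxy/haproxy.cfg && service haproxy reload     # validate, then reload without dropping existing connections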

Multiple service-va
You might want to add another service-va for the same reasons as the gateway-va.

Note: In order to add more service-va you must be using an external database.

The specific command to add a service-va is as follows:

hznAdminTool addvm --type=APPLICATION_MANAGER --ip="new VM IP address"


Now connect to https://ConfiguratorHostname, open the System Information page and note that both the old and the new service-va are listed, and that the new service-va is in maintenance mode. Before proceeding, verify that the virtual machine was added correctly by checking its IP address.

We are going to need to open some firewall ports on all the service-va; use these rules as a reference for the following steps:

iptables -A INPUT -i eth0 -s "OTHER_service_va_IP" -p tcp --dport 9300:9400 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -s "OTHER_service_va_IP" -p tcp --sport 9300:9400 -m state --state ESTABLISHED -j ACCEPT
iptables -A INPUT -i eth0 -s "OTHER_service_va_IP" -p udp --dport 54328 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -s "OTHER_service_va_IP" -p udp --sport 54328 -m state --state ESTABLISHED -j ACCEPT

Now we need to do the following to open firewall ports:

  • Run the hznAdminTool listvms command to list the service-va virtual machines.
  • Write down the service-va virtual machine IP addresses.
  • Log in to the service-va virtual machine with IP address1 as root and go to the console.
  • Run the iptables commands using IP address2 as the value for the “OTHER_service_va_IP” parameter (see the example after this list).
  • Log in to the service-va virtual machine with IP address2 as root and go to the console.
  • Run the iptables commands using IP address1 as the value for the “OTHER_service_va_IP” parameter.
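As a concrete (hypothetical) example, if hznAdminTool listvms reported service-va addresses 192.168.110.20 and 192.168.110.21, on the first one you would run:

iptables -A INPUT -i eth0 -s 192.168.110.21 -p tcp --dport 9300:9400 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -s 192.168.110.21 -p tcp --sport 9300:9400 -m state --state ESTABLISHED -j ACCEPT
iptables -A INPUT -i eth0 -s 192.168.110.21 -p udp --dport 54328 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -s 192.168.110.21 -p udp --sport 54328 -m state --state ESTABLISHED -j ACCEPT

and on the second one the same four rules with 192.168.110.20 as the source address.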

Next we need to run the following commands on all service-va:

service elasticsearch stop
hznAdminTool configureElasticSearch -ES_MULTICAST_ENABLED true
service elasticsearch start
service elasticsearch status


And run the following commands only on the new service-va:

service rabbitmq-server stop
service elasticsearch stop
rm /var/run/rabbitmq/pid
rm /var/run/rabbitmq/lock
rm /var/run/elasticsearch/elasticsearch.pid
rm /var/lock/subsys/elasticsearch
rm -R /db/rabbitmq/data/*
rm -R /db/elasticsearch/*
service rabbitmq-server start
service rabbitmq-server status
rabbitmqctl stop_app
rabbitmqctl force_reset
rabbitmqctl start_app
hznAdminTool configureElasticSearch -ES_MULTICAST_ENABLED true
service elasticsearch start
service elasticsearch status

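Before exiting maintenance mode it can be useful to verify that the two service-va actually see each other. A small sketch, assuming Elasticsearch is listening on its default HTTP port 9200 on the service-va:

curl -s 'http://localhost:9200/_cluster/health?pretty'     # "number_of_nodes" should be 2 once both service-va have joined
rabbitmqctl status     # confirms the local RabbitMQ broker is running again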
Finally, go to https://ConfiguratorHostname/cfg and click “Exit Maintenance Mode” on the newly added service-va. The Configurator updates all the gateway-va virtual machines and starts sending requests to the new service-va virtual machine as well.

Multiple connector-va
Creating multiple connector-va allows you to spread the traffic over more than one virtual machine and reduce downtime. It also enables you to offer multiple means of authentication, such as Active Directory username and password, RSA SecurID passcodes, or Kerberos-based Windows authentication; to enable multiple forms of authentication you must set up multiple connector-va virtual machines.


Depending on the type of authentication, you deploy a new connector-va in a different way. This subject deserves a post of its own, but for now you can find the details in the Horizon Workspace Documentation Center.

Multiple data-va
User accounts are provisioned to a specific data-va virtual machine that handles their file activity. It is recommended that each data-va virtual machine serve no more than 1000 users, so you need to scale if you have more than that. When you add a new data-va virtual machine, the new data-va virtual machine automatically becomes available from the default COS host pool. The host pool for other classes of service that are created displays the new data-va virtual machine, but it is not enabled in that COS. To use a new data-va virtual machine in the other classes of service, the administrator must modify the COS and enable the data-va virtual machine.

The first data-va virtual machine in the Horizon Workspace configuration is the master node. This node contains the metadata for the data-va virtual machine user accounts. If you create additional data-va virtual machines, these data-va virtual machines are file stores only. When the master node is down, users cannot log in to their data accounts.

You can configure the host pool in the COS to use specific data-va virtual machines. In this way, you can manage where accounts are provisioned. For example, you add a second data-va virtual machine because disk space on the first data-va virtual machine is low. You do not want the first data-va virtual machine to be provisioned with any more new accounts once you have added the second node. From the Horizon Workspace Administrator Web interface, edit each COS to select the new data-va virtual machine in the Host Pool and deselect the other data-va virtual machine.

The specific command to add a data-va is as follows:

hznAdminTool addvm --type=DATA --ip="new VM IP address"


Note: Don’t forget to configure preview on each data-va.

The new data-va is now in maintenance mode; to complete adding it, do the following:

  • Restart each existing data-va
  • Log in to each data-va virtual machine as the root user to generate ssh keys
  • Reboot each data-va

Then on each data-va:

su - zimbra
zmupdateauthkeys     # run as the zimbra user; this refreshes /opt/zimbra/.ssh/authorized_keys
/etc/rc.d/memcached restart

Now go to https://ConfiguratorHostname and click “Exit Maintenance Mode”.

The new data-va virtual machine is ready to use.

Update: In Horizon Workspace 1.5 the base snapshot of the data-va is no longer used to create other data-va. In order to create more data-va you have to create a “New datava-template Virtual Machine”.

Disclaimer: In this article I pasted parts of the official documentation.
