Cloud Native Applications and VMware

After quite a bit of radio silence I’m going to write about Cloud Native Applications and VMware approach to those.
After spending some time looking into container technologies with open source software it’s nice to see that VMware is jumping on the boat by adding their enterprise vision which is probably the missing part compared to other solutions.
I will start by preparing a template for all the services that I will install and I will do it the VMware way by using PhotonOS which I intend to use as proof of concept for vSphere Integrated Containers (VIC), Photon Controller, Harbor and Admiral.
PhotonOS is a lightweight operating system written just for running containerized applications and such; I have to say that after getting familiar with it I quite like its simplicity and quick approach to all day to day activities.
First thing first, you have to choose your deployment type, there are a few:

screen-shot-2016-12-15-at-13-46-27

I won’t describe the process as it’s pretty straightforward, I’ll just say that I manually installed PhotonOS with the ISO choosing the Minimal install option.

After installing we need the IP address and we also need to enable root to ssh into the box:

ip add     # show ip address info
vi /etc/ssh/sshd_config     # PermitRootLogin = yes
systemctl restart sshd     # restart ssh deamon

Then ssh as root and continue:

mkdir .ssh
echo "your_key" >> .ssh/authorized_keys
tdnf check-update
open-vm-tools.x86_64 10.0.5-12.ph1 photon-updates
nss.x86_64 3.25-1.ph1 photon-updates
shadow.x86_64 4.2.1-8.ph1 photon-updates
linux.x86_64 4.4.8-8.ph1 photon-updates
python-xml.x86_64 2.7.11-5.ph1 photon-updates
docker.x86_64 1.11.2-1.ph1 photon-updates
systemd.x86_64 228-25.ph1 photon-updates
python2-libs.x86_64 2.7.11-5.ph1 photon-updates
python2.x86_64 2.7.11-5.ph1 photon-updates
procps-ng.x86_64 3.3.11-3.ph1 photon-updates
filesystem.x86_64 1.0-8.ph1 photon-updates
openssl.x86_64 1.0.2h-3.ph1 photon-updates
systemd.x86_64 228-26.ph1 photon-updates
systemd.x86_64 228-30.ph1 photon-updates
python2-libs.x86_64 2.7.11-7.ph1 photon-updates
python-xml.x86_64 2.7.11-7.ph1 photon-updates
python2.x86_64 2.7.11-7.ph1 photon-updates
curl.x86_64 7.47.1-3.ph1 photon-updates
pcre.x86_64 8.39-1.ph1 photon-updates
openssl.x86_64 1.0.2h-5.ph1 photon-updates
openssh.x86_64 7.1p2-4.ph1 photon-updates
openssl.x86_64 1.0.2j-1.ph1 photon-updates
iptables.x86_64 1.6.0-5.ph1 photon-updates
systemd.x86_64 228-31.ph1 photon-updates
initramfs.x86_64 1.0-4.1146888.ph1 photon-updates
glibc.x86_64 2.22-9.ph1 photon-updates
open-vm-tools.x86_64 10.0.5-13.ph1 photon-updates
rpm.x86_64 4.11.2-11.ph1 photon-updates
linux.x86_64 4.4.26-1.ph1 photon-updates
initramfs.x86_64 1.0-5.11330561.ph1 photon-updates
python2.x86_64 2.7.11-8.ph1 photon-updates
curl.x86_64 7.47.1-4.ph1 photon-updates
bzip2.x86_64 1.0.6-6.ph1 photon-updates
tzdata.noarch 2016h-1.ph1 photon-updates
expat.x86_64 2.2.0-1.ph1 photon-updates
python2-libs.x86_64 2.7.11-8.ph1 photon-updates
python-xml.x86_64 2.7.11-8.ph1 photon-updates
docker.x86_64 1.12.1-1.ph1 photon-updates
cloud-init.x86_64 0.7.6-12.ph1 photon-updates
bridge-utils.x86_64 1.5-3.ph1 photon-updates
linux.x86_64 4.4.31-2.ph1 photon-updates
systemd.x86_64 228-32.ph1 photon-updates
curl.x86_64 7.51.0-1.ph1 photon-updates
initramfs.x86_64 1.0-5.11343362.ph1 photon-updates
cloud-init.x86_64 0.7.6-13.ph1 photon-updates
open-vm-tools.x86_64 10.1.0-1.ph1 photon-updates
initramfs.x86_64 1.0-5.11353601.ph1 photon-updates
cloud-init.x86_64 0.7.6-14.ph1 photon-updates
vim.x86_64 7.4-6.ph1 photon-updates
linux.x86_64 4.4.35-1.ph1 photon-updates
libtasn1.x86_64 4.7-3.ph1 photon-updates
tdnf upgrade -y
Upgrading:
vim x86_64 7.4-6.ph1 1.93 M
tzdata noarch 2016h-1.ph1 1.52 M
systemd x86_64 228-32.ph1 28.92 M
shadow x86_64 4.2.1-8.ph1 3.85 M
rpm x86_64 4.11.2-11.ph1 4.28 M
python2 x86_64 2.7.11-8.ph1 1.82 M
python2-libs x86_64 2.7.11-8.ph1 15.30 M
python-xml x86_64 2.7.11-8.ph1 318.67 k
procps-ng x86_64 3.3.11-3.ph1 1.04 M
pcre x86_64 8.39-1.ph1 960.35 k
openssl x86_64 1.0.2j-1.ph1 5.23 M
openssh x86_64 7.1p2-4.ph1 4.23 M
open-vm-tools x86_64 10.1.0-1.ph1 2.45 M
nss x86_64 3.25-1.ph1 3.87 M
libtasn1 x86_64 4.7-3.ph1 161.48 k
iptables x86_64 1.6.0-5.ph1 1.46 M
linux x86_64 4.4.35-1.ph1 44.76 M
initramfs x86_64 1.0-5.11353601.ph1 11.49 M
glibc x86_64 2.22-9.ph1 50.97 M
filesystem x86_64 1.0-8.ph1 7.14 k
expat x86_64 2.2.0-1.ph1 242.58 k
docker x86_64 1.12.1-1.ph1 82.59 M
curl x86_64 7.51.0-1.ph1 1.24 M
cloud-init x86_64 0.7.6-14.ph1 1.93 M
bzip2 x86_64 1.0.6-6.ph1 1.65 M
bridge-utils x86_64 1.5-3.ph1 36.61 k

Total installed size: 272.23 M

Downloading:
bridge-utils 19201 100%
bzip2 526008 100%
cloud-init 509729 100%
curl 898898 100%
docker 25657821 100%
expat 92851 100%
filesystem 16357 100%
glibc 19396323 100%
initramfs 11983289 100%
linux 18887362 100%
iptables 416848 100%
libtasn1 98060 100%
nss 1591172 100%
open-vm-tools 912998 100%
openssh 1853448 100%
openssl 3192392 100%
pcre 383441 100%
procps-ng 458368 100%
python-xml 86471 100%
python2-libs 5651168 100%
python2 741755 100%
rpm 1761294 100%
shadow 2002202 100%
systemd 11856941 100%
tzdata 633502 100%
vim 1046120 100%
Testing transaction
Running transaction
Creating ldconfig cache

Complete!

After that I rebooted since the “linux” package was updated and that stands for the kernel version.
You can check the kernel version loaded with:

uname -a

More customizations:

vi /boot/grub2/grub.cfg     # edit "set timeout=1"
iptables --list     # show iptables config which by defaults allows only SSH inbound
vi /etc/systemd/scripts/iptables     # edit iptables config file

I like to enable ICMP inbound, you can find the rule I added as the last one before the end of file:

iptables_config_file

systemctl restart iptables
iptables --list     # check running configuration includes ICMP inbound
systemctl enable docker     # enable docker loaded at boot

In coming days I will follow up with VIC, Photon Controller, Harbor and Admiral using this PhotonOS VM as template.

Dear old ESXTOP aka How to schedule ESXTOP batch mode

Recently I had to record a night activity of a specific VM running on a specific host for troubleshooting reasons because vCenter data wasn’t just enough for that.

Using a number of blog posts around from Duncan Epping and others (it was many, I don’t even have the links anymore) I’ve put up my personal guide about how to take over this task because every time it’s like I have to start from scratch so I decided to document it.

First thing I created a script with the specific run time and collection data I needed:

vi <path>/record-esxtop.sh
esxtop -b -a -d 2 -n 3600 > /esxtopoutput.csv

OR

esxtop -b -a -d 2 -n 3600 | gzip -9c > /esxtopoutput.csv.gz

(-d=sampling rate, -n=number of iterations; the total run time is “d*n” in seconds)
(second version creates a zipped version of the output)

Let’s make this script executable:

chmod +x <path>/record-esxtop.sh

Then, since in latest versions of ESXi there is no crontab, you’ll need to edit the cron file for the user you want to run the script with:

vi /var/spool/cron/crontabs/root

Then add a line similar to this:

30 4 * * * <path>/record-esxtop.sh

Now kill crond and reload:

cat /var/run/crond.pid
ps | grep 
kill -HUP 
ps | grep 
/usr/lib/vmware/busybox/bin/busybox crond
cat /var/run/crond.pid
ps | grep 

Now your script will get executed and you’ll find a file with your data, but how to read it?
It’s dead simple, just open PerfMon on Windows, clear all running counters then right-click on “Performance Monitor” and in the tab “Sources” add your CSV file (need to unpack it first); in the data tab you will then be able to choose metrics and VMs you want to add to your graph.

It would be nice to have a tool that does the same on Mac but I couldn’t find one and I had to use a Windows VM; if you know a Mac alternative for PerfMon please add a comment.

This procedure is supported by VMware as per KB 103346.

How to backup, restore and schedule vCenter Server Appliance vPostgres Database

Now that we are moving away from SQL Express in favor of vPostgres for vCenter simple install on Windows and since vPostgres is the default database engine for (not so simple) install of vCSA I thought it would be nice to learn how to backup and restore this database.

Since it’s easier to perform these tasks on Windows and since there are already many guides on the Internet I will focus on vCSA because I think that more and more production environment (small and big) will be using vCSA since now it’s just as functional as vCenter if not more. (more on this in another post…)

You will find all instructions for both Windows and vCSA versions of vCenter on KB2091961, but more important than that you will find there also the python scripts that will work all the magic for you so grab the “linux_backup_restore.zip” file and copy it to the vCSA:

scp linux_backup_restore.zip root@<vcenter>:/tmp

For the copy to work you must have previously changed the shell configuration for the root user in “/etc/passwd” from “/bin/appliancesh” to “/bin/bash”

Then:

unzip linux_backup_restore.zip
chmod +x backup_lin.py
mkdir /tmp/linux_backup_restore/backups
python /tmp/linux_backup_restore/backup_lin.py -f /tmp/linux_backup_restore/backups/VCDB.bak

All you will see when the backup is completed is:

Backup completed successfully.

You should see the backup file now:

vcenter:/tmp/linux_backup_restore/backups # ls -lha
total 912K
drwx------ 2 root root 4.0K Jun 3 19:41 .
drwx------ 3 root root 4.0K Jun 3 19:28 ..
-rw------- 1 root root 898K Jun 3 19:29 VCDB.bak

At this point I removed a folder in my vCenter VM and Templates view, then I logged off the vSphere WebClient and started a restore:

service vmware-vpxd stop
service vmware-vdcs stop
python /tmp/linux_backup_restore/restore_lin.py -f /tmp/linux_backup_restore/backups/VCDB.bak
service vmware-vpxd start
service vmware-vdcs start

I logged back in the WebClient and my folder was back, so mission accomplished.

Now how do I schedule this thing? Using the good old crontab but before that I will write a script that will run the backup and also give a name to the backup file corresponding to the weekday so I can have a rotation of 7 days:

#!/bin/bash
_dow="$(date +'%A')"
_bak="VCDB_${_dow}.bak"
python /tmp/linux_backup_restore/backup_lin.py -f /tmp/linux_backup_restore/backups/${_bak}

I saved it as “backup_vcdb” and made it executable with “chmod +x backup_vcdb”.

Now to schedule it just run “crontab -e” and enter a single line just like this:

0 23 * * * python /tmp/linux_backup_restore/backup_vcdb

This basically means that the system will execute the script every day of every week of every year at 11pm.

After the crontab job runs you should see a new backup with a name of this sort:

vcenter:/tmp/linux_backup_restore/backups # ls -lha
total 1.8M
drwx------ 2 root root 4.0K Jun 3 19:46 .
drwx------ 3 root root 4.0K Jun 3 19:28 ..
-rw------- 1 root root 898K Jun 3 19:29 VCDB.bak
-rw------- 1 root root 900K Jun 3 19:46 VCDB_Wednesday.bak

You will also have the log files of these backups in “/var/mail/root”.

Enjoy your new backup routine 🙂

Using vCSA 6.0 as a Subordinate CA of a Microsoft Root CA

One of the nicest improvements in vSphere 6 is the ability to use the VMware Certificate Authority (VMCA) as a subordinate CA.
In most cases enterprises already have some form of PKI deployed in house and very often it is Microsoft based so I will show you how I did it with a Microsoft Enterprise CA.

I give for granted that the Microsoft PKI is already in place, in my case it is a single VM with an Enterprise Microsoft CA installed.

The vCSA should also be already be in place.

As first step I edit the certool config file but first I make a backup of the default configuration:

mkdir /root/backup
cp /usr/lib/vmware-vmca/share/config/certool.cfg /root/backup
vi /usr/lib/vmware-vmca/share/config/certool.cfg

Compile the config file with the parameters that are good for your setup then save the file and exit.

Now we have to generate a certificate request for the VMCA to pass to the Microsoft CA and there are many ways to do that, I am going to use the vSphere Certificate Manager Utility that will automatically take most steps for me:

/usr/lib/vmware-vmca/bin/certificate-manager

Screen Shot 2015-03-29 at 23.20.25
At this point I have the .csr file (/root/root_signing_cert.csr) and the private key (/root/root_signing_cert.key) so let’s feed it to the Microsoft CA as you normally would for any certificate request using the “Subordinate Certification Authority” template:

Screen Shot 2015-03-29 at 23.25.43Now you have to take the crt file in base64 format on the vCSA and also the Microsoft CA root certificate in base64 format as well; copying files with SCP will be a challenge because the root user on the vCSA by default doesn’t use the bash shell so if you want to use this method you need to edit the “/etc/passwd” and set the root user to use bash as a shell and then you can put it back as it was once you are done transferring the files.

It could be just simpler to open the certs on your computer and the connect to the vCSA via SSH and copy the content inside new files; one way or another you need to take the certificates on the vCSA, in my case they are “root_signing_cert.pem” and “cam.pem”.

Now we need to combine the two files in a chain file:

cp root_signing_cert.pem caroot.pem
cat ca.pem >> caroot.pem

If you open the “caroot.pem” file you should see a single cert file with both ca and certificate one after another.

Now we can go back to the vSphere Certificate Manager Utility to apply this certificate:

Screen Shot 2015-03-29 at 23.36.16

Since we have already edited the certool.cfg file we just have to confirm the values that the wizard proposes, just remember to enter the FQDN of the vCenter server:

Screen Shot 2015-03-29 at 23.37.18
If you have a successful outcome you can connect via browser to your vSphere Web Client and check the certificate:


Screen Shot 2015-03-29 at 23.39.59

Screen Shot 2015-03-29 at 23.40.08

 

As you can see now this is a trusted connection and the VMCA has released certificates for the Solution Users on behalf of the Microsoft Root CA.

You can check the active certificate in the vSphere Web Client in the Administration section:

Screen Shot 2015-03-29 at 23.42.30

In case you decide to remove the original root certificate then you will have to refresh the Security Token Service (STS) Root Certificate, and replace the VMware Directory Service Certificate following the vSphere 6 documentation.

Now the VMCA is capable of signing certificates that are valid in you PKI chain and are trusted by default in you Windows domain by all clients.

 

How To Deploy vCSA 6.0 with a Mac

The new vCenter Server Appliance has a new deployment model, both architectural wise and installation wise.

I wrote extensively about the architectural changes in this post, so I will focus on how to deploy it with a Mac using command line tools since if you want to use the graphical setup you need to be running Windows.

In order to do this you need the ISO file of the vCSA mounted in your Mac.

In “/Volumes/VMware VCSA/vcsa-cli-installer/mac” you will find a script called “vcsa-deploy” that requires a JSON file with all the parameters needed to deploy and configure the VCSA on your host.

You can find templates of JSON files in “/Volumes/VMware VCSA/vcsa-cli-installer/templates”, here is how I compiled mine in order to obtain a single VM with all vCenter and PSC services:

{
    "__comments":
    [
        "Sample template to deploy a vCenter Server with an embedded Platform Services Controller."
    ],

    "deployment":
    {
        "esx.hostname":"192.168.1.107",
        "esx.datastore":"vsanDatastore",
        "esx.username":"root",
        "esx.password":"12345678",
        "deployment.option":"tiny",
        "deployment.network":"LAN",
        "appliance.name":"vCenter",
        "appliance.thin.disk.mode":true
    },

    "vcsa":
    {

        "system":
        {
            "root.password":"12345678",
            "ssh.enable":true
        },

        "sso":
        {
            "password":"12345678",
            "domain-name":"vsphere.local",
            "site-name":"Default-First-Site"
        },

        "networking":
        {
            "ip.family":"ipv4",
            "mode":"static",
            "ip":"192.168.110.2",
            "prefix":"24",
            "gateway":"192.168.110.254",
            "dns.servers":"8.8.8.8",
            "system.name":"192.168.110.2"
        }
    }
}

You can see how I used the newly created “vsanDatastore” as my destination datastore.

Your SSO password will be checked against complexity compliance by the script before starting the deployment process.
Passwords are stored in clear text so make sure not to leave around this file and possibly destroy it after use or change all the passwords right after deployment.
You might have noticed that as the system name I used the IP address: I had to do this because I have no DNS (yet) and if you enter a FQDN as system name you need to make sure that it can be resolved both with forward and reverse DNS calls so I had no choice; this will actually be a limitation later on because I will not be able to add the vCSA to a Windows domain so if I want to use Windows credentials to log in my vCenter I will need to setup LDAP authentication.

You just fire this command to start the deployment:

/Volumes/VMware\ VCSA/vcsa-cli-installer/mac/vcsa-deploy vcenter60.json

During the deployment process you will see the following:

Start vCSA command line installer to deploy vCSA "vCenter60", an embedded node.

Please see /var/folders/dp/xq_5cxlx2h71cgy2t83ghkd00000gn/T/vcsa-cli-installer-9wU8aB.log for logging information.

Run installer with "-v" or "--verbose" to log detailed information.

The SSO password meets the installation requirements.
Opening vCSA image: /Volumes/VMware VCSA/vcsa/vmware-vcsa
Opening VI target: vi://root@192.168.1.107:443/
Deploying to VI: vi://root@192.168.1.107:443/

Progress: 99%
Transfer Completed
Powering on VM: vCenter60

Progress: 18%
Power On Completed

Installing services...
Progress: 5%. Setting up storage
Progress: 50%. Installing RPMs
Progress: 56%. Installed oracle-instantclient11.2-odbc-11.2.0.2.0.x86_64.rpm
Progress: 62%. Installed vmware-identity-sts-6.0.0.5108-2499721.noarch.rpm
Progress: 70%. Installed VMware-Postgres-9.3.5.2-2444648.x86_64.rpm
Progress: 77%. Installed VMware-invsvc-6.0.0-2562558.x86_64.rpm
Progress: 79%. Installed VMware-vpxd-6.0.0-2559267.x86_64.rpm
Progress: 83%. Installed VMware-cloudvm-vimtop-6.0.0-2559267.x86_64.rpm
Progress: 86%. Installed VMware-sps-6.0.0-2559267.x86_64.rpm
Progress: 87%. Installed VMware-vdcs-6.0.0-2502245.x86_64.rpm
Progress: 89%. Installed vmware-vsm-6.0.0-2559267.x86_64.rpm
Progress: 95%. Configuring the machine
Service installations succeeded.

Configuring services for first time use...
Progress: 3%. Starting VMware Authentication Framework...
Progress: 11%. Starting VMware Identity Management Service...
Progress: 14%. Starting VMware Single Sign-On User Creation...
Progress: 18%. Starting VMware Component Manager...
Progress: 22%. Starting VMware License Service...
Progress: 25%. Starting VMware Service Control Agent...
Progress: 33%. Starting VMware System and Hardware Health Manager...
Progress: 44%. Starting VMware Common Logging Service...
Progress: 55%. Starting VMware Inventory Service...
Progress: 64%. Starting VMware vSphere Web Client...
Progress: 66%. Starting VMware vSphere Web Client...
Progress: 70%. Starting VMware ESX Agent Manager...
Progress: 74%. Starting VMware vSphere Auto Deploy Waiter...
Progress: 81%. Starting VMware Content Library Service...
Progress: 85%. Starting VMware vCenter Workflow Manager...
Progress: 88%. Starting VMware vService Manager...
Progress: 92%. Starting VMware Performance Charts...
Progress: 100%. Starting vsphere-client-postinstall...
First time configuration succeeded.

vCSA installer finished deploying "vCenter60", an embedded node:
System Name: 192.168.110.20
Login as: Administrator@vsphere.local

It's time to connecto the the new Web Client, just open your browser to "https://" and then select "Log In To the vSphere Web Client".

You should now log in but before starting the normal configuration process I suggest you take care of password expiration in which present in two separate areas in this version of vCSA: the SSO users and the root system user.
About the first one you can go to Administration -> Single Sign-On -> Configuration -> Password Policy and edit the Maximum Lifetime to “0” so effectively you are disabling expiration:

Featured image

For the root user you will need to drop to the vCSA command line, enable and access the Shell the issue the following:

localhost:~ # chage -l root        # show current password expiration settings

localhost:~ # chage -M -1 root     # set expiration to Never
Aging information changed.
localhost:~ # chage -l root
Minimum: 0
Maximum: -1
Warning: 7
Inactive: -1
Last Change: Mar 17, 2015
Password Expires: Never
Password Inactive: Never
Account Expires: Never

Now you could start deploying all your VMs but if you try that you will find that vSAN will complain about a policy violation!

Do you remember how we needed to change the default policy on the host before we could deploy vCSA?
We did that at the host level but when vCSA started managing the host the default policy has been overwritten to the original defaults so now we have to change it again to match our need but this time we can leverage the GUI for this task:

Featured image

Now all is set and you should be good to go… not really!
We’ve never set a network for the vSAN traffic, even if I’m running on a single node configuration this will still trigger a warning:

Featured image

All you have to do is create a new VMKernel portgroup and flag it for vSAN traffic and your will system be again a little happy vSphere host.

Running a Home Lab on a Single vSAN Node

This is how I managed to run my lab on a single vSAN node and manage it completely Windows free, which is always a goal for a Mac user like me; with vSphere 6 this is a lot easier than it used to be in the past thanks to improvement in the Web Client (and the fact that the fat client doesn’t connect to vCenter anymore) and also thanks to the new VCSA that comes with deployment tools for Mac.
About the storage side of things, I’ve always been running my lab with some kind of virtual storage appliance in the past (Nexenta, Atlantis, Datacore) but those require a lot of memory and processing power and this reduces the number of VMs I can run in my lab simultaneously.
It’s true that I can get storage acceleration like this (which is so important in a home lab) but I sacrifice consolidation ratio and add complexity to take into account when I do upgrades and maintenance, so I decided to change my approach and include my physical lab in the process of learning vSAN.
If all goes as I would like I will get storage performance without sacrificing too many resources for it and this would be awesome.
Here is my current hardware setup in terms of disks:

1 Samsung SSD 840 PRO Series
1 Samsung SSD 830
3 Seagate Barracuda ST31000524AS 1TB 7200 RPM 32MB Cache SATA 6.0Gb/s 3.5″

I also have another spare ST31000524AS that I might add later but that would require me to add a disk controller.
Speaking of which, my current controller (C602 AHCI – Patsburg) is not in the vSAN HCL and the queue depth is listed as a pretty depressive value of 31 (per port) but I am still just running a lab and I don’t really need to achieve production grade performance numbers; nevertheless I have been looking around on eBay and it seems like with about €100 I can get a supported disk controller but I decided to wait a few weeks to make sure VMware updates the HCL just because I don’t want to buy something that won’t be on vSphere6/vSAN6 HCL plus I might still get the performance I need with my current setup, or at least this is what I hope.

UPDATE: The controller I was keeping an eye on doesn’t seem to be listed in the HCL for vSAN 6 even now that the HCL is reported to be updated so be careful with your lab purchases!

For the time being I will test this environment on my current disk controller and learn how to troubleshoot performance bottlenecks in vSAN which is going to be a great exercise anyway.

The first thing to do in my case was to decommission the current disks, so once I delete the VSA that was using them as RDM I needed to make sure that the disks had no partitions left on them since this will create problems claiming them during the vSAN setup, so I accessed my ESXi via SSH and started playing around with the command line:

esxcli storage core device list      # list block storage devices

Which gave me a list of devices that I could use with vSAN (showing one disk only):

t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
   Display Name: Local ATA Disk (t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____)
   Has Settable Display Name: true
   Size: 244198
   Device Type: Direct-Access 
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
   Vendor: ATA     
   Model: Samsung SSD 840 
   Revision: DXM0
   SCSI Level: 5
   Is Pseudo: false
   Status: on
   Is RDM Capable: false
   Is Local: true
   Is Removable: false
   Is SSD: true
   Is VVOL PE: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters: 
   VAAI Status: unknown
   Other UIDs: vml.0100000000533132524e45414342303639373142202020202053616d73756e
   Is Shared Clusterwide: false
   Is Local SAS Device: false
   Is SAS: false
   Is USB: false
   Is Boot USB Device: false
   Is Boot Device: false
   Device Max Queue Depth: 31
   No of outstanding IOs with competing worlds: 32
   Drive Type: unknown
   RAID Level: unknown
   Number of Physical Drives: unknown
   Protection Enabled: false
   PI Activated: false
   PI Type: 0
   PI Protection Mask: NO PROTECTION
   Supported Guard Types: NO GUARD SUPPORT
   DIX Enabled: false
   DIX Guard Type: NO GUARD SUPPORT
   Emulated DIX/DIF Enabled: false

This is useful to identify the SSD devices, the device names and their physical path. Here’s a recap of the useful information in my environment:

/vmfs/devices/disks/t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
/vmfs/devices/disks/t10.ATA_____SAMSUNG_SSD_830_Series__________________S0VYNYABC03672______
/vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L
/vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP8N3
/vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________9VPC5AQ9

The Samsung 840 Pro will give me much better performance in a vSAN diskgroup so I will put aside the 830 for now.

Now for each and every disk I check the presence of partitions and removed all of them if any; I’m going to show you the commands I run against one disk as an example:

~ # partedUtil getptbl /vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L
gpt
121601 255 63 1953525168
1 34 262177 E3C9E3160B5C4DB8817DF92DF00215AE microsoftRsvd 0
2 264192 1953519615 5085BD5BA7744D76A916638748803704 unknown 0

~ # partedUtil delete /vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L 2

~ # partedUtil delete /vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L 1

~ # partedUtil getptbl /vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L
gpt
121601 255 63 1953525168

partedUtil is used to manage partitions, the “getptbl” shows the partitions (2 in this case) and the delete command removes them; note how in the end of these commands I needed to specify the partition number on top of which I wanted to execute the operation.

At that point with all the disks ready I needed to change the default vSAN policy because otherwise I wouldn’t be able to satisfy the 3-nodes requirement, so I needed to enable the “ForceProvisioning” setting.
Considering that at some point vSAN will need to destage writes from SSD to HDD I also decided to enable StripeWidth and set it to “3” so I could take advantage of all of my 3 HDD when IOs involve the magnetic disks.
Please note that this is probably a good idea in a lab while in a production environment you will need to find good reasons for this since VMware encourages customers to leave the default value at “1”; problems to consider comes into play when you are doing the sizing of your environment (careful about components number even if vSAN 6 raised the per host limit from 3000 to 9000), in general your should read the “VMware Virtual SAN 6.0
Design and Sizing Guide” (http://goo.gl/BePpyI) before making any architectural decision.

To change vSAN default policy and create a cluster I made very minor changes to William Lam steps described here for vSAN 1.0:

esxcli vsan policy getdefault      # display the current settings

esxcli vsan policy setdefault -c cluster -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"
esxcli vsan policy setdefault -c vdisk -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"
esxcli vsan policy setdefault -c vmnamespace -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"
esxcli vsan policy setdefault -c vmswap -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"
esxcli vsan policy setdefault -c vmem -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"

esxcli vsan policy getdefault      # check that the changes made are active

This is when I created the vSAN cluster comprised of one node:

esxcli vsan cluster new
esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-21T10:23:14Z
Local Node UUID: 51a90242-c628-b3bc-4f8d-6805ca180c29
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 51a90242-c628-b3bc-4f8d-6805ca180c29
Sub-Cluster Backup UUID:
Sub-Cluster UUID: 52b2e982-fd0f-bc1a-46a0-2159f081c93d
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member UUIDs: 51a90242-c628-b3bc-4f8d-6805ca180c29
Sub-Cluster Membership UUID: 34430d55-4b18-888a-00a7-74d02b27faf8

I was good to add the disks to a diskgroup now, remember that in every diskgroup there is 1 SSD and one or more HDD:

[root@esxi:~] esxcli vsan storage add -s t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____ -d t10.ATA_____ST31000524AS________________________________________5VPDP87L

[root@esxi:~] esxcli vsan storage add -s t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____ -d t10.ATA_____ST31000524AS________________________________________5VPDP8N3

[root@esxi:~] esxcli vsan storage add -s t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____ -d t10.ATA_____ST31000524AS________________________________________9VPC5AQ9

I had no errors, so I checked the vSAN storage to see what was composed of:

esxcli vsan storage list
t10.ATA_____ST31000524AS________________________________________5VPDP87L
Device: t10.ATA_____ST31000524AS________________________________________5VPDP87L
Display Name: t10.ATA_____ST31000524AS________________________________________5VPDP87L
Is SSD: false
VSAN UUID: 527ae2ad-7572-3bf7-4d57-546789dd7703
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Used by this host: true
In CMMDS: true
Checksum: 2442595905156199819
Checksum OK: true
Emulated DIX/DIF Enabled: false

t10.ATA_____ST31000524AS________________________________________9VPC5AQ9
Device: t10.ATA_____ST31000524AS________________________________________9VPC5AQ9
Display Name: t10.ATA_____ST31000524AS________________________________________9VPC5AQ9
Is SSD: false
VSAN UUID: 52e06341-1491-13ea-4816-c6e6338316dc
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Used by this host: true
In CMMDS: true
Checksum: 1139180948185469177
Checksum OK: true
Emulated DIX/DIF Enabled: false

t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Device: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Display Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Is SSD: true
VSAN UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Used by this host: true
In CMMDS: true
Checksum: 10619796523455951412
Checksum OK: true
Emulated DIX/DIF Enabled: false

t10.ATA_____ST31000524AS________________________________________5VPDP8N3
Device: t10.ATA_____ST31000524AS________________________________________5VPDP8N3
Display Name: t10.ATA_____ST31000524AS________________________________________5VPDP8N3
Is SSD: false
VSAN UUID: 52f501d7-ac52-ffa4-a45b-5c33d62039a1
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Used by this host: true
In CMMDS: true
Checksum: 7613613771702318357
Checksum OK: true
Emulated DIX/DIF Enabled: false

At this point I could see my “vsanDatastore” in the vSphere Client. (I had no vCenter yet)

The next step will be to deploy vCenter on this datastore; I will be using VCSA and I will show you how to do it with a Mac.

How to revert new Transparent Page Sharing behaviour

There have been a lot of blog posts around warning us that starting from vSphere 5.5 Update2d (build 2403361) TPS will be basically turned off.

Besides comments about the why and how to partially re-enable inter-VM TPS I just think in most cases it might make sense to restore the original behaviour; I have been in the process of upgrading an infrastructure where the RAM contrain represented a much bigger concearn than the specific security reason why TPS has been turned off so I decided to take a look at the shared memory metrics in vCenter just to find out that post-upgrade situation wouldn’t be exactly heavenly if all the shared memory pages would be translated into unique pages.

I found this arcicle KB2097593 where it explains the various status of the settings you can apply and one of them basically disables the new behaviour restoring the old fashion TPS:

Featured image

In other words setting the value “ShareForceSalting” to 0 will bring things back in time.

The complete procedure is as follows:

1. Log in to ESX (i)/vCenter with the VI-Client.
2. Select ESX (i) relevant host.
3. In the Configuration tab, click Advanced Settings (link) under the software section.
4. In the Advanced Settings window, click Mem.
5. Search for Mem.ShareForceSalting and set the value to 0.
6. Click OK.
7. Reboot host

vSphere 6 Certificate Lifecycle Management

Recently I’ve been fighting with a vSphere environment and CA certificates and I thought a lot about certificate management and lifecycle in a VMware vSphere environment after that and how much it needs improvement. With the SSL Certificate Automation Tool VMware made a step in the right direction and even if the tools itself is sometimes a little buggy it is still very handy in automating a long and error prone process. In vSphere 6 VMware is taking another step in the right direction to help us create, apply and manage SSL certificates in a vSphere environment, but before talking about this we need to talk a bit about what’s new in SSO and vCenter architecture in vSphere 6. Since the introduction of SSO VMware changed its architecture in every major release, starting from 5.1 to 5.5 and now to 6.0 so let’s make a little bit of history:

Featured image

The new vSphere 6 management architecture introduces two main roles that you can deploy, these are the Management Node and the Platform Service Controller (PSC); the reason behind this separation is to have a logical entity that will take care of the main management features while another entity will hold the core and security features of the solution. What is nice about this separation is that you don’t need a 1:1 ratio between Management Nodes and PSCs so you can install PSC on separate boxes and replicate between them and then have as many Management nodes as you need (as long as you are within the same SSO domain)

Featured image

For HA scenario if you install PSC on separate boxes you will still need a load balancer. Supported solutions are Big-IP F5 and NetScaler so far.

You can obviously still install everything in one box:

Featured image

You might have noticed that the HA model for SSO was active/passive in 5.1, then active/active in 5.5 and now is active/passive again; this is due to the re-engineering of the Secure Token Service (STS) which is moving to a new and more robust method of STS (known as WebSSO) which is the same already used by vCAC (or vRealize Automation if you will) and that will be used from now forward instead of the old 5.5 method (WS-Trust). Let’s see how services are spread out on each role:

Let’s take a look to the services within the Management Node and the PSC:

Featured image

In the Management Node we can find services and features that every vSphere Admin feels very comfortable and familiar with such as vCenter Server, vSphere Web Client, Syslog Collector, etc., but two of them deserve a few words:

  • Virtual Datacenter Service: this service is new and it has been introduced to help mitigate the limitation connected with the Datacenter object in vCenter as a Management boundary.
  • (Optional) vPostgres: This component is obviously referring to the vCenter Appliance (thus optional) but I believe more and more new deployments or upgrades deserve to be considered a good fit for vCSA since VMware announced complete equality of features between vCenter installed on Windows and vCSA; leave alone the fuss of dedicating Windows licenses for vCenter which might not be a huge problem I just find the process of patching ad upgrading a vCSA simply amazing and it’s not a secret that products like EVO:RAIL make extensive use of vCSA. VMware wants to move all their services deployment model towards Virtual Appliances, this is not a secret and we need to get used to it, the sooner the better, but I’m digressing…

Featured image

In the Platform Service Controller or PSC we find our old friend SSO (we have had a rough past but now we are on better terms) and quite a few new services:

  • VMware Single Sign-On
    • Secure Token Service (STS)
    • Identity Management Service (IdM)
    • Directory Service (VMDir)
  • VMware Certificate Authority (VMCA)
  • VMware Endpoint Certificate Store (VECS)
  • VMware Licensing Service
  • Authentication Framework Daemon (AFD)
  • Component Manager Service (CM)
  • HTTP Reverse Proxy

Describing all these services is out of the scope of this post but as you probably guess two of them will be our focus: the VMware Certificate Authority (or VMCA) and the VMware Endpoint Certificate Store (or VECS). But what are the roles of VMCA and VECS? The VMCA is no more or less than a CA, so you can:

  • Generate Certificates
  • Generate CRLs
  • Use the UI
  • Use the Command Line Interface to replace certificates

The VECS is where all certificates within the PSC are stored, with the only exception of the ESXi certificates that are stored locally on vSphere hosts, so here you can:

  • Store certificates and keys
  • Sync trusted certificates
  • Sync CRLs
  • Use the UI
  • Use the CLI to perform various actions

Since VMCA and VECS are part of the PSC, they will take advantage of the Multi-Master Replication Model which is offered by the Directory Service (VMDir) in order to achieve HA. In the past every service had its own user and required its own certificate but this is not the case anymore since we now have Solution Users (SU); since the number of services has increased significantly it would be impractical to manage the lifecycle of this many certificates so now we have 4 main SU that will hold the certificate used for a number of services.

Voila_Capture 2015-01-08_08-14-37_pm_white_background

What about use cases/scenarios in which I can implement VMCA? In what ways you can use this new tool?

Featured image

Scenario 1 and 2 are similar: the VMCA is the CA that releases certificates for all Solution Users (SU), the only difference is that in scenario 1 the VMCA is the root CA and you will need to distribute the Root CA Certificate so that all corporate browsers will trust it, while in scenario 2 the VMCA becomes part of an existing PKI as a subordinate CA and you certificate trust.

Featured image

In scenario 3 VMCA is installed but not used, CSRs are created and submitted to an external CA and VECS will be used to store certificates in PEM format.

Featured image

My favorite is scenario 2 because most enterprises I see already have a PKI (Microsoft CA usually) and all clients already trust the CA certificates, so adding the VMCA as s subordinate is a non disruptive process with a very low maintenance impact on the PKI itself, it protects investments already made to implement the current PKI and  preserves the knowledge to run the PKI.

Replacing certificates is still a CLI task (looks like Powershell will be involved) but VMCA and VECS are a very promising step toward the right direction for simplifying certificate lifecycle management in a vSphere environment.

ESXi 5.5 Update 2 build 2143827 break Citrix NetScaler

If you happen to run NetScaler and upgrade to build 2143827 of vSphere 5.5 you will run in very nasty issues.

Specifically, your NetScaler will go bonkers if you enter in the management web interface or even via SSH; in general all things that start a network increase in term of traffic directed to the NetScaler will make it crash.

When it doesn’t crash you will still experience network communication drops, hangs, connectivity issues and such and in a very random fashion too (just to add some fun).

Fortunately VMware gave us a workaround.

To work around this issue, add the line hw.em.txd=512 in loader.conf file.

To add the line hw.em.txd=512 in loader.conf file:

    1. Log in to the Citrix NetScaler virtual machine appliance as root
    2. Locate the loader.conf file on the Netscaler virtual machine appliance by running this command:find / -name loader.conf

Note: There are two loader.conf files. The two files are ./flash/boot/defaults/loader.conf and ./flash/boot/loader.conf.
Modify the first one and ensure to take a backup of the file prior to making the changes.

  1. Add this line in the loader.conf filehw.em.txd=512
  2. Save the changes
  3. Restart the NetScaler virtual machine appliance

For deeper details about the issue read KB VMware KB 2092809

Powershell Script for Shutting Down your vSphere Environment

Every now and then I need to setup an environment so that if a power outage occurs out of business hours there is a sort of automation taking care of that gracefully shutting down all VMs and Hosts to prevent failures.

For some reason I can never use older scripts I already have because whether they are too old and need some rewriting or because they don’t take into account some aspect of the specific environment I’m working on; one way or another it always feels like starting over.

While I was about to write a new one I stumbled across this blog post by Mike Preston and I was happy to find a very straightforward, simple and yet very complete script which would cover most of your (and mine too) needs.

Here are the aspects I was positively impressed about:

– it keeps into account HA
– it keeps into account DRS and stops it from moving things around while shutting down everything (nice)
– it allows you to specify if vCenter is virtual and it takes care of it for last, including the host where vCenter resides (nice!!)
– it keeps into account that VMs might just not shutdown gracefully for some reason; this is a major reason why most scripts would fail (see VDI streamed VMs)
– it writes events in a log file which is nice to go and look at in case you want to know what happened once you’re back online
– it dumps to file the list of powered on VMs at the moment of shutdown so you can manually turn everything back on as it was or take advantage of another nice script wrote by Mike called “poweronvms”

Ever if this script is definitely one of the most complete it is still lacking some of the requirements I needed, so with Mike permission (thanks Mike!) I added some things:

– added check for loading VMware PowerCLI Snap-In
– support for vApps
– removed annoying error messages when variables are = null
– added support for encrypted password dumped on a file (I don’t like password in clear text in scripts)
– added date-time in the log file for each event
– minor cosmetic in the output
– tested with vsphere 5.5 U1

Before we start using the script we need to create a password file so that passwords are not stored in clear text in the script itself:

mkdir c:\shutdown
Read-Host -Prompt "Enter password" -AsSecureString | ConvertFrom-SecureString | out-file c:\shutdown\cred.txt
(type in the password of the user you want to use for running the script)
cat c:\shutdown\cred.txt

The output should be something like this:

01000000d08c9ddf0115d1118c7a00c04fc297eb0100000092295625f5c35b4bb8af99be46e6679d0000000002000000000003660000c0000000100
000007b8d21c33686ef6ec751c0f671e7ff510000000004800000a00000001000000022d4534785ecda5799047d48f0f79a4618000000a1376ef3bf
fa5a3a928bee66d68eab5d49b12308f9d0b63f140000002ca923ea8eb54ff11c5864d9f722931a4ec85597

This is a hash of the password calculated using the current Windows credentials and it’s also connected with the Windows machine you run the command from, so this password file is not portable and can be used only but the user who generated it.

The script is already set up to use such password file, just edit the variables in the top of the script accordingly.

Here it is:

#################################################################################
# Power off VMs (poweroffvms.ps1)
#
# This script does have a 'partner' script that powers the VMs back on, you can
# grab that script at http://blog.mwpreston.net/shares/
#
# Created By: Mike Preston, 2012 - With a whole lot of help from Eric Wright
#                                  (@discoposse)
#
# Variables:  $vcenter - The IP/DNS of your vCenter Server
#             $username/password - credentials for your vCenter Server
#             $filename - path to csv file to store powered on vms
#			  used for the poweronvms.ps1 script.
#             $cluster - Name of specific cluster to target within vCenter
#             $datacenter - Name of specific datacenter to target within vCenter
#
#################################################################################

if ( (Get-PSSnapin -Name VMware.VimAutomation.Core -ErrorAction SilentlyContinue) -eq $null )
{
	Add-PsSnapin VMware.VimAutomation.Core
}

# Some variables
$vcenter = "<vcenter_fqdn>"
$username = "<vcenter_username>"
$password = get-content C:\shutdown\cred.txt | convertto-securestring
$credentials = new-object System.Management.Automation.PSCredential $username, $password
$cluster = "<cluster_name>"
$datacenter = "<datacenter_name>"
$filename = "c:\shutdown\poweredonvms.csv"
$time = ( get-date ).ToString('HH-mm-ss')
$date = ( get-date ).ToString('dd-MM-yyyy')
$logfile = New-Item -type file "C:\shutdown\ShutdownLog-$date-$time.txt" -Force

# For use with a virtualized vCenter - bring it down last 🙂 - if you don't need this functionality
# simply leave the variable set to NA and the script will work as it always had
$vCenterVMName = "NA"

Write-Host ""
Write-Host "Shutdown command has been sent to the vCenter Server." -Foregroundcolor yellow
Write-Host "This script will shutdown all of the VMs and hosts located in $datacenter." -Foregroundcolor yellow
Write-Host "Upon completion, this server ($vcenter) will also be shutdown gracefully." -Foregroundcolor yellow
Write-Host ""
Sleep 5

Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) PowerOff Script Engaged"
Add-Content $logfile ""

# Connect to vCenter
Write-Host "Connecting to vCenter - $vcenter.... " -nonewline
Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Connecting to vCenter - $vcenter"
$success = Connect-VIServer $vcenter -Credential $credentials -WarningAction:SilentlyContinue
if ($success) { Write-Host "Connected!" -Foregroundcolor Green }
else
{
    Write-Host "Something is wrong, Aborting script" -Foregroundcolor Red
    Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Something is wrong, Aborting script"
    exit
}
Write-Host ""
Add-Content $logfile  ""

# Turn Off vApps
Write-Host "Stopping VApps...." -Foregroundcolor Green
Write-Host ""
Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Stopping VApps..."
Add-Content $logfile ""

$vapps = Get-VApp | Where { $_.Status -eq "Started" }

if ($vapps -ne $null)
{
	ForEach ($vapp in $vapps)
	{
			Write-Host "Processing $vapp.... " -ForegroundColor Green
			Write-Host ""
			Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Stopping $vapp."
			Add-Content $logfile ""
			Stop-VApp -VApp $vapp -Confirm:$false | out-null
			Write-Host "$vapp stopped." -Foregroundcolor Green
			Write-Host ""
	}
}

Write-Host "VApps stopped." -Foregroundcolor Green
Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) VApps stopped."
Add-Content $logfile ""

# Get a list of all powered on VMs - used for powering back on....
Get-VM -Location $cluster | where-object {$_.PowerState -eq "PoweredOn" } | Select Name | Export-CSV $filename

# Change DRS Automation level to partially automated...
Write-Host "Changing cluster DRS Automation Level to Partially Automated" -Foregroundcolor green
Get-Cluster $cluster | Set-Cluster -DrsAutomation PartiallyAutomated -confirm:$false 

# Change the HA Level
Write-Host ""
Write-Host "Disabling HA on the cluster..." -Foregroundcolor green
Write-Host ""
Add-Content $logfile "Disabling HA on the cluster..."
Add-Content $logfile ""
Get-Cluster $cluster | Set-Cluster -HAEnabled:$false -confirm:$false 

# Get VMs again (we will do this again instead of parsing the file in case a VM was powered in the nanosecond that it took to get here.... 🙂
Write-Host ""
Write-Host "Retrieving a list of powered on guests...." -Foregroundcolor Green
Write-Host ""
Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Retrieving a list of powered on guests...."
Add-Content $logfile ""
$poweredonguests = Get-VM -Location $cluster | where-object {$_.PowerState -eq "PoweredOn" }

Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Checking to see if vCenter is virtualized"

# Retrieve host info for vCenter
if ($vcenterVMName -ne "NA")
{
    Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) vCenter is indeed virtualized, getting ESXi host hosting vCenter Server"
    $vCenterHost = (Get-VM $vCenterVMName).Host.Name
    Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) $vCenterVMName currently running on $vCenterHost - will process this last"
}
else
{
    Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) vCenter is not virtualized"
}

Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Proceeding with VM PowerOff"
Add-Content $logfile ""

# And now, let's start powering off some guests....
ForEach ( $guest in $poweredonguests )
{
    if ($guest.Name -ne $vCenterVMName)
    {
        Write-Host "Processing $guest.... " -ForegroundColor Green
        Write-Host "Checking for VMware tools install" -Foregroundcolor Green
        $guestinfo = get-view -Id $guest.ID
        if ($guestinfo.config.Tools.ToolsVersion -eq 0)
        {
            Write-Host "No VMware tools detected in $guest , hard power this one" -ForegroundColor Yellow
            Write-Host ""
            Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) $guest - no VMware tools, hard power off"
            Stop-VM $guest -confirm:$false | out-null
        }
        else
        {
           write-host "VMware tools detected.  I will attempt to gracefully shutdown $guest"
           Write-Host ""
           Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) $guest - VMware tools installed, gracefull shutdown"
           $vmshutdown = $guest | shutdown-VMGuest -Confirm:$false | out-null
        }
    }
}

# Let's wait a minute or so for shutdowns to complete
Write-Host ""
Write-Host "Giving VMs 2 minutes before resulting in hard poweroff"
Write-Host ""
Add-Content $logfile ""
Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Waiting a couple minutes then hard powering off all remaining VMs"
Sleep 120

# Now, let's go back through again to see if anything is still powered on and shut it down if it is
Write-Host "Beginning Phase 2 - anything left on.... night night...." -ForegroundColor red
Write-Host ""

# Get our list of guests still powered on...

$poweredonguests = Get-VM -Location $cluster | where-object {$_.PowerState -eq "PoweredOn" }

if ($poweredonguests -ne $null)
{
	ForEach ( $guest in $poweredonguests )
	{
		if ($guest.Name -ne $vCenterVMName)
		{
			Write-Host "Processing $guest ...." -ForegroundColor Green
			#no checking for toosl, we just need to blast it down...
			write-host "Shutting down $guest - I don't care, it just needs to be off..." -ForegroundColor Yellow
                        Write-Host ""
			Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) $guest - Hard Power Off"
			Stop-VM $guest -confirm:$false | out-null
		}
	}
}

# Wait 30 seconds
Write-Host "Waiting 30 seconds and then proceding with host power off"
Write-Host ""
Add-Content $logfile ""
Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Processing power off of all hosts now"
Sleep 30

# and now its time to slam down the hosts - I've chosen to go by datacenter here but you could put the cluster
# There are some standalone hosts in the datacenter that I would also like to shutdown, those vms are set to
# start and stop with the host, so i can just shut those hosts down and they will take care of the vm shutdown 😉

$esxhosts = Get-VMHost -Location $cluster
foreach ($esxhost in $esxhosts)
{
    if ($esxhost.Name -ne $vCenterHost)
    {
        #Shutem all down
        Write-Host "Shutting down $esxhost" -ForegroundColor Green
        Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Shutting down $esxhost"
        $esxhost | Foreach {Get-View $_.ID} | Foreach {$_.ShutdownHost_Task($TRUE)}
    }
}

# Finally shut down the host which vCenter resides on if using that functionality
if ($vcenterVMName -ne "NA")
{
    Add-Content $logfile ""
    Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) Processing $vcenterHost - shutting down "
    Get-VMhost $vCenterHost | Foreach {Get-View $_.ID} | Foreach {$_.ShutdownHost_Task($TRUE)} | out-null
	Write-Host "Shutdown Complete." -ForegroundColor Green
}

Add-Content $logfile ""
Add-Content $logfile "$(get-date -f dd/MM/yyyy) $(get-date -f HH:mm:ss) All done!"
# That's a wrap

Just to say something obvious, test!

I used vCenter Simulator 2.0 for initial testing and once happy I used a real environment.

Now go and turn off all the vSphere environments you can! (just kidding)

(Seriously, I’m just kidding, don’t do it! Really! Just don’t)

%d bloggers like this: