Running a Home Lab on a Single vSAN Node

This is how I managed to run my lab on a single vSAN node and manage it completely Windows-free, which is always a goal for a Mac user like me. With vSphere 6 this is a lot easier than it used to be, thanks to improvements in the Web Client (and the fact that the fat client doesn’t connect to vCenter anymore) and to the new VCSA, which comes with deployment tools for Mac.
On the storage side, I have always run my lab on some kind of virtual storage appliance (Nexenta, Atlantis, Datacore), but those require a lot of memory and processing power, which reduces the number of VMs I can run simultaneously.
It’s true that an appliance gives me storage acceleration (which is so important in a home lab), but I sacrifice consolidation ratio and add complexity that I have to take into account during upgrades and maintenance. So I decided to change my approach and make my physical lab part of the process of learning vSAN.
If all goes as I would like, I will get storage performance without sacrificing too many resources for it, and this would be awesome.
Here is my current hardware setup in terms of disks:

1 Samsung SSD 840 PRO Series
1 Samsung SSD 830
3 Seagate Barracuda ST31000524AS 1TB 7200 RPM 32MB Cache SATA 6.0Gb/s 3.5″

I also have a spare ST31000524AS that I might add later, but that would require an additional disk controller.
Speaking of which, my current controller (C602 AHCI – Patsburg) is not in the vSAN HCL, and its queue depth is listed as a pretty depressing 31 (per port). I am still just running a lab, though, and I don’t really need production-grade performance numbers. Nevertheless, I have been looking around on eBay, and it seems that about €100 will get me a supported disk controller. I decided to wait a few weeks until VMware updates the HCL, because I don’t want to buy something that won’t be on the vSphere 6/vSAN 6 HCL; besides, I might still get the performance I need with my current setup, or at least this is what I hope.

UPDATE: The controller I was keeping an eye on doesn’t seem to be listed in the HCL for vSAN 6, even now that the HCL is reported to be updated, so be careful with your lab purchases!

For the time being I will test this environment on my current disk controller and learn how to troubleshoot performance bottlenecks in vSAN, which is going to be a great exercise anyway.

The first thing to do in my case was to decommission the current disks. Once I had deleted the VSA that was using them as RDMs, I needed to make sure that the disks had no partitions left on them, since leftover partitions cause problems when claiming the disks during the vSAN setup. So I accessed my ESXi host via SSH and started playing around with the command line:

esxcli storage core device list      # list block storage devices

This gave me the list of devices that I could use with vSAN (only one disk is shown here):

t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
   Display Name: Local ATA Disk (t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____)
   Has Settable Display Name: true
   Size: 244198
   Device Type: Direct-Access 
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
   Vendor: ATA     
   Model: Samsung SSD 840 
   Revision: DXM0
   SCSI Level: 5
   Is Pseudo: false
   Status: on
   Is RDM Capable: false
   Is Local: true
   Is Removable: false
   Is SSD: true
   Is VVOL PE: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters: 
   VAAI Status: unknown
   Other UIDs: vml.0100000000533132524e45414342303639373142202020202053616d73756e
   Is Shared Clusterwide: false
   Is Local SAS Device: false
   Is SAS: false
   Is USB: false
   Is Boot USB Device: false
   Is Boot Device: false
   Device Max Queue Depth: 31
   No of outstanding IOs with competing worlds: 32
   Drive Type: unknown
   RAID Level: unknown
   Number of Physical Drives: unknown
   Protection Enabled: false
   PI Activated: false
   PI Type: 0
   PI Protection Mask: NO PROTECTION
   Supported Guard Types: NO GUARD SUPPORT
   DIX Enabled: false
   DIX Guard Type: NO GUARD SUPPORT
   Emulated DIX/DIF Enabled: false

This is useful to identify which devices are SSDs, the device names and their device paths. Here’s a recap of the useful information in my environment:

/vmfs/devices/disks/t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
/vmfs/devices/disks/t10.ATA_____SAMSUNG_SSD_830_Series__________________S0VYNYABC03672______
/vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L
/vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP8N3
/vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________9VPC5AQ9

The Samsung 840 Pro will give me much better performance in a vSAN diskgroup so I will put aside the 830 for now.
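
The same recap can be pulled out of the device list with a small script instead of reading it by eye. What follows is only a sketch: it assumes the output of "esxcli storage core device list" has been saved as text, the function names are made up for illustration, and the sample is abbreviated to fields that appear in this post (real output has many more).

```python
# Abbreviated sample of `esxcli storage core device list` output.
# Only fields that appear in this post are kept; this is not complete output.
SAMPLE = """\
t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
   Display Name: Local ATA Disk (t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____)
   Devfs Path: /vmfs/devices/disks/t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
   Is SSD: true
   Device Max Queue Depth: 31

t10.ATA_____ST31000524AS________________________________________5VPDP87L
   Devfs Path: /vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L
   Is SSD: false
"""


def parse_device_list(text):
    """Turn the device list text into one dict per device (hypothetical helper)."""
    devices = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new device and holds its identifier.
            current = {"Device": line.strip()}
            devices.append(current)
        elif current is not None:
            # Indented lines are "Key: Value" pairs; split on the first colon only.
            key, sep, value = line.strip().partition(":")
            if sep:
                current[key.strip()] = value.strip()
    return devices


def recap(devices):
    """Return (devfs path, is_ssd, max queue depth or None) for each device."""
    rows = []
    for dev in devices:
        depth = dev.get("Device Max Queue Depth")
        rows.append((dev.get("Devfs Path"),
                     dev.get("Is SSD") == "true",
                     int(depth) if depth is not None else None))
    return rows


if __name__ == "__main__":
    for path, is_ssd, depth in recap(parse_device_list(SAMPLE)):
        print("SSD" if is_ssd else "HDD", depth, path)
```

Fields missing from a device block simply come back as None, so the script does not break if a different ESXi build prints a slightly different set of lines.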

Next, for each disk I checked for partitions and removed any that I found; here are the commands I ran against one disk as an example:

~ # partedUtil getptbl /vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L
gpt
121601 255 63 1953525168
1 34 262177 E3C9E3160B5C4DB8817DF92DF00215AE microsoftRsvd 0
2 264192 1953519615 5085BD5BA7744D76A916638748803704 unknown 0

~ # partedUtil delete /vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L 2

~ # partedUtil delete /vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L 1

~ # partedUtil getptbl /vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L
gpt
121601 255 63 1953525168

partedUtil is used to manage partitions: “getptbl” shows the partition table (two partitions in this case) and “delete” removes a partition. Note that at the end of each delete command I needed to specify the number of the partition I wanted to remove.
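
With several disks to clean, typing the delete commands by hand gets tedious. As a rough sketch (not part of the original procedure), the script below reads saved "partedUtil getptbl" output and prints the matching "partedUtil delete" commands, highest partition number first as in the session above. It only prints the commands so they can be reviewed before anything destructive is run; the function names are illustrative, and the layout assumption (label type on the first line, geometry on the second, one partition per following line) is taken from the output shown in this post.

```python
def partition_numbers(getptbl_output):
    """Return the partition numbers listed by `partedUtil getptbl`.

    Assumes the first non-empty line is the label type (e.g. gpt), the second
    is the disk geometry, and each following line starts with a partition number.
    """
    lines = [line for line in getptbl_output.splitlines() if line.strip()]
    return [int(line.split()[0]) for line in lines[2:]]


def delete_commands(device_path, getptbl_output):
    """Build one `partedUtil delete` command per partition, highest number first."""
    numbers = sorted(partition_numbers(getptbl_output), reverse=True)
    return ["partedUtil delete {} {}".format(device_path, n) for n in numbers]


DISK = "/vmfs/devices/disks/t10.ATA_____ST31000524AS________________________________________5VPDP87L"

# Partition table before cleaning, copied from the session above.
BEFORE = """\
gpt
121601 255 63 1953525168
1 34 262177 E3C9E3160B5C4DB8817DF92DF00215AE microsoftRsvd 0
2 264192 1953519615 5085BD5BA7744D76A916638748803704 unknown 0
"""

# Partition table after both partitions were deleted.
AFTER = """\
gpt
121601 255 63 1953525168
"""

if __name__ == "__main__":
    # Print the commands for review; nothing is executed here.
    for command in delete_commands(DISK, BEFORE):
        print(command)
```

Running the same check against the output of a cleaned disk yields no commands, which is a quick way to confirm that a disk is ready to be claimed.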

At that point, with all the disks ready, I needed to change the default vSAN policy, because otherwise I wouldn’t be able to satisfy the three-node requirement; so I set “hostFailuresToTolerate” to 0 and enabled the “ForceProvisioning” setting.
Considering that at some point vSAN will need to destage writes from SSD to HDD, I also decided to set StripeWidth to “3” so I could take advantage of all three of my HDDs when I/O involves the magnetic disks.
Please note that this is probably a good idea in a lab, while in a production environment you will need good reasons for it, since VMware encourages customers to leave the default value of “1”. The problems come into play when you size your environment (be careful about the number of components, even though vSAN 6 raised the per-host limit from 3000 to 9000). In general, you should read the “VMware Virtual SAN 6.0 Design and Sizing Guide” (http://goo.gl/BePpyI) before making any architectural decision.

To change the vSAN default policy and create the cluster, I made very minor changes to the steps William Lam described for vSAN 1.0:

esxcli vsan policy getdefault      # display the current settings

esxcli vsan policy setdefault -c cluster -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"
esxcli vsan policy setdefault -c vdisk -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"
esxcli vsan policy setdefault -c vmnamespace -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"
esxcli vsan policy setdefault -c vmswap -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"
esxcli vsan policy setdefault -c vmem -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1) (\"stripeWidth\" i3))"

esxcli vsan policy getdefault      # check that the changes made are active
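
The five setdefault lines differ only in the policy class, so they can be generated rather than copied and edited by hand. The following is a minimal sketch with made-up helper names; the policy classes, setting names and values are exactly the ones used above, and the escaped quotes are reproduced so that the printed lines can be pasted into the ESXi shell as they are.

```python
# Policy classes and settings taken from the commands shown above.
POLICY_CLASSES = ["cluster", "vdisk", "vmnamespace", "vmswap", "vmem"]
SETTINGS = [("hostFailuresToTolerate", 0),
            ("forceProvisioning", 1),
            ("stripeWidth", 3)]


def policy_expression(settings):
    """Build the policy expression, e.g. ((\\"name\\" i0) (\\"other\\" i1)).

    Each value is written as i<integer>; the double quotes around each name
    are backslash-escaped because the whole expression is passed to the shell
    inside double quotes.
    """
    parts = ['(\\"{}\\" i{})'.format(name, value) for name, value in settings]
    return "(" + " ".join(parts) + ")"


def setdefault_commands(classes, settings):
    """Return one `esxcli vsan policy setdefault` command per policy class."""
    expression = policy_expression(settings)
    return ['esxcli vsan policy setdefault -c {} -p "{}"'.format(cls, expression)
            for cls in classes]


if __name__ == "__main__":
    for command in setdefault_commands(POLICY_CLASSES, SETTINGS):
        print(command)
```

Keeping the settings in one list makes it harder to end up with policy classes that silently disagree, for example a different stripe width on vmswap than on vdisk.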

This is when I created the vSAN cluster comprised of one node:

esxcli vsan cluster new
esxcli vsan cluster get
Cluster Information
Enabled: true
Current Local Time: 2015-03-21T10:23:14Z
Local Node UUID: 51a90242-c628-b3bc-4f8d-6805ca180c29
Local Node State: MASTER
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 51a90242-c628-b3bc-4f8d-6805ca180c29
Sub-Cluster Backup UUID:
Sub-Cluster UUID: 52b2e982-fd0f-bc1a-46a0-2159f081c93d
Sub-Cluster Membership Entry Revision: 0
Sub-Cluster Member UUIDs: 51a90242-c628-b3bc-4f8d-6805ca180c29
Sub-Cluster Membership UUID: 34430d55-4b18-888a-00a7-74d02b27faf8

Now I was ready to add the disks to a diskgroup; remember that every diskgroup has exactly one SSD and one or more HDDs:

[root@esxi:~] esxcli vsan storage add -s t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____ -d t10.ATA_____ST31000524AS________________________________________5VPDP87L

[root@esxi:~] esxcli vsan storage add -s t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____ -d t10.ATA_____ST31000524AS________________________________________5VPDP8N3

[root@esxi:~] esxcli vsan storage add -s t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____ -d t10.ATA_____ST31000524AS________________________________________9VPC5AQ9

I got no errors, so I checked the vSAN storage to see what it was composed of:

esxcli vsan storage list
t10.ATA_____ST31000524AS________________________________________5VPDP87L
Device: t10.ATA_____ST31000524AS________________________________________5VPDP87L
Display Name: t10.ATA_____ST31000524AS________________________________________5VPDP87L
Is SSD: false
VSAN UUID: 527ae2ad-7572-3bf7-4d57-546789dd7703
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Used by this host: true
In CMMDS: true
Checksum: 2442595905156199819
Checksum OK: true
Emulated DIX/DIF Enabled: false

t10.ATA_____ST31000524AS________________________________________9VPC5AQ9
Device: t10.ATA_____ST31000524AS________________________________________9VPC5AQ9
Display Name: t10.ATA_____ST31000524AS________________________________________9VPC5AQ9
Is SSD: false
VSAN UUID: 52e06341-1491-13ea-4816-c6e6338316dc
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Used by this host: true
In CMMDS: true
Checksum: 1139180948185469177
Checksum OK: true
Emulated DIX/DIF Enabled: false

t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Device: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Display Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Is SSD: true
VSAN UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Used by this host: true
In CMMDS: true
Checksum: 10619796523455951412
Checksum OK: true
Emulated DIX/DIF Enabled: false

t10.ATA_____ST31000524AS________________________________________5VPDP8N3
Device: t10.ATA_____ST31000524AS________________________________________5VPDP8N3
Display Name: t10.ATA_____ST31000524AS________________________________________5VPDP8N3
Is SSD: false
VSAN UUID: 52f501d7-ac52-ffa4-a45b-5c33d62039a1
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
VSAN Disk Group Name: t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Used by this host: true
In CMMDS: true
Checksum: 7613613771702318357
Checksum OK: true
Emulated DIX/DIF Enabled: false
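
A quick way to confirm from this listing that the diskgroup has the expected shape (one SSD plus one or more HDDs) is to group the disks by their “VSAN Disk Group UUID”. The script below is a sketch under a few assumptions: the output of "esxcli vsan storage list" has been saved as text, disk blocks are separated by blank lines with the device name on the first line of each block, and the helper names are invented for illustration. The sample is abbreviated to the three fields the check needs.

```python
# Abbreviated `esxcli vsan storage list` output: only the fields used below.
SAMPLE = """\
t10.ATA_____ST31000524AS________________________________________5VPDP87L
Is SSD: false
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc

t10.ATA_____ST31000524AS________________________________________9VPC5AQ9
Is SSD: false
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc

t10.ATA_____Samsung_SSD_840_PRO_Series______________S12RNEACB06971B_____
Is SSD: true
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc

t10.ATA_____ST31000524AS________________________________________5VPDP8N3
Is SSD: false
VSAN Disk Group UUID: 52e56e97-d27b-6d9b-d1fe-c73da8082ccc
"""


def parse_storage_list(text):
    """Return one dict per disk; blank lines separate the disk blocks."""
    disks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None          # a blank line ends the current block
            continue
        if current is None:
            current = {"Name": line}  # first line of a block is the device name
            disks.append(current)
        else:
            key, sep, value = line.partition(":")
            if sep:
                current[key.strip()] = value.strip()
    return disks


def group_by_disk_group(disks):
    """Map each disk group UUID to its SSD and HDD device names."""
    groups = {}
    for disk in disks:
        members = groups.setdefault(disk["VSAN Disk Group UUID"],
                                    {"ssd": [], "hdd": []})
        kind = "ssd" if disk["Is SSD"] == "true" else "hdd"
        members[kind].append(disk["Name"])
    return groups


def check_groups(groups):
    """Return a list of problems; an empty list means every group looks sane."""
    problems = []
    for uuid, members in groups.items():
        if len(members["ssd"]) != 1:
            problems.append("{}: expected 1 SSD, found {}".format(uuid, len(members["ssd"])))
        if not members["hdd"]:
            problems.append("{}: no HDD in the group".format(uuid))
    return problems


if __name__ == "__main__":
    groups = group_by_disk_group(parse_storage_list(SAMPLE))
    for uuid, members in groups.items():
        print(uuid, "SSD:", len(members["ssd"]), "HDD:", len(members["hdd"]))
    print(check_groups(groups) or "diskgroup layout looks fine")
```

In the listing above the SSD’s own VSAN UUID is the same as the diskgroup UUID, which is why all four disks end up in a single group.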

At this point I could see my “vsanDatastore” in the vSphere Client (I had no vCenter yet).

The next step will be to deploy vCenter on this datastore; I will be using VCSA and I will show you how to do it with a Mac.

9 Responses to Running a Home Lab on a Single vSAN Node

  1. Pingback: MyVirtuaLife.Net

  2. molikop says:

    Thank you for sharing, I’m trying to setup a single node VSAN cluster for my home lab. Should I go with 1x240GB or 2x240GB or 1x480GB disk for the SSD layer. I will also have 3x1TB HDD.

    • andreacasini says:

      vSAN requires 1 SSD per disk pool.
      If you have 3 magnetic disks you probably want to create 1 disk pool, hence 1 SSD, so go for the biggest and fastest SSD you can find FOR WRITES because all SSDs are good for reads.

  3. molikop says:

    thank you for replying, so does it make any difference 1×240 vs 1×480? I’m guessing i’m going to be able to put more on the SSD tier with the bigger SSD but I want to make sure that it makes sense

    • andreacasini says:

      The SSD tier does not participate in the capacity of the vSAN datastore; it is used 70% as read cache and 30% as write buffer, so you don’t decide what to place or where.
      VMware best practice is to size the SSD as 10% of the magnetic tier, so in your case it needs to be at least 300GB, but the bigger the better because you can keep more in the read cache (instead of reading from magnetic disk) and have more space for writes when it comes to it.

  4. Tim Smith says:

    I’m running single node VSAN too at home. I’ve noticed no issues spinning up VMs, but when I goto migrate a VM from NAS storage to the VSAN datastore, I get an error talking about fault domains. Same when deploying from a template. But, can provision a new VM no problem. Have policy all setup correctly. Can you test and see if you can deploy from a template, or migrate from another datastore to VSAN? Thanks!

    • andreacasini says:

      Tim,
      I don’t experience the problems you have.
      It would be helpful to know exactly what error is coming up and what you mean by having policies all set up correctly.

      • Tim Smith says:

        Turned out to be a corrupted VSAN storage policy on vCenter. Had to recreate. Was interesting as it ignored some aspects of policy but not others. It thought I had fault domains, which obviously I did not.
