VMware Cloud Foundation 3.0


It has been a while since I have written about one of my favorite products, VMware Cloud Foundation, so I decided to provide an update on what is new in the most recent version of the software, vCF 3.0 (just announced at VMworld 2018).

There are some amazing new features, which include:

Builder VM: The VIA appliance is no longer used to image the system; instead, a brand new Builder VM is used to perform the bring-up process. This VM is a Photon-based OVA which includes all the binaries needed to complete the bring-up process.

Workload Domains with multiple clusters: It is now possible to create multiple clusters within a workload domain.

Physical Hardware changes: It is now possible to use ESXi hosts of different vendors and models in the same rack. It is also possible to use any switching infrastructure the customer chooses, rather than being bound by the vCF HCL. Note, however, that the customer's network team needs to set up the switches in this scenario (the switches will not be imaged and configured by vCF).

Software Versions: The new component software versions include the following:

  • vCenter / ESXi / PSC / vSAN 6.5 U2b
  • NSX 6.4.1
  • Log Insight 4.6.1
  • vRealize Operations 6.7
  • vRealize Automation 7.4
  • vRealize Suite Lifecycle Management 1.2

Network Pools: vCF 3.0 uses a pre-defined set of IP pools for the vSAN and vMotion VMkernel addresses, which SDDC Manager uses when configuring the ESXi hosts.

NSX Hybrid Connect: Using NSX Hybrid Connect, the process of migrating large workloads into a vCF workload domain has been simplified. This feature creates seamless connectivity between sites and allows the customer to migrate workloads from legacy environments, private and public clouds into a VMware Cloud Foundation environment.

These are all very cool new features which make a great product even better!

Dell EMC continues to provide an awesome, fully engineered rack-based system that is powered by vCF – check it out here:

https://www.dellemc.com/en-ie/converged-infrastructure/vxrack-system/vxrack-sddc.htm


Script to get VM information

Recently I wanted to get the following information for the VMs in my environment:

  1. VM Name
  2. IP Address
  3. DNS1
  4. DNS2

The script below helped me to get that information. It prompts for a resource pool name and for the location where you want the resulting CSV file to be saved.

The script uses the Invoke-VMScript cmdlet with a PowerShell script block to get the required information and then saves it neatly in a CSV file.

$rp = Read-Host -Prompt "What resource pool do you want to gather information for"
$csv = Read-Host -Prompt "Please give the full path to the folder where you want the csv file to be saved"

# Get all powered-on VMs in the chosen resource pool
$vms = Get-ResourcePool -Name $rp | Get-VM | Where-Object {$_.PowerState -eq 'PoweredOn'}

# Script that runs inside each guest OS: outputs "IPAddress|DNS1|DNS2"
$shownet = @'
$net = Get-WmiObject Win32_NetworkAdapterConfiguration
"{0}|{1}|{2}" -f @(($net | Where-Object {$_.IPAddress} | Select-Object -ExpandProperty IPAddress | Where-Object {$_ -notmatch ':'}),
    @($net | Where-Object {$_.DNSServerSearchOrder} | Select-Object -ExpandProperty DNSServerSearchOrder)[0],
    @($net | Where-Object {$_.DNSServerSearchOrder} | Select-Object -ExpandProperty DNSServerSearchOrder)[1])
'@

$report = foreach ($vm in $vms) {

    # Run the script block inside the guest and capture its output
    $result = Invoke-VMScript -VM $vm -ScriptText $shownet -ScriptType Powershell | Select-Object -ExpandProperty ScriptOutput
    $resultarray = $result.TrimEnd("`r`n").Split('|')

    New-Object PSObject -Property @{
        VM   = $vm.Name
        IP   = $resultarray[0]
        DNS1 = $resultarray[1]
        DNS2 = $resultarray[2]
    }
}

$report | Export-Csv -Path "$csv\report.csv" -NoTypeInformation -UseCulture
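Note that the script assumes you already have a PowerCLI session open against vCenter, and that VMware Tools is running in each guest (depending on your setup, Invoke-VMScript may also prompt for guest credentials). Connect first with something like the line below, where the server name is just a placeholder:

Connect-VIServer -Server vcenter.example.local

The report is then written to report.csv in the folder you supplied.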

 

Data drive missing after OS upgrade

I came across an interesting issue recently that I thought might be useful to share.

When we upgraded some Windows Server 2008 VMs to Windows Server 2012, the data drive (D:) was missing after the upgrade. Interestingly, the data drive was attached to a Paravirtual SCSI controller, whereas the C: drive was attached to a standard LSI Logic SAS storage adapter.

Paravirtual SCSI controllers are high-performance storage adapters which can deliver greater throughput and are suitable for applications that require high I/O.

More information on these can be found here:

https://kb.vmware.com/s/article/1010398
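Incidentally, a quick way to see which controller type a VM's disks are attached to is with PowerCLI. A minimal sketch, where the VM name is just a placeholder:

# List the SCSI controllers on a VM and their types (e.g. ParaVirtual, VirtualLsiLogicSAS)
Get-ScsiController -VM (Get-VM -Name "app-server01") | Select-Object Name, Type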


After further investigation, Device Manager showed problems with some of the devices.


The VMware Paravirtual SCSI controller was also missing from the list of storage controllers.


The fix for this issue was the following:

  1. Refresh / upgrade VMware Tools on the affected VMs. This procedure reinstalled the Paravirtual SCSI controller driver on the Windows Server 2012 OS.

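If a number of VMs are affected, the VMware Tools refresh can also be kicked off with PowerCLI rather than clicking through the GUI. A rough sketch, assuming a connected vCenter session; the name filter is just an example:

# Upgrade VMware Tools on the affected VMs (omit -NoReboot if an immediate reboot is acceptable)
Get-VM -Name "win2012-*" | Update-Tools -NoReboot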

2. Bring the disk online using Disk Management or diskpart.

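If you prefer the command line to the Disk Management GUI, the disk can also be brought online from an elevated PowerShell prompt inside the guest (the Storage cmdlets below ship with Windows Server 2012 and later):

# Bring any offline disks online and clear the read-only flag
Get-Disk | Where-Object { $_.IsOffline } | ForEach-Object {
    Set-Disk -Number $_.Number -IsOffline $false
    Set-Disk -Number $_.Number -IsReadOnly $false
}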

The disk then showed up in My Computer.


Brandon Lee also has an interesting post here on how to install the Paravirtual SCSI adapter driver when installing Windows Server 2016:

https://www.virtualizationhowto.com/2017/01/windows-server-2016-install-vmware-paravirtual-scsi-controller/

VxRack SDDC bring-up failing on “Backup bootbank of hosts” – vCF 2.2

I was re-imaging my VxRack SDDC a few days ago and came across an issue where the second-phase bring-up was failing. The error message indicated that the task that backs up the ESXi bootbanks was failing.


After digging further, I realized that 192.168.100.46 was the IP address of the SDDC Manager Utility VM and not a particular ESXi host.

I then examined the bring-up log on the SDDC Manager Controller VM, which is located at /opt/vmware/bringup/logs/evosddc-bringup.log, and noticed that the initial SSH connection from the Controller VM to the Utility VM was succeeding but was then throwing a password expiration error.


It turns out that the SDDC Manager Controller VM makes its initial connection to the SDDC Manager Utility VM using the root account, but then uses a different account (backupuser) to back up the ESXi hosts' bootbanks, and this backupuser account had expired.

I had to log into the SDDC Manager Utility VM and set the backupuser account to never expire.


Once this change was made, I retried the bring-up procedure and it completed successfully.

vCF 2.2 is generally available


VMware Cloud Foundation 2.2 is now generally available, and here are some of the new enhancements:

Software updates 

vCF 2.2 now includes vSphere 6.5 Update 1, vSAN 6.6.1, NSX 6.3.3 and Horizon 7.0.2. It also includes Log Insight 4.3.0 as an optional component. Note that vROps has been removed from this version of vCF; this is a temporary measure and will be reversed in the future.


 

Management Workload Domain updates

It is now possible to create a single management workload domain per vCF instance. You need a minimum of 4 servers (1 rack) for this configuration, and it expands up to a maximum of 256 servers (8 racks). This means that you only have to allocate the first 4 servers in rack 1 to the management domain, and the rest of the servers in racks 1-8 are available for workload domains. This is a marked improvement over vCF 2.1.x, where the first four nodes in each rack were allocated to a management domain.

Deployment Types

Another new feature of vCF 2.2 is the ability to have compute workloads residing in the management domain, with workload isolation provided by resource pools. This feature is targeted at smaller deployments, typically comprising fewer than 32 servers. If you require more than 32 nodes, it is recommended to adopt the traditional architecture of separate management and compute workload domains.

Optimised SDDC Manager 

SDDC Manager now comprises only two VMs: the SDDC Manager Controller and the SDDC Manager Utility. SDDC Manager continues to provide NTP and DNS services to the vCF environment, and DNS is provided in an HA fashion between the Controller and Utility VMs.


HMS

The HMS (Hardware Management Service) has been taken off the management switch and moved to the SDDC Manager Controller VM.

Log Insight

Log Insight is now configured as a three-VM cluster, containing one master node and two worker nodes. The system bring-up workflow will configure logging for the management domain, but if you require logging for your workload domains then you will need to enable this via the SDDC Manager GUI and procure a license. Content packs available include vCF, vSphere, vSAN, NSX and Horizon.


New features available in the SDDC manager GUI

It is now possible to add hosts and perform password rotation for all the components via the SDDC Manager GUI. This is a significant improvement over previous versions, where these activities were quite complex to perform.


Hardware updates 

The VMware Compatibility Guide has been updated with server support for vendors such as Lenovo, HDS and Fujitsu.

Signed Certificate Support

It is now possible to use custom signed certificates on the SDDC-deployed components (vCenter, PSC, NSX Manager, SDDC Manager and Log Insight). This is done using an automated CLI tool located at /opt/vmware/cert-mgmt/vcfhelper.py. Custom certificates can be used on both the management domain and any workload domains.

A step-by-step procedure for replacing these certificates can be found in the vCF 2.2 Admin Guide.

Note: vCF 2.2 is targeted at greenfield sites only. If you need to upgrade from vCF 2.1.x to 2.2, contact VMware Support for assistance.

VxRack SDDC will be shipping with vCF 2.2 on 29 September 2017. If you want to upgrade your VxRack SDDC from 2.1.x, contact Dell EMC Support for assistance.

All the documentation can be found here:

https://www.vmware.com/support/pubs/sddc-mgr-pubs.html

Issue Re-Commissioning a host in vCF

I wanted to decommission an ESXi server from VMware Cloud Foundation and recommission it to test the procedure, but I ran into an issue. Here are the steps that I followed:

  1. Decommissioned the host from SDDC Manager. Once decommissioned, the ESXi password defaults back to EvoSddc!2016.


2. Re-imaged the host using the VIA, selecting the device type "ESXI_Server".


3. Once the host was installed, I assigned it an IP of 192.168.100.70 (it has to be in the range 192.168.100.50 – 192.168.100.73).

4. Checked that SSH was enabled and the firewall rules were set correctly (connections restricted to the 192.168.100.0/22 subnet).
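If the host is reachable at this point, the same check can be scripted with PowerCLI by connecting directly to the ESXi host. A rough sketch (credentials are prompted for; the allowed source networks can be inspected under ExtensionData.AllowedHosts on the firewall exception):

# Connect straight to the freshly imaged host
Connect-VIServer -Server 192.168.100.70 -Credential (Get-Credential)

# Make sure the SSH service (TSM-SSH) is running, starting it if necessary
Get-VMHost | Get-VMHostService | Where-Object { $_.Key -eq 'TSM-SSH' -and -not $_.Running } | Start-VMHostService

# Review the SSH Server firewall exception
Get-VMHost | Get-VMHostFirewallException -Name 'SSH Server' | Select-Object Name, Enabled, ServiceRunning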

5. SSH'd to the VRM (SDDC Manager) VM and edited the following file to reflect the BMC username and password:

/home/vrack/VMware/vRack/server-commission.properties

6. Then attempted to run the recommission script:

sudo /home/vrack/bin/server-commission.sh

I could see that the host was being picked up correctly, but the recommission procedure was failing.


I then looked in vrack-vrm.log, which is located in /home/vrack/vrm/logs, and was able to get more information on what was causing the commissioning to fail. It seemed that the host was trying to mount an NFS datastore and the operation was failing.


I then logged into the host using the vSphere Client and could see that it was trying to mount an NFS datastore from the LCM repository VM.


I then examined this VM and could see that it didn't have an IP address and that its NICs were disconnected.


I connected the NICs and rebooted the VM, and it then showed valid IP addresses. I reran the host commission procedure and this time it succeeded.
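For reference, reconnecting the NICs can also be done with PowerCLI instead of the vSphere Client. A minimal sketch, where the VM name is just a placeholder for the LCM repository VM:

# Reconnect the network adapters and make sure they connect at power on
Get-VM -Name 'LCM-Repository' | Get-NetworkAdapter |
    Set-NetworkAdapter -Connected:$true -StartConnected:$true -Confirm:$false

# Restart the guest so it picks up its IP addresses again
Get-VM -Name 'LCM-Repository' | Restart-VMGuest -Confirm:$false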


Once this had succeeded, I was able to see the host in the SDDC Manager physical inventory and continue with the remaining steps to commission the host (step 8 onwards):

https://docs.vmware.com/en/VMware-Cloud-Foundation/2.1.3/com.vmware.vcf.admin.doc_213/GUID-47B2C555-3184-40B5-BCA9-86033A717786.html

 

vExpert 2017


I am delighted to be added to the vExpert team for 2017. The vExpert community is a great forum with some great benefits, some of which include:

vExpert Program Benefits

  • Invite to our private #Slack channel
  • vExpert certificate signed by our CEO Pat Gelsinger.
  • Private forums on communities.vmware.com.
  • Permission to use the vExpert logo on cards, website, etc for one year
  • Access to a private directory for networking, etc.
  • Exclusive gifts from various VMware partners.
  • Private webinars with VMware partners as well as NFR’s.
  • Access to private betas (subject to admission by beta teams).
  • 365-day eval licenses for most products for home lab / cloud providers.
  • Private pre-launch briefings via our blogger briefing pre-VMworld (subject to admission by product teams)
  • Blogger early access program for vSphere and some other products.
  • Opportunity to receive a free blogger pass to VMworld US or VMworld Europe (limited to 50 for US and 35 for EU).
  • Featured in a public vExpert online directory.
  • Access to vetted VMware & Virtualization content for your social channels.
  • Yearly vExpert parties at both VMworld US and VMworld Europe events.
  • Identification as a vExpert at both VMworld US and VMworld EU

Here is a full list of the 2017 H2 vExperts:

https://blogs.vmware.com/vmtn/2017/08/vexpert-2017-second-half-announcement.html

 

 

VxRack SDDC – NSX Automation

You may have been wondering how much of the NSX installation and configuration process is handled by the automation workflows present in SDDC Manager. Below is a bulleted list of the steps currently undertaken:

  • Physical switch setup to support NSX (MTU / dedicated VLAN etc)
  • Deploy NSX Manager
  • Deploy NSX Controllers
  • Create IP Pools
  • Create a Transport Zone
  • Create a Logical Switch
  • Integrate with vROPS and Log Insight

These steps are performed for both the management domain and any workload domain(s) that are created. Having this work handled by an automated workflow saves a lot of setup time and is another reason why VxRack SDDC is a fantastic platform that will enable your IT department to bring up environments more quickly and easily.

Once the above is completed, you are free to progress with further customization as required. You can deploy Edge Services Gateways, logical routers, load balancers, distributed firewall rules and so on, and really start to use the powerful features of VMware NSX.
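For example, with the community PowerNSX module (my choice here, not something vCF requires), adding an extra logical switch to the transport zone that bring-up created might look roughly like this; the server name is a placeholder:

# Connect PowerNSX to the NSX Manager registered with this vCenter
Connect-NsxServer -vCenterServer vcenter.example.local

# Create a new logical switch in the existing transport zone
Get-NsxTransportZone | New-NsxLogicalSwitch -Name 'web-tier-ls'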

 

VxRack SDDC workload domain deletion failed

I came across an issue recently where I was trying to delete a workload domain and the procedure failed.


After performing some troubleshooting and log analysis, I found that the HMS process was not running on the management switch.


To resolve the issue, I followed the procedure below to restart the HMS process on the Dell Cumulus management switch and also restarted the vrm-tcserver service on the SDDC Manager VM.

To get the credentials to log into any of the components, run the following command from the VRM VM. This command will display all the component usernames and passwords:

./vrm-cli.sh lookup-password

To restart the HMS process, SSH into the management switch using the credentials provided.


Then run the following commands:

for PID in $(ps -ef | grep HmsApp | grep -v grep | awk '{print $2}')
do
    kill $PID
done

service starthms.sh

Once this is done, SSH to the VRM VM and restart the vrm-tcserver service:

service vrm-watchdogserver stop 

service vrm-tcserver restart 

service vrm-watchdogserver start

Once this was complete, I attempted to delete the workload domain again and it was successful.
