vSphere 6.7 Home Lab – Part 2 – vSAN 6.7 U3 2-Node Hardware

So heaps has happened in my home lab over the past 18 months: not only am I now running ESXi 6.7 U3, I also have an extra host with the same specs as the first, with both now running in a 2-Node vSAN configuration.

“Cannot validate Manager Service host. The remote certificate is invalid according to the validation procedure” error when installing vSphere Proxy Agent

When attempting to install a new vRA vSphere Proxy Agent whilst on a customer engagement, I encountered the error below when testing the connection to the Manager Service Host and Model Manager Web Service Host VIPs during the Install Proxy Agent wizard:

“Cannot validate Manager Service host. The remote certificate is invalid according to the validation procedure.”

vSphere 6.7 Home Lab – Part 1 – The Parts and Build

So after many years of putting it off, I have decided to invest in building a decent home lab for testing basically everything VMware related. I spent the past few weeks researching the parts I should get, and in the end, after some help from my colleagues Tai Ratcliff and Askar Kopbayev, I settled on the Supermicro X10SDV-6C-TLN4F-O motherboard as my starting point, as it can hold up to 128 GB RAM (which is going to be important when spinning up all of the vRealize Suite).

vSAN Health Check – All hosts contributing stats warning

Had a tricky issue with the vSAN Health check plugin recently, where a warning was reporting issues with some hosts not contributing vSAN performance stats.

[Screenshot: All hosts contributing stats]

Auto Deploy Stateful Won’t Install

We encountered an interesting problem on a recent project using Auto Deploy and stateful caching, where the ESXi images just wouldn’t stick (i.e. install) onto local disk. Using vCenter 6.5e and ESXi 6.5d, the Auto Deploy boot would work fine, however the install would not persist to local disk.

It seems to be a problem with the host profile used for the Auto Deploy rule. To work around the issue, we created a separate host profile just for the initial boot of the server with Auto Deploy (i.e. the host profile listed in the Auto Deploy rule), with everything unchecked in the profile except for “System Image Cache Configuration”, as shown in the image below:

[Screenshot: Stateful Install Host Profile]

Once the host was booted and added into vCenter, a separate profile was then applied to configure the remaining VDS settings, Advanced Properties etc.

Upgrade vROps 6.3/6.4 – Enable Actions “Failed to create AI resource.”

So I encountered an interesting problem the other week when upgrading a customer’s vROps environment from version 6.0.2 to 6.4. During the upgrade process, it appears that the existing Python Action adapter instances were not automatically updated to the new vCenter adapter actions setting, which changed in version 6.3.

What I found was that when trying to manually “Enable” Actions within the vCenter adapter (under Manage Solution), I would get the following error: “Failed to create AI resource. Resource with same key already exists.”

[Screenshot: “Failed to create AI resource” error]

After reaching out to the team internally, I found that this is due to the original Python Actions adapter still existing even after the upgrades, so the solution was to manually remove it. As the Python Actions adapter was originally a Solutions adapter, and post-6.3 the “Python Actions Adapter” solution no longer exists, removing it meant using the REST API.

There is already a KB article detailing how to do this found here, however when attempting to run the curl command on the vROps master node appliance, it resulted in a strange error:

curl: (35) error:14077458:SSL routines:SSL23_GET_SERVER_HELLO:reason(1112)

After some searching I came across this post, which then pointed me to a curl bug (which in this case was with Ubuntu) that seemed to be related to the issue I was having.

So I tried to run the same curl command with a more up-to-date version of curl (> 7.40) and it worked perfectly. I was then able to remove the old Python Actions adapter instances, followed by successfully selecting “Enable Actions”.

[Screenshot: Enable Actions]
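
If you want to check up front whether the curl on a given machine is newer than 7.40, something like the following rough Python sketch could help. It only parses the first line of `curl --version` output; the function names and the sample version strings are my own illustrations, not anything from the KB article, and the threshold is simply the "> 7.40" mentioned above.

```python
import re

# Threshold taken from the post ("> 7.40"); treated here as strictly greater than 7.40.0.
MIN_CURL = (7, 40, 0)


def parse_curl_version(version_output):
    """Extract (major, minor, patch) from the start of `curl --version` output."""
    match = re.match(r"curl (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognised curl --version output")
    return tuple(int(part) for part in match.groups())


def curl_is_new_enough(version_output, minimum=MIN_CURL):
    """Return True if the reported curl version is strictly newer than `minimum`."""
    return parse_curl_version(version_output) > minimum


# Hypothetical sample outputs, for illustration only.
old_curl = "curl 7.19.7 (x86_64-suse-linux-gnu) libcurl/7.19.7"
new_curl = "curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0"
print(curl_is_new_enough(old_curl), curl_is_new_enough(new_curl))
```

In practice you would feed it the real output of `curl --version` from the machine you plan to run the REST call from.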

Hopefully this helps someone else!

“The number of vSphere HA heartbeat datastores for this host is 0” warning when only using VSAN Storage

When configuring a VSAN cluster, it is recommended to disable heartbeat datastores in your cluster, as this ensures that if only the VSAN network fails, vSphere HA will still restart the VMs on another host in the cluster (more info on why the heartbeat datastore should be disabled can be found in the VSAN Stretched Cluster guide here).

Now, when datastore heartbeats are disabled on your cluster, you may then see the following warning message on your hosts:

[Screenshot: “The number of vSphere HA heartbeat datastores for this host is 0” warning]

This is because vSphere HA requires a minimum of two shared datastores between all hosts in a vSphere HA enabled cluster for heartbeat detection (more info in the following VMware KB: https://kb.vmware.com/kb/2004739).

So if the only shared storage available is VSAN, then you may want to remove this warning. To do that:

1. In the vSphere Web Client, right click your cluster and select Settings
2. Under vSphere HA go to Edit
3. Under Advanced Options add the following Configuration Parameter: das.ignoreInsufficientHbDatastore
4. For its value, enter true
5. Disable then re-enable HA for your cluster to apply the changes.
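
If you keep track of cluster settings in scripts, the advanced option above is just a key/value pair. Below is a minimal Python sketch of adding it to an existing list of advanced options without duplicating the key. The helper name and the sample isolation address entry are hypothetical, and this does not talk to vCenter at all; actually applying the option would still be done through the Web Client steps above or through whatever API tooling you use.

```python
def with_advanced_option(options, key, value):
    """Return a new list of {"key", "value"} dicts with `key` set to `value`.

    Any existing entry for `key` is replaced, so calling this twice is harmless.
    The input list is not modified.
    """
    updated = [dict(opt) for opt in options if opt["key"] != key]
    updated.append({"key": key, "value": value})
    return updated


# Hypothetical existing advanced options on the cluster, for illustration only.
current = [{"key": "das.isolationaddress0", "value": "192.168.1.1"}]

desired = with_advanced_option(current, "das.ignoreInsufficientHbDatastore", "true")
for option in desired:
    print(option["key"], "=", option["value"])
```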

Of course, if you don’t want VMs to fail over to another host in your cluster in the event the VSAN network is unavailable, then you will need to configure another non-VSAN datastore to use for heartbeats.

Local Scratch Partition Missing after Upgrade

I came across a minor but annoying problem today after performing ESXi upgrades in my lab using VUM. Each of my ESXi hosts has a local 8GB boot disk (non-flash), and at install a local 4GB FAT32 partition was created automatically for scratch. However, for some reason, after upgrading the hosts to 6.0 U2 (from U1) this partition was removed on some of my hosts (but not all), resulting in this ugly warning message in the web client for those hosts:

system logs are stored on non-persistent storage

The fix was simple. In the web client:

1. Create a new VMFS datastore on the boot disk (I made it 3 GB). E.g. “esx-01b_local”
2. Create a new directory on the datastore for scratch. E.g. “.locker-esx-01b”
3. Change the advanced property “ScratchConfig.ConfiguredScratchLocation” to the directory that was created on the VMFS volume. E.g. “/vmfs/volumes/esx-01b_local/.locker-esx-01b”
4. Reboot the host (obviously make sure you’re in maintenance mode first!).
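
If you have several hosts to fix, it can help to generate the value for “ScratchConfig.ConfiguredScratchLocation” consistently. Here is a small Python sketch that builds the path from a host's short name, assuming the same naming pattern as my examples above (“<host>_local” for the datastore and “.locker-<host>” for the directory). The function name is made up for illustration, and it only builds the string; it does not change anything on the host.

```python
import posixpath


def scratch_location_for_host(short_name):
    """Build the scratch path for a host, following the naming used in the steps above.

    Assumes the datastore is named "<host>_local" and the directory ".locker-<host>".
    """
    if not short_name or "/" in short_name:
        raise ValueError("host short name must be non-empty and contain no '/'")
    datastore = f"{short_name}_local"
    directory = f".locker-{short_name}"
    return posixpath.join("/vmfs/volumes", datastore, directory)


print(scratch_location_for_host("esx-01b"))
```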

As for why this occurred, I’ll put it down to the fact that my lab is running in a nested environment, which caused the ESXi upgrade process to be very slow (each ESXi upgrade took almost 30 minutes to complete).

Please note that this applies to local non-flash storage. If you are installing ESXi on Flash/SD cards, the above fix is not recommended, because Flash and SD cards can have limited read/write cycles available (and having scratch on them can wear them out pretty quickly). In this case, you should use either a separate local datastore that is on a HDD/SSD (but not VSAN, as VSAN is not supported), or a remote syslog server.

Isolating vSphere Replication NFC Traffic – Part 2

Now for the exciting conclusion on how to isolate your vSphere Replication and NFC traffic! (for Part 1 click here)

Note: There is a requirement to modify the vSphere Replication appliances directly to configure the static networks on each new vNIC. Making these changes requires modification of config files on the appliance, which is not officially supported, so do this at your own risk!

Isolating vSphere Replication NFC traffic – Part 1

For part 2 click here!
As many of you would already know, vSphere Replication 6.0 introduced the ability to isolate the replication traffic from all other traffic in your datacentre. For many security-conscious organisations, this was a big deal…they can now ensure that replication traffic is not routed the wrong way.

This is fine for traffic between the source site and target, however what isn’t entirely clear is how to also isolate the NFC traffic between the vSphere Replication Server appliance and ESXi hosts.

bmanone.com

Virtualization & Cloud Automation

Category: Virtualization