So heaps has happened in my home lab over the past 18 months: not only am I now running ESXi 6.7 U3, I also have an extra host with the same specs as the first, and both now run in a 2-Node vSAN configuration.
When attempting to install a new vRA vSphere Proxy Agent whilst on a customer engagement, I encountered the below error when testing the connection to the Manager Service Host and Model Manager Web Service Host VIPs during the Install Proxy Agent wizard:
“Cannot validate Manager Service host. The remote certificate is invalid according to the validation procedure.”
So after many years of putting it off, I have decided to invest in building a decent home lab for testing basically everything VMware related. I spent the past few weeks researching the parts I should get, and in the end, after some help from my colleagues Tai Ratcliff and Askar Kopbayev, I settled on the Supermicro X10SDV-6C-TLN4F-O motherboard as my starting point, as it can hold up to 128 GB RAM (which is going to be important when spinning up all of the vRealize Suite).
Had a tricky issue with the vSAN Health check plugin recently, where a warning was reporting issues with some hosts not contributing vSAN performance stats.
We encountered an interesting problem on a recent project using Auto Deploy and stateful caching, where the ESXi images just wouldn’t stick (i.e. install) onto local disk. Using vCenter 6.5e and ESXi 6.5d, the Auto Deploy boot would work fine, however the install would not persist to local disk.
It seems to be a problem with the host profile used for the Auto Deploy rule…so to work around the issue, we created a separate host profile just for the initial boot of the server with Auto Deploy (i.e. the host profile listed in the Auto Deploy rule), with everything unchecked in the profile except for “System Image Cache Configuration” as shown in the below image:
Once the host had booted and been added into vCenter, a separate profile was then applied to configure the remaining VDS settings, Advanced Properties etc.
So I encountered an interesting problem the other week when upgrading a customer’s vROps environment from version 6.0.2 to 6.4. During the upgrade process, it appears the existing Python Action adapter instances were not automatically migrated to the new vCenter adapter Actions setting, which changed in version 6.3.
What I found is when trying to manually “Enable” Actions within the vCenter adapter (under Manage Solution) I would get the following error: “Failed to create AI resource. Resource with same key already exists.”:
After reaching out to the team internally, I found that this is due to the original Python Actions adapter still existing even after the upgrades, so the solution was to manually remove it. As the Python Actions adapter was originally a Solutions adapter, and post-6.3 the “Python Actions Adapter” solution no longer exists, the only way to remove it was via the REST API.
There is already a KB article detailing how to do this found here, however when attempting to run the curl command on the vROps master node appliance, it resulted in a strange error:
curl: (35) error:14077458:SSL routines:SSL23_GET_SERVER_HELLO:reason(1112)
So I tried to run the same curl command with a more up-to-date version of curl (> 7.40) and it worked perfectly. I was then able to remove the old Python Actions adapter instances, followed by successfully selecting “Enable Actions”.
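If the curl on the appliance is too old and you have a newer Python handy on another machine, the same kind of REST call can be prepared from there. The below is only a minimal sketch: the `/suite-api/api/adapters/{id}` path, the basic auth header, the host name and the adapter ID are all my assumptions for illustration, so check them against the KB article and the API docs for your vROps version before sending anything.

```python
import base64
import urllib.request


def build_delete_request(host, adapter_id, user, password):
    """Build (but do not send) a DELETE request for one adapter instance.

    Assumption: the adapter instance lives under /suite-api/api/adapters/<id>.
    The KB article may use a different endpoint.
    """
    url = f"https://{host}/suite-api/api/adapters/{adapter_id}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, method="DELETE")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


# Hypothetical values, for illustration only.
example = build_delete_request("vrops.example.local", "1234", "admin", "secret")
print(example.get_method(), example.full_url)
```

Sending it would then be a matter of passing the request to `urllib.request.urlopen()`, with an SSL context that trusts the certificate on the vROps node.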
Hopefully this helps someone else!
When configuring a VSAN cluster, it is recommended to disable heartbeat datastores in your cluster, as this ensures that if only the VSAN network fails, vSphere HA will still restart the VMs on another host in the cluster (more info on why the heartbeat datastore should be disabled can be found in the VSAN Stretched Cluster guide here).
Now, when datastore heartbeats are disabled on your cluster, you may then see the following warning message on your hosts:
This is because vSphere HA requires a minimum of two shared datastores between all hosts in a vSphere HA enabled cluster for datastore heartbeating (more info at the following VMware KB: https://kb.vmware.com/kb/2004739).
So if the only shared storage available is VSAN, then you may want to remove this warning. To do that:
- In the vSphere Web Client, right click your cluster and select Settings
- Under vSphere HA go to Edit
- Under Advanced Options add the following Configuration Parameter: das.ignoreInsufficientHbDatastore
- For its value, enter true
- Disable then re-enable HA for your cluster to apply the changes.
Of course, if you don’t want VMs to fail over to another host in your cluster in the event the VSAN network is unavailable, then you will need to configure another non-VSAN datastore to use for heartbeats.
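If you have a few clusters to update, the advanced option step above can also be scripted. Below is a rough sketch using pyVmomi (a third-party library that needs to be installed separately). It assumes you already have a connected session and a `cluster` object of type `vim.ClusterComputeResource`; the function names are mine, and it only adds the advanced option, so disabling and re-enabling HA afterwards is still up to you.

```python
HA_OPTION_KEY = "das.ignoreInsufficientHbDatastore"
HA_OPTION_VALUE = "true"


def build_ha_option_spec():
    """Build a cluster reconfigure spec that adds the HA advanced option."""
    # Imported here so the module still loads where pyVmomi is not installed.
    from pyVmomi import vim

    option = vim.option.OptionValue(key=HA_OPTION_KEY, value=HA_OPTION_VALUE)
    das_config = vim.cluster.DasConfigInfo(option=[option])
    return vim.cluster.ConfigSpecEx(dasConfig=das_config)


def apply_ha_option(cluster):
    """Apply the option to an existing cluster object and return the task.

    Assumption: `cluster` is a vim.ClusterComputeResource retrieved from an
    authenticated pyVmomi session. The second argument (modify=True) merges
    the spec into the existing configuration instead of replacing it.
    """
    return cluster.ReconfigureComputeResource_Task(build_ha_option_spec(), True)
```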
I came across a minor but annoying problem today after performing ESXi upgrades in my lab using VUM. Each of my ESXi hosts has a local 8GB boot disk (non-flash), and at install a local 4GB FAT32 partition was created automatically for scratch. However, for some reason after upgrading the hosts to 6.0 U2 (from U1) this partition was removed on some of my hosts (but not all), resulting in this ugly warning message in the web client for those hosts:
The fix was simple. In the web client:
- Create a new VMFS datastore on the boot disk (I made it 3 GB). E.g. “esx-01b_local”
- Create a new directory on the datastore for scratch. E.g. “.locker-esx-01b”
- Change the advanced property “ScratchConfig.ConfiguredScratchLocation” to the VMFS volume that was created. E.g. “/vmfs/volumes/esx-01b_local/.locker-esx-01b”
- Reboot the host (obviously make sure you’re in maintenance mode first!).
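For anyone who prefers the ESXi shell over the web client, the directory and advanced property steps above can be expressed as two commands. The sketch below just generates them for a given host. The helper names are my own, and the `vim-cmd hostsvc/advopt/update` form is the one I know from VMware's persistent scratch KB, so treat it as an assumption and verify it on your ESXi version. The datastore itself still needs to be created first, and the host still needs a reboot afterwards.

```python
import shlex


def scratch_path(datastore, host_name):
    """Return the scratch directory path on the given VMFS datastore."""
    return f"/vmfs/volumes/{datastore}/.locker-{host_name}"


def scratch_commands(datastore, host_name):
    """Return the ESXi shell commands that create and configure scratch."""
    path = shlex.quote(scratch_path(datastore, host_name))
    return [
        f"mkdir -p {path}",
        "vim-cmd hostsvc/advopt/update "
        f"ScratchConfig.ConfiguredScratchLocation string {path}",
    ]


# Using the example names from the steps above.
for command in scratch_commands("esx-01b_local", "esx-01b"):
    print(command)
```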
As for why this occurred, I will put it down to the fact that my lab is running in a nested environment, which caused the ESXi upgrade process to be very slow (each ESXi upgrade took almost 30 minutes to complete).
Please note that this applies to local non-flash storage. If you are installing ESXi on Flash/SD cards, the above fix is not recommended, because Flash and SD cards can have limited read/write cycles available (and having scratch on them can wear them out pretty quickly). In this case, you should use either a separate local datastore that is on a HDD/SSD (but not VSAN, as VSAN is not supported), or a remote syslog server.
Now for the exciting conclusion on how to isolate your vSphere Replication and NFC traffic! (for Part 1 click here)
Note: There is a requirement to modify the vSphere Replication appliances directly to configure the static networks on each new vnic. Making these changes requires modification of config files on the appliance, which is not officially supported so do this at your own risk!
For part 2 click here!
As many of you would already know, vSphere Replication 6.0 introduced the ability to isolate the replication traffic from all other traffic in your datacentre. For many security-conscious organisations, this was a big deal…they can now ensure that replication traffic is not routed the wrong way.
This is fine for traffic between the source site and target, however what isn’t entirely clear is how to also isolate the NFC traffic between the vSphere Replication Server appliance and ESXi hosts.