Need to balance a vSphere cluster but don't have DRS?
If you ever find yourself in a position where you need to rebalance a cluster but don't have DRS, you know this is an annoyance at best and most likely a chore!
What makes this ugly?
At this point, it only runs a fixed number of moves. This means if you say use 3 moves to balance the cluster, it will move three VMs no matter what.
As it stands, it's not task scheduler friendly as you likely will want to pass a few options and it doesn't connect to vCenter on its own.
This is a big one: it does not take CPU usage into account.
Let's start by assessing the current state of the cluster.
$vmHosts = Get-VMHost | Where-Object {$_.Name -notin $ExcludeHostList} | Sort-Object {$_.MemoryUsageGB / $_.MemoryTotalGB} -Descending
$avgHostMemUsedPercent = ($vmHosts.MemoryUsageGB | Measure-Object -Sum).Sum / ($vmHosts.MemoryTotalGB | Measure-Object -Sum).Sum * 100
Two lines, that's easy! OK, it is a bit dense, so let's break it down. The first line gets all the hosts connected to the vCenter and sorts them by percent of used memory, fullest first.
The second line uses Measure-Object -Sum to total the used memory and the installed memory across the cluster, then divides one by the other to get the cluster-wide average used percent. This average is our ideal balanced state and is what we compare against.
You may have also noticed the Where-Object {$_.Name -notin $ExcludeHostList} on the first line. This will be more relevant when we look at this as a function. It's used to filter out hosts if so desired.
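To see why the average is taken over the cluster totals rather than over the per-host percentages, here is a minimal sketch that runs without vCenter. The three hosts are made-up stand-ins for what Get-VMHost returns, and $meanOfPercents is only there for comparison.

```powershell
# Hypothetical stand-ins for Get-VMHost output; the numbers are invented.
$vmHosts = @(
    [PSCustomObject]@{ Name = 'esx01'; MemoryUsageGB = 96; MemoryTotalGB = 128 }
    [PSCustomObject]@{ Name = 'esx02'; MemoryUsageGB = 64; MemoryTotalGB = 256 }
    [PSCustomObject]@{ Name = 'esx03'; MemoryUsageGB = 16; MemoryTotalGB = 128 }
) | Sort-Object { $_.MemoryUsageGB / $_.MemoryTotalGB } -Descending

# Cluster-wide average: total used memory over total installed memory, as a percent.
$usedGB  = ($vmHosts.MemoryUsageGB | Measure-Object -Sum).Sum
$totalGB = ($vmHosts.MemoryTotalGB | Measure-Object -Sum).Sum
$avgHostMemUsedPercent = $usedGB / $totalGB * 100      # 176 / 512 * 100 = 34.375

# For comparison only: the plain mean of the per-host percentages (75, 25, 12.5).
# It ignores that esx02 has twice the memory of the others.
$meanOfPercents = ($vmHosts |
    ForEach-Object { $_.MemoryUsageGB / $_.MemoryTotalGB * 100 } |
    Measure-Object -Average).Average                   # 37.5
```

With mixed host sizes the two numbers differ, and only the first one is a state every host could actually reach at the same time.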
Ok, now let's do some work!
$mostMemUsedHost = $vmHosts | Select-Object -First 1
$leastMemUsedHost = $vmHosts | Select-Object -Last 1
$vmList = Get-VM -Location $mostMemUsedHost | Where-Object {$_.Name -notin $ExcludeVMList -and $_.PowerState -eq 'PoweredOn'}
$moveScenarios = foreach ($vm in $vmList) {
    # GuestMemoryUsage is reported in MB, so divide by 1024 to get GB
    $moveScore = [math]::Abs((($leastMemUsedHost.MemoryUsageGB +
        ($vm.ExtensionData.Summary.QuickStats.GuestMemoryUsage / 1024)) /
        $leastMemUsedHost.MemoryTotalGB) * 100 - $avgHostMemUsedPercent)
    [PSCustomObject]@{
        Source = $mostMemUsedHost
        Destination = $leastMemUsedHost
        VM = $vm
        MoveScore = $moveScore
    }
    Write-Verbose ("Calculated {0} from host {1} to host {2} with a move score of {3}." -f $vm,
        $mostMemUsedHost, $leastMemUsedHost, $moveScore)
}
Line by line (sort of):
Find the host with the most used memory where we want to move a VM from.
Find the host with the least used memory where we want to move a VM to.
Get a list of powered on VMs from the host with the most used memory so we can find the best one to move.
Using a loop iterating over each VM, build a list of move scores.
|(host mem + VM active mem) / total host mem * 100 - cluster average percent| = post-move delta from the average percent. Or to simplify: how far would this move leave the least used host from the ideal state?
Build a [PSCustomObject] with details about the potential move.
If we run the final function with the Verbose flag, write this candidate move to the verbose stream.
This is the core of the whole script. It takes the fullest host and the emptiest host, then finds the VM whose move would leave the emptiest host as close as possible to the average used memory percent. It's important to use percent here because clusters are often non-homogeneous, meaning the hosts may not all have the same memory configuration.
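A worked example may help here. This is a sketch with invented numbers, not output from a real cluster: $destinationHost stands in for the least used host, and the GuestMemoryUsageMB property is a hypothetical shortcut for ExtensionData.Summary.QuickStats.GuestMemoryUsage.

```powershell
# Hypothetical inputs; in the real script these come from Get-VMHost and Get-VM.
$avgHostMemUsedPercent = 50
$destinationHost = [PSCustomObject]@{ Name = 'esx03'; MemoryUsageGB = 32; MemoryTotalGB = 128 }
$vmList = @(
    [PSCustomObject]@{ Name = 'small';  GuestMemoryUsageMB = 4096 }    #  4 GB active
    [PSCustomObject]@{ Name = 'medium'; GuestMemoryUsageMB = 24576 }   # 24 GB active
    [PSCustomObject]@{ Name = 'large';  GuestMemoryUsageMB = 65536 }   # 64 GB active
)

$moveScenarios = foreach ($vm in $vmList) {
    # Destination usage after the move, as a percent of its installed memory.
    $postMovePercent = ($destinationHost.MemoryUsageGB + $vm.GuestMemoryUsageMB / 1024) / $destinationHost.MemoryTotalGB * 100
    [PSCustomObject]@{
        VM        = $vm.Name
        MoveScore = [math]::Abs($postMovePercent - $avgHostMemUsedPercent)
    }
}

# small  -> 28.125% after the move, score 21.875
# medium -> 43.75%  after the move, score 6.25
# large  -> 75%     after the move, score 25 (overshoots the average)
$best = $moveScenarios | Sort-Object MoveScore | Select-Object -First 1
```

Here the medium VM wins: the large one would push the destination well past the average, and the small one barely moves the needle.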
With a list of possible moves and move scores, it's time to move some VMs.
$vMotion = $moveScenarios | Sort-Object MoveScore | Select-Object -First 1
Write-Verbose ("Migrating {0} from host {1} to host {2} with a move score of {3}" -f $vMotion.VM.Name,
    $vMotion.Source.Name, $vMotion.Destination.Name, $vMotion.MoveScore)
Try {
    # -ErrorAction Stop turns non-terminating errors into ones the catch block can see
    Move-VM -VM $vMotion.VM -Destination $vMotion.Destination -RunAsync -ErrorAction Stop | Out-Null
} Catch {
    Write-Warning ("Failed to migrate {0} from host {1} to host {2}." -f $vMotion.VM.Name, $vMotion.Source.Name, $vMotion.Destination.Name)
}
do {
    Write-Verbose "Waiting for vMotion to finish."
    Start-Sleep -Seconds 5
    $viEvents = Get-VIEvent -Start (Get-Date).AddSeconds(-30) -Finish (Get-Date)
    $vMotionEvent = $viEvents | Where-Object {$_.FullFormattedMessage -like "*$($vMotion.VM.Name) was migrated from host $($vMotion.Source.Name)*"}
} until ( $vMotionEvent.Count -gt 0 )
This part is relatively simple. We pick the entry in $moveScenarios with the lowest score (delta from average), start the move, and wait for it to finish. We do have to wait, because each round re-evaluates the memory loads and needs a clean and current state to start from.
Line by line (again sorta, formatting is not my thing):
As I write this I realize that vMotion is not a great variable name for this, but these are supposed to be ugly hacks, right? ANYWAY, we pick the move scenario with the lowest score.
More verbose output
Start the vMotion using a try/catch block because there are a lot of reasons for a vMotion to fail or just not be allowed.
Write a warning if the vMotion failed but keep going because we still want to keep going... And another potential bug, wait for it...
(do Loop) Check if the VM is done moving... The VM that might not be migrating at all. Best of all, there's no timeout! An ugly hack indeed!
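One way to take the sting out of that last item is to put a deadline on the wait. The helper below is a sketch, not part of PowerCLI: Wait-ForCondition, its parameter names, and the default timeout are all assumptions made for illustration.

```powershell
function Wait-ForCondition {
    # Hypothetical helper: polls $Condition until it returns something truthy
    # or $TimeoutSeconds has elapsed. Returns $true on success, $false on timeout.
    param (
        [Parameter(Mandatory)][scriptblock]$Condition,
        [int]$TimeoutSeconds = 600,
        [int]$PollSeconds = 5
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ((Get-Date) -lt $deadline) {
        if (& $Condition) { return $true }
        Start-Sleep -Seconds $PollSeconds
    }
    return $false
}

# Possible use in place of the do/until loop (left commented out: it needs a vCenter connection).
# $moved = Wait-ForCondition -TimeoutSeconds 900 -Condition {
#     Get-VIEvent -Start (Get-Date).AddSeconds(-30) -Finish (Get-Date) |
#         Where-Object { $_.FullFormattedMessage -like "*$($vMotion.VM.Name) was migrated from host $($vMotion.Source.Name)*" }
# }
# if (-not $moved) { Write-Warning "Gave up waiting for $($vMotion.VM.Name)." }
```

The failed-move case could be handled the same way by skipping the wait from inside the catch block. Dropping -RunAsync is another option, since Move-VM then blocks until the migration finishes, at the cost of losing the verbose progress messages.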
Now with the core logic out of the way, let's see the whole function, bugs and all!
<#
.SYNOPSIS
Balances memory usage across ESXi hosts by moving VMs from the fullest host to the emptiest host.
.DESCRIPTION
Attempts to balance the cluster by moving VMs from the host with the highest used memory percent to the host with the lowest used memory percent.
Moves are decided by a calculated move score. This score is the post-move delta, in percent, from an ideal average. While this works well,
it does favor moving larger VMs as they provide the largest imbalance correction per move.
.PARAMETER ExcludeHostList
A list of hosts that will not be involved in rebalancing.
.PARAMETER ExcludeVMList
A list of VMs that will not be considered for rebalancing.
.PARAMETER VMFilterScript
A script block passed to Where-Object to filter the VM list further. Only VMs for which it returns $true are considered for rebalancing.
.PARAMETER MaxMoves
The maximum number of VM moves to execute.
.EXAMPLE
PS C:\>Optimize-viMemBalance -MaxMoves 5
Allows all hosts and VMs to participate in 5 rounds of rebalancing.
.EXAMPLE
PS C:\>Optimize-viMemBalance -ExcludeHostList ESXi06.contoso.com -ExcludeVMList NVR01, NVR02 -MaxMoves 5
Rebalances all hosts except ESXi06.contoso.com and never moves VMs NVR01 or NVR02.
.EXAMPLE
PS C:\>Optimize-viMemBalance -ExcludeHostList ESXi06.contoso.com -VMFilterScript {$_.Name -notlike "NVR*"} -MaxMoves 5
Rebalances all hosts except ESXi06.contoso.com and never moves VMs whose names start with "NVR".
#>
Function Optimize-viMemBalance {
    [CmdletBinding()]
    param (
        [Parameter()]
        [string[]]$ExcludeHostList,
        [Parameter()]
        [string[]]$ExcludeVMList,
        [Parameter(Mandatory)]
        [int]$MaxMoves,
        [Parameter()]
        [ScriptBlock]$VMFilterScript
    )
    # Bail out early if there is no active vCenter connection
    if (-not ($global:DefaultVIServers | Where-Object {$_.IsConnected})) {
        Write-Error "$(Get-Date -Format 'M/d/yyyy h:mm:ss tt') $($MyInvocation.MyCommand) You are not currently connected to any servers. Please connect first using a Connect cmdlet."
        return
    }
    # Build a list of VMware hosts in our cluster excluding the ones in $ExcludeHostList i.e. dts-esxhost6, our "stand alone" host. Then sort based on percent memory used ({$_.MemoryUsageGB / $_.MemoryTotalGB})
    $vmHosts = Get-VMHost | Where-Object {$_.Name -notin $ExcludeHostList} | Sort-Object {$_.MemoryUsageGB / $_.MemoryTotalGB} -Descending
    # Get the average memory usage as a percent if all hosts were perfectly balanced. We use a percentage because not all hosts have the same memory and this makes it "fair"
    $avgHostMemUsedPercent = ($vmHosts.MemoryUsageGB | Measure-Object -Sum).Sum / ($vmHosts.MemoryTotalGB | Measure-Object -Sum).Sum * 100
    # Init our move counter
    [int]$moveCount = 0
    do {
        # Check if we hit MaxMoves and break from the loop if we have
        if ($moveCount -ge $MaxMoves) {break}
        # Refresh the list after each move
        $vmHosts = Get-VMHost | Where-Object {$_.Name -notin $ExcludeHostList} | Sort-Object {$_.MemoryUsageGB / $_.MemoryTotalGB} -Descending
        # IMPROVEMENT need to calculate std dev and if move score is higher, we're done, shut it down!
        # Select the most utilized and least utilized hosts.
        $mostMemUsedHost = $vmHosts | Select-Object -First 1
        $leastMemUsedHost = $vmHosts | Select-Object -Last 1
        # Get a list of VMs on the most utilized host that are powered on and not in $ExcludeVMList. We exclude the genarch servers as they are expensive to move and we want to spread them out across all the hosts manually. We also don't need to move VMs that are powered off.
        $vmList = Get-VM -Location $mostMemUsedHost | Where-Object {$_.Name -notin $ExcludeVMList -and $_.PowerState -eq 'PoweredOn'}
        # If a filter was included, apply that now (keeps only VMs for which the script block returns $true)
        if ($VMFilterScript) {
            $vmList = $vmList | Where-Object $VMFilterScript
        }
        # Init our list of moves and add an entry for each VM's move score
        $moveScenarios = foreach ($vm in $vmList) {
            # Calculate moveScore; |(host mem + VM active mem) / total host mem * 100 - average percent| = post-move delta from the average. Lower moveScore means we get closer to the desired state of an average
            $moveScore = [math]::Abs((($leastMemUsedHost.MemoryUsageGB + ($vm.ExtensionData.Summary.QuickStats.GuestMemoryUsage / 1024)) / $leastMemUsedHost.MemoryTotalGB) * 100 - $avgHostMemUsedPercent)
            [PSCustomObject]@{
                Source = $mostMemUsedHost
                Destination = $leastMemUsedHost
                VM = $vm
                MoveScore = $moveScore
            }
            Write-Verbose ("Calculated {0} from host {1} to host {2} with a move score of {3}." -f $vm, $mostMemUsedHost, $leastMemUsedHost, $moveScore)
        }
        # Select the best moveScenario (lowest move score) and init a vMotion
        $vMotion = $moveScenarios | Sort-Object MoveScore | Select-Object -First 1
        Write-Verbose ("Migrating {0} from host {1} to host {2} with a move score of {3}" -f $vMotion.VM.Name, $vMotion.Source.Name, $vMotion.Destination.Name, $vMotion.MoveScore)
        Try {
            # -ErrorAction Stop turns non-terminating errors into ones the catch block can see
            Move-VM -VM $vMotion.VM -Destination $vMotion.Destination -RunAsync -ErrorAction Stop | Out-Null
        } Catch {
            Write-Warning ("Failed to migrate {0} from host {1} to host {2}." -f $vMotion.VM.Name, $vMotion.Source.Name, $vMotion.Destination.Name)
        }
        # Monitor vCenter events and wait for the job to finish
        do {
            Write-Verbose "Waiting for vMotion to finish."
            Start-Sleep -Seconds 5
            $viEvents = Get-VIEvent -Start (Get-Date).AddSeconds(-30) -Finish (Get-Date)
            $vMotionEvent = $viEvents | Where-Object {$_.FullFormattedMessage -like "*$($vMotion.VM.Name) was migrated from host $($vMotion.Source.Name)*"}
        } until ( $vMotionEvent.Count -gt 0 )
        # Add 1 to the move counter
        $moveCount++
        Write-Verbose $vMotionEvent.FullFormattedMessage
        # Rinse and repeat until we hit MaxMoves
    } until ($moveCount -ge $MaxMoves)
    # IMPROVEMENT Output a report of mem percent of each host
}
We added some options to ignore hosts, filter out VMs, and because this would run forever if we let it, a maxMoves limit.
How can we make this better?
We could look at each selected move score and compare it to the last and if it's within some range, call it done.
We could write a wrapper script that makes it easy to invoke from task scheduler. We could also use task scheduler to run this under a service account so we don't need to store creds anywhere other than the Windows credential store.
I tried adding CPU to the move score but I'm not a maths person and it hurt my brain.
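The first idea, stopping once the selected move score stops changing much, could look something like the sketch below. It is only a simulation: the scores are made up, $tolerance is an assumed threshold in percentage points, and in the real function the Move-VM call would sit where the comment is.

```powershell
$tolerance = 1.0                               # assumed threshold, in percentage points
$maxMoves = 10
$simulatedScores = 12.5, 6.0, 2.1, 1.9, 1.8    # made-up best move score for each round
$lastScore = $null
$moveCount = 0

foreach ($score in $simulatedScores) {
    if ($moveCount -ge $maxMoves) { break }
    # Compare this round's best score with the previous round's.
    if ($null -ne $lastScore -and [math]::Abs($lastScore - $score) -lt $tolerance) {
        Write-Verbose "Move score changed by less than $tolerance, calling it done."
        break
    }
    # Move-VM and the wait loop would run here in the real function.
    $moveCount++
    $lastScore = $score
}
# With these numbers the loop performs 3 moves and stops when the score goes from 2.1 to 1.9.
```

MaxMoves stays in place as a hard upper bound, so a bad tolerance can only make the run shorter, never longer.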
Wrapper Script
I like to use a param block to set default options and, if needed, I can call the script with "PS C:\>scriptName.ps1 -option value" to override them.
My org also likes to use PowerShell transcript logs, so I add that in too.
BalanceCluster.ps1
# Script wrapper for the Optimize-viMemBalance function
# This sets WSD default options, connects to vCenter, and calls Optimize-viMemBalance.
# The function itself must already be available in the session (for example from an installed module).
param (
    [string]$viServer = 'vCenter',
    [string[]]$ExcludeHostList = @('esxhost6.wsd3.ad'),
    [string[]]$ExcludeVMList = @('GENARCH1','GENARCH2','GENARCH3'),
    [int]$maxMoves = 1,
    [scriptblock]$FilterScript = $null
)
$myCommand = ([string]$MyInvocation.MyCommand).split('.')[0]
$logPath = "$PSScriptRoot\$myCommand-$(Get-Date -Format FileDateTimeUniversal).log"
Start-Transcript -Path $logPath
Connect-VIServer -Server $viServer
Optimize-viMemBalance -ExcludeHostList $ExcludeHostList -ExcludeVMList $ExcludeVMList -VMFilterScript $FilterScript -MaxMoves $maxMoves -Verbose
Stop-Transcript
Now we can create a new scheduled task to run "powershell.exe -File C:\scripts\BalanceCluster.ps1" as a service account. Each time it runs, it will generate a PowerShell transcript at C:\Scripts\BalanceCluster-########T##########Z.log
I should note that this leaves out packaging the function into a module and installing that module on the server. I'll leave that for another day.