Beyond SQL on VMware Best Practices

Going beyond SQL in VMware Best Practices has involved settings and configurations that are never mentioned in the SQL Server On VMware Best Practices Guide. Working in a healthcare environment means working with a large EMR (Electronic Medical Record) provider like EPIC. In tuning one of their many large sql databases they had some recommendations. Some of these are throwbacks to my old days of tweaking performance out of a desktop with settings like TCP parameter KeepAliveTime. How many of you go beyond the recommendations in the SQL Best Practice Guide?

floating book in quaint library
Knowledge shared becomes wisdom attained (Photo by Jaredd Craig on Unsplash)

I’ve listed here some of the settings that have been suggested for our larger more accessed and just plain busy sql servers:

ESXi Settings

Adjust the Round Robin IOPS limit from the default 1000 to 1 on each database LUN. Refer to VMware KBA 2069356 for more information on setting this parameter. (We already utilize Round Robin but each lun was set to the default)

Why would you want to make this change?
“The default of 1000 input/output operations per second (IOPS) sends 1000 I/O down each path before switching. If the load is such that a portion of the 1000 IOPS can saturate the bandwidth of the path, the remaining I/O must wait even if the storage array could service the requests. The IOPS or bytes limit can be adjusted downward allowing the path to be switched at a more frequent rate. The adjustment allows the bandwidth of additional paths to be used while the other path is currently saturated. “

How to make this change:

In ESXi 5.x/6.x:
for i in esxcfg-scsidevs -c |awk '{print $1}' | grep naa.xxxx; do esxcli storage nmp psp roundrobin deviceconfig set –type=iops –iops=1 –device=$i; done

Where, .xxxx matches the first few characters of your naa IDs.
 
To verify if the changes are applied, run this command:

esxcli storage nmp device list
 
You see output similar to:
 
Path Selection Policy: VMW_PSP_RR
Path Selection Policy Device Config: {policy=iops,iops=1,bytes=10485760,useANO=0;lastPathIndex=1: NumIOsPending=0,numBytesPending=0}
Path Selection Policy Device Custom Config:
Working Paths: vmhba33:C1:T4:L0, vmhba33:C0:T4:L0

Registry Settings

Configure Windows TCP Parameters in the Registry

The default setting for the Windows TCP parameter KeepAliveTime is two hours. This setting controls how often TCP sends a keep-alive packet to verify that an idle connection is still intact. Reducing it from two hours to five minutes helps windows detect and clean up stale network connections faster.


How to make this change:

Use regedit to create the DWORD KeepAliveTime (if it does not currently exist) at HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters\ . Then modify the value to 300000 (time in milliseconds).

The default setting for the Windows TCP parameter TCPTimedWaitDelay is four minutes. This setting determines the time that must elapse before TCP/IP can release a closed connection and reuse its resources. By reducing the value of this entry, TCP/IP can release closed connections faster and provide more resources for new connections.

How to make this change:

Using regedit to create the REG_DWORD TCPTimedWaitDelay (if it does not currently exist) at HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters\ . Then modify the value to 30 .

What kind of ESXi settings, Windows registry settings, config file changes, etc. do you implement in your environment that goes beyond the SQL Server On VMware Best Practices Guide ? As always, I look forward to your comments and sharing of knowledge.

Deleting Orphaned Virtual Desktops In VMware View

If you’ve ever managed a VMware View vdi environment for a period of time sooner or later you will have to manually delete orphaned virtual desktops.  Although VMware provides KB 1008658 that explains this procedure.  It is lacking in clarity especially for first time VMware View admins.

As we all know our friend Google provides if you only ask.  I have found 2 other blogs that do very good job of taking KB 1008658 and parsing it down to a more concise version.  My intention was to do this myself but why reinvent the wheel when you can just pay homage to it.

Here are the 2 blog post links:

http://terenceluk.blogspot.com/2013/02/manually-deleting-orphaned-andor-stale.html

http://luckyblogshere.blogspot.com/2011/06/removing-vms-from-adam-database.html

 

The summarized steps for deleting an orphaned virtual desktop in VMware View is:

  1. Stop provisioning on the offending vdi pool (optional but my experience is that it is essential especially with very busy non-persistent pools with)
  2. Remove orphaned virtual desktop from ADAM database
  3. Remove all relevant entries for the orphaned vdi in the SQL Composer database
  4. Delete corresponding computer object out of Active Directory
  5. Enable provisioning once again on the pool

Please see the other blog posts for exact details.

Hopefully seeing more than one example really helps in understanding the necessary steps.

 

Note:  Edited 9/29/17 to remove a broken link