Monday, November 7, 2011

VCAP-DCA: The wait is over...

Finally, after 17 days of anxious anticipation, the exam results are in...  and... wait for it... I PASSED!!!  w00t w00t!  I needed a 300 to pass and I scored 358 out of 500.  Not the best score in the world but considering I decided to just "try it" and only gave myself two weeks to study, I couldn't be happier!  Now, if they would just hurry up and send my ID so I know what number I am...

Update:  Number 474!

Poor performance after upgrading from vCenter 4.1 to 5

I recently upgraded a VMware vCenter installation from 4.1 to 5.0. As far as I remember, the install itself went well, but afterward I noticed that operations kept timing out when managing the vSphere environment. Upon further investigation I found that java.exe seemed to be the culprit. I tried updating Java, removing VMware Update Manager, and simply shutting down the vCenter and SQL services, but none of that helped. The CPU was still pegged at 100%. At that point I started shutting down services one at a time, starting with anything VMware related. Within a few minutes I found that the Converter services, which had been installed in the 4.1 environment but not updated, were the most likely culprits. I removed the programs related to Converter and CPU dropped to 0%. At this point I decided to reboot and make sure things were sane.

Unfortunately, after the reboot, the java.exe CPU issue returned. I started the troubleshooting process again, but this time with a little help. I downloaded and ran an old favorite of mine: Process Explorer from Microsoft/Sysinternals. It shows more information about each process, such as its running environment and which process is its parent. There were three separate java.exe processes running, but the most interesting one was launched by the vSphere Web Client services.
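The parent lookup that Process Explorer does can be sketched in a few lines of Python. This is only an illustration, not what I actually ran: the `Proc` record, the `java_parents` helper, and the sample process names and numbers are all made up. On a live box something like the third-party psutil library could supply the same fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    """One row of a process snapshot (hypothetical structure)."""
    pid: int
    ppid: int
    name: str
    cpu_percent: float


def java_parents(procs):
    """Return (java pid, cpu %, parent name) tuples, busiest first."""
    by_pid = {p.pid: p for p in procs}
    rows = []
    for p in procs:
        if p.name.lower() == "java.exe":
            parent = by_pid.get(p.ppid)
            parent_name = parent.name if parent else "<unknown>"
            rows.append((p.pid, p.cpu_percent, parent_name))
    return sorted(rows, key=lambda r: r[1], reverse=True)


# Illustrative snapshot only; these names and numbers are assumptions,
# not data captured from the server in this post.
SAMPLE = [
    Proc(100, 4, "services.exe", 0.0),
    Proc(200, 100, "wrapper.exe", 0.5),
    Proc(201, 200, "java.exe", 48.0),
    Proc(300, 100, "tomcat6.exe", 1.0),
    Proc(301, 300, "java.exe", 3.0),
]

if __name__ == "__main__":
    for pid, cpu, parent in java_parents(SAMPLE):
        print(f"java.exe pid={pid} cpu={cpu:.1f}% parent={parent}")
```

Sorting by CPU puts the java.exe worth chasing at the top, and the parent name tells you which service to look at next.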

I went ahead and removed the Web Client. While I was at it, I also removed the new Dump Collector and Syslog Collector services. Each prompted for a restart, so I rebooted and checked CPU usage again once the server came back up. It was still at 100%, with another java.exe fighting sqlserver.exe for the highest allocation. This time it was the Inventory Service. Even though the CPU was pegged, the server was more responsive than before, so I feel like I am making some progress at least. Back to Process Explorer...

While I was reading some articles about the Inventory Service to understand what it does, the CPU returned to a sane state. This must have been the Inventory Service catching up after the reboot. I'm still seeing sqlserver.exe running at a pretty constant 50% (essentially the equivalent of 1 CPU core), but I would halfway expect that. I also see Tomcat spike occasionally and, together with SQL, consume 100% CPU. At this point I am going to continue to monitor the situation. If I find anything else I will try to post.
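For anyone wondering how 50% works out to one core: Task Manager style percentages are averaged across all logical CPUs, so the conversion is simple arithmetic. A tiny sketch (the `busy_cores` function name is mine, purely for illustration):

```python
def busy_cores(overall_percent, logical_cpus):
    """Convert a whole-machine CPU percentage into an equivalent core count."""
    return overall_percent / 100.0 * logical_cpus


if __name__ == "__main__":
    # This vCenter VM has 2 vCPUs and sqlserver.exe sits around 50%.
    print(busy_cores(50, 2))
```

On a 2 vCPU VM, a single-threaded process that never yields will look like 50%, which is why that number did not surprise me much.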

Just for reference, the vCenter server in question is a VM with 2 vCPUs and 8GB RAM running on FC storage, managing 6 hosts with approximately 60 VMs, so based on the recommended specs I would assume this is sufficient. I may increase the memory to 12GB to test further, though. Maybe the next step is to finally move the SQL instance to a dedicated server, since we are technically over the supported limit of 5 hosts and 50 VMs (although I think that is more of a size restriction, but who am I to argue with VMware best practices).

*UPDATE: Well, I feel like an uber douche. I was getting ready to call it quits when I saw some alerts coming in from one of our monitoring tools saying that our newest host, the one where the vCenter server runs, was swapping memory in and out frequently. I started looking and found a fair amount of memory ballooned and swapped. This was totally unexpected because that host has 120GB of RAM and only about 60GB was actually in use. Upon further investigation I found that several VMs had limits set on their memory, including (you guessed it) the vCenter server. It was actually capped at 2GB! Not sure why this would ever have been set in this environment, but it has been addressed and resources look great again. Java hates being low on physical RAM. CPU is still getting hammered by SQL, but total memory commit is back down to less than 4GB.
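In hindsight, a quick audit for memory limits would have saved me a lot of time. Here is a rough sketch of the check, assuming you have already exported each VM's configured memory and memory limit into a list of dicts. The field names, the `capped_vms` helper, and the sample inventory are hypothetical, and I am treating -1 as "unlimited", which is how the vSphere API reports a limit that is not set.

```python
def capped_vms(vms):
    """Return (name, configured_mb, limit_mb) for VMs limited below their configured memory."""
    flagged = []
    for vm in vms:
        limit = vm["limit_mb"]
        # -1 is assumed to mean "no limit"; anything else below the
        # configured size forces ballooning/swapping under load.
        if limit != -1 and limit < vm["configured_mb"]:
            flagged.append((vm["name"], vm["configured_mb"], limit))
    return flagged


# Illustrative inventory; VM names are made up. The first entry mirrors the
# 8GB vCenter VM capped at 2GB described above.
INVENTORY = [
    {"name": "vcenter01", "configured_mb": 8192, "limit_mb": 2048},
    {"name": "web01", "configured_mb": 4096, "limit_mb": -1},
    {"name": "db01", "configured_mb": 16384, "limit_mb": 16384},
]

if __name__ == "__main__":
    for name, configured, limit in capped_vms(INVENTORY):
        print(f"{name}: configured {configured} MB but limited to {limit} MB")
```

A limit equal to the configured size is left alone, since it does not starve the guest; only limits below the configured memory get flagged.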

References:

Installing vCenter Server 5.0 best practices
http://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2003790

Upgrading to vCenter Server 5.0 best practices
http://kb.vmware.com/selfservice/documentLinkInt.do?micrositeID=&popup=true&languageId=&externalID=2003866

Minimum requirements for the VMware vCenter Server 5.x Appliance
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2005086

vSphere 5's new services in vCenter
http://geeksilver.wordpress.com/2011/08/31/vsphere-5-s-new-services-in-vcenter/

Update management service at 100% CPU after patching to 4.1.0 build 345043
http://communities.vmware.com/thread/306584