Monday, February 09, 2009

Impact of ESX snapshot backups on Microsoft database servers

Up until Update 2, ESX 3.5 uses a VMware tool called the Sync Manager as part of the snapshot backup process. The Sync Manager quiesces the file system (pauses all incoming I/O requests and flushes dirty data to disk) before the snapshot backup is taken. This allows the backup to be file-system consistent.

If you were to take a snapshot without pausing the I/O requests and then restore it, the virtual machine would start up for the first time as if it were recovering from a power-failure type of crash. This is because the recovered system would not be able to find the I/O requests that were held in memory (RAM) at the time the snapshot was taken.

What does this have to do with database servers? For servers housing databases (Active Directory, Exchange, or SQL servers), stopping I/O requests with the Sync Manager halts incoming requests to the database without notifying the database of what is happening. The database is waiting on that information to arrive, so when the data doesn't come in when expected, errors are logged to the event log and, in some cases, the databases become corrupt. I happened to find this out on an Active Directory domain controller, and the sequence of errors looks like this:

First, you'll see event ID 1 logged with source LGTO_Sync. This is the Sync driver starting to do its work quiescing the file system.

On domain controllers, AD requests will begin to fail. The description will differ based on the request, but the Event ID stays the same.

For domain controllers running DNS, dynamic updates will fail as well:

For DHCP servers, you will see this:

On Exchange servers, you'll see Autodiscover errors:

If you are seeing these errors, stop using the Sync Manager now. Eventually you will corrupt your database.

Workaround
Stop quiescing guest database servers before taking snapshots, and start adding snapshots of virtual machine memory to your backups. Most backup applications allow you to do this. If yours doesn't, you can script it using vimsh.

Example:
vimsh -n -e "vmsvc/createsnapshot [VmId] [snapshotName] [snapshotDescription] [includeMemory]"

vimsh -n -e "vmsvc/createsnapshot XXXX FIRST_SNAPSHOT MY_FIRST_SNAPSHOT 1"

By taking a snapshot of the guest machine's memory, you are creating a full snapshot. When you restore, you restore the memory on top of the file system. When the machine starts, it will be able to access all the necessary information in memory to start normally - no crash.
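
If you need to script this for several guests, a small wrapper along these lines may help. This is only a sketch built around the createsnapshot syntax shown above: the function names (build_snapshot_cmd, take_snapshot) and the DRY_RUN variable are made up for illustration, and it assumes that includeMemory is passed as 1 to include memory and that the snapshot name and description contain no spaces. It defaults to a dry run that only prints the command, so you can check it before touching a real VM.

```shell
#!/bin/sh
# Sketch only: wraps the vimsh createsnapshot call shown above.
# Assumption: includeMemory is given as 1 (include memory) or 0 (skip it).
# Assumption: name and description contain no spaces.

# Build the vimsh command string: VmId, name, description, includeMemory.
build_snapshot_cmd() {
    vm_id=$1
    snap_name=$2
    snap_desc=$3
    include_memory=${4:-1}   # default to a memory snapshot
    printf 'vmsvc/createsnapshot %s %s %s %s' \
        "$vm_id" "$snap_name" "$snap_desc" "$include_memory"
}

# Print the full command (DRY_RUN=1, the default) or run it (DRY_RUN=0).
take_snapshot() {
    cmd=$(build_snapshot_cmd "$@")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "vimsh -n -e \"$cmd\""
    else
        vimsh -n -e "$cmd"
    fi
}
```

For example, after sourcing the script, take_snapshot XXXX NIGHTLY PRE_BACKUP prints the command it would run; set DRY_RUN=0 to actually run it on the ESX host.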

Resolution
To combat the Sync Manager problem, VMware has released Update 2 for ESX 3.5, which includes an ESX VSS tool that integrates with Microsoft VSS. It works by using the Windows operating system to hold I/O requests, eliminating the need for the Sync driver. When the operating system is in charge of halting its own I/O activity, the databases are notified that a backup is taking place. The databases can then pause their own processing of requests, and no errors occur.

This update is relatively new, and many third-party backup applications do not support Update 2 yet, which is why I have offered the workaround here.

One last note about Microsoft domain controllers and VMware snapshot backups
In 2006 (revised in Dec. 2008), Microsoft released KB 888794 (http://support.microsoft.com/kb/888794/en-us), which states that

"Active Directory does not support any method that restores a snapshot of the operating system or the volume the operating system resides on. This kind of method causes an update sequence number (USN) rollback. When a USN rollback occurs, the replication partners of the incorrectly restored domain controller may have inconsistent objects in their Active Directory databases. In this situation, you cannot make these objects consistent. "

In reality, the BurFlags registry value described in Microsoft KB290762 (http://support.microsoft.com/kb/290762) can be set so that the virtual DCs are nonauthoritative, and an existing domain controller can be set to authoritative. This will allow the USN to be overwritten by the authoritative domain controller, and no USN rollback will occur.


Monday, January 26, 2009

Creating a Virtual iSCSI SAN under ESX!







Important step here. ADD a hard drive to this config with the needed storage space and map it to SCSI 1:0 to ensure the rest works!


Launch the CMC Console, and choose to Configure RAID. Choose RAID (virtual) and it will configure your disk.


Create a management group:






Set an NTP server:




I chose a standard cluster for now.





Name your cluster:





Configure VIP:




Creating a test 30GB Volume:




If you end up waiting here and keep seeing this screen, launch the VSAN console and log in using your new credentials.




And we are DONE:




Now, it's time to add a server that will attach to this.


On the Servers Tab, click New Servers:




Now, I am using a Windows 2008 host for this. If you use anything older than Vista/2008, you need to install the iSCSI initiator from:
http://www.microsoft.com/downloads/details.aspx?familyid=12cb3c1a-15d6-4585-b385-befd1319f825


In 2008 and Vista, it is included in the Control Panel:




Makes good sense:





An invaluable resource for decoding this:




Since this is for LAB only, I went with a secret-less config:





After adding the server, we can assign the volume:









Once added, you can go to targets:




Choose log on:




Launch Disk Management, and you will see:




From here, it's simple - format the disk, and you can use it!


I should note at this point that 2008UTILITY is actually on Hyper-V, so this traffic is actually traversing multiple machines!

One thing to add: if you have any issues with the VSA, check here first: http://vsaforum.lefthandnetworks.com/


Sunday, January 11, 2009

ESX for home for UNDER $400!

I've been watching for ESX compatible hardware to get really cheap. I think I finally got the right mix.


Dude, I'm getting a Dell!


http://www.pacificgeek.com/product.asp?id=104457


$250 shipped.


+ $50 for 4GB of RAM

+ two 500GB drives I already have (free)


It's on the ESX whitebox list here:

http://www.vm-help.com/esx/esx3.5/Whiteboxes_SATA_Controllers_for_ESX_3.5_3i.htm


Blog post about it:

http://www.jasemccarty.com/blog/2008/07/vmware-esxi-optiplex-620.html

