Friday, June 29, 2012

VMware View Composer, Linked Clones, DHCP & DNS Issues

***Update***

With the release of View 5.1, floating (stateless) linked clones now retain their MAC addresses, unlike in previous versions of View. As a result, this issue should no longer occur if you are running View 5.1 or newer.

The Scenario

You have an automated VMware View Composer floating linked-clone pool in an environment using Windows Active Directory DHCP and DNS. Whenever you remove or recompose linked-clone desktops, the old (outdated) DNS records are left in place, and in DHCP you see more than one lease for the same computer name. These old records are considered "stale". The DHCP lease will eventually drop off when it expires, but the DNS entry persists even after that. WHY IS THIS HAPPENING?!?
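Before digging into the cause, it can help to confirm the symptom. Here is a minimal Python sketch that flags computer names holding more than one lease. It assumes you have already exported the DHCP leases into (name, IP, MAC) tuples by whatever means you prefer; the function name and sample data are hypothetical and not part of View or Windows.

```python
from collections import defaultdict


def find_duplicate_leases(leases):
    """Return {name: [ips]} for every host name that holds more than one IP.

    `leases` is an iterable of (name, ip, mac) tuples (an assumed export format).
    Host names are compared case-insensitively.
    """
    ips_by_name = defaultdict(set)
    for name, ip, _mac in leases:
        ips_by_name[name.lower()].add(ip)
    # Keep only the names that appear with two or more distinct addresses.
    return {name: sorted(ips) for name, ips in ips_by_name.items() if len(ips) > 1}


# Hypothetical sample: VDI-01 was recomposed and picked up a second lease.
sample = [
    ("VDI-01", "10.0.0.5", "00:50:56:aa:aa:aa"),
    ("vdi-01", "10.0.0.9", "00:50:56:bb:bb:bb"),
    ("VDI-02", "10.0.0.6", "00:50:56:cc:cc:cc"),
]
duplicates = find_duplicate_leases(sample)
```

Any name that shows up in the result has at least one lease that is probably left over from a previous incarnation of that desktop.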

The Why

I want to start this off by saying this problem doesn't seem to be 100% uniform across all VMware View Composer implementations. It seems to depend on a given organization's security configuration within AD, DHCP and DNS. With that said, what I describe above and below is what I witnessed at a customer recently. Now, onto the explanation...

The above scenario is caused in part by the fact that desktops in floating linked-clone pools do not retain their MAC addresses when destroyed, so they get a new IP address when rebuilt. Yet the new VM is created with the same name as the previous VM, so it needs to be able to modify the existing DNS record that still persists. In some cases the security settings applied on the DNS server prevent DNS records from being modified by just any account/client. In the scenario I witnessed, the now stale DNS record that was created for the original VM's DHCP lease is still tied to that lease, and the new lease does not get an additional DNS record (while it is possible to have multiple records with the same IP address, each record's NAME property must be unique). In addition, the new VM's computer account (SID) does not have permission to modify the DNS record, so the DHCP Client cannot update it; running ipconfig /registerdns, for example, has no effect on the DNS record.
Note: In the described scenario you may see the old VM's computer account represented in the DNS object's security tab listed as something like "Account Unknown U-I-D-xxxxxx-xxxxxx-xxxxxxx-xxxxxx", which was originally listed as "COMPUTERNAME$" before the AD computer object's SID was changed by the newly created and joined VM. 
Here is the kicker: Even though "Discard A and PTR records when lease is deleted" is checked on the DHCP server, the DNS records persist after the DHCP lease reaches its time limit and expires.
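To check whether a particular desktop is affected, you can compare what DNS returns for its name against the IP address it currently holds. The following is only a rough Python sketch under stated assumptions: the function name, host name, and addresses are made up for illustration, and the resolver is passed in as a parameter so the logic can be tried without touching a real DNS server.

```python
import socket


def is_dns_stale(hostname, current_ip, resolve=socket.gethostbyname):
    """Return True if DNS answers for `hostname` with an address other than `current_ip`.

    `resolve` defaults to the standard library resolver; pass a stand-in to test.
    A name with no record at all is treated as missing, not stale.
    """
    try:
        resolved_ip = resolve(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return resolved_ip != current_ip


# Hypothetical example: DNS still answers with the pre-recompose address.
stale = is_dns_stale("vdi-01", "10.0.0.9", resolve=lambda name: "10.0.0.5")
fresh = is_dns_stale("vdi-01", "10.0.0.9", resolve=lambda name: "10.0.0.9")
```

In this example `stale` is True because the record still points at the old address, while `fresh` is False because the record matches the current lease.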

Additional Points

If no DNS record currently exists for the computer name, then a DNS record gets created when the computer gets a DHCP lease. This works for brand new desktops that never existed before and are being added for the first time. This also works if you manually delete both the DHCP lease AND the DNS record and then renew the IP, but that is not a viable solution in an automated desktop pool world... especially if we are talking about recomposing hundreds of desktops at a time.

One other note, as I found out from the blog post referenced below: if both the DHCP and DNS records are up to date and not stale, performing an ipconfig /release will in fact delete the DHCP lease and clean up the DNS A & PTR records. So I bet you can already guess the solution!

The Resolution

While VMware View Composer does not currently handle this situation automatically (as of View 5.1 anyway), there is a simple fix you can implement to resolve the issue. I originally found this solution on this blog post, from more than 1 year ago, down in the comment by "Heather" (wish I knew more about her to thank her for this!). Her solution to this problem was to perform an "ipconfig /release" from the View desktop VM as a shutdown script. She also notes that you need to have the desktop wait for a few seconds afterwards; otherwise the command runs for zero seconds and doesn't seem to process successfully.

So create a batch file that looks like the following:
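@echo off
rem Release the DHCP lease so the lease and its DNS A and PTR records can be cleaned up.
ipconfig /release
rem Wait a few seconds so the release has time to complete before shutdown continues.
timeout /t 5 /nobreak >nul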

Once you have this file created you can place it anywhere that the computer would have access to it. I would recommend placing it in the NETLOGON share on a domain controller for both access and replication across all domain controllers.

You then need to set up the linked-clone desktops to run this shutdown script in one of several ways:

  • Create an AD GPO with the shutdown script and apply it to the OU that contains all of your linked clones (recommended)
  • Edit the local computer GPO (gpedit.msc) of the parent image to add the shutdown script, create a snapshot and use this for all of your linked clone pools (this allows more granularity, but leaves room for error)
  • Edit each linked clone pool's settings and add the shutdown script to the "Power-off script" field of the Guest Customization tab (again, more granular, but you will need to remember to apply this to any future linked clone pools you create, which leaves room for error)

The Bottom Line

While this solution worked for the scenario I was faced with, it may not solve all issues that are similar to this. You will definitely need to adequately test any solution before you put it into production, and for goodness sake, TAKE NOTES!!! ;-)

6 comments:

  1. This is exactly what we are experiencing. To add to this, we are using VMware Persona Management, so we would need to ensure that the desktop has synced its persona changes before this shutdown script releases its IP; otherwise we would have further issues.
    ~thanks.. very useful

  2. Denis Palmer - What version of View are you running? This should have been resolved in version 5.1 and newer. As far as making sure it runs *after* View Persona Management syncs up, that is a tough one. If the sync doesn't finish before this script runs, you could add a sleep command and try to figure out how many seconds to give it. But doing this could lengthen the time it takes to release that desktop. The biggest drawback here is that the user is not able to get a new desktop until the logoff process completely finishes. I have a customer that is using View Persona Management, and it adds quite a bit of time to both the login and especially the logoff. When a user connects to a desktop and there is a problem with it (usually the printer mapping script didn't execute properly), they log off and try to get a new desktop. Because of how long it takes for their View Persona Management to sync at logoff, they tell me it can be as much as 30 minutes before that user can get a new desktop. Of course, this is indicative of a bigger problem with their View Persona Management implementation, but you get my meaning.

    BTW, I am no longer at VMware and now work for an amazing company called Nutanix. I still help customers with VMware questions as much as possible, but I no longer have direct access to the EUC Business Unit or Engineers at VMware. ;-)

  3. Thanks for this post Tim. Very helpful!

  4. AD GPO with the shutdown script worked great for me. Thanks for the helpful post!

    Replies
    1. Glad this helped you Tim Shaw! That's why I wrote it.

      It's shocking to me that it is still useful after all this time. ;-)
