VMware 5.1 DISA STIG and Hardening Guide PowerCLI

I had the challenge of applying the new, hot-off-the-press DISA STIG 5.1. It goes hand in hand with the VMware 5.1 hardening guide; in fact, those are the two resources I used to craft the PowerShell script. The vCenter and ESXi STIGs could use some PowerShell love too, but I want to focus on the VM STIG because it is the most time consuming if not scripted. First and foremost, you need to download and install the latest VMware PowerCLI. I created a file at D:\vmware stig\stig_vm.txt and put in the following fixes, extracted from reading the DISA STIG and the hardening guide:


From “Windows PowerShell ISE”, issue the following commands first to extend PowerShell’s capability:

Set-ExecutionPolicy RemoteSigned
Add-PsSnapin VMware.VimAutomation.Core
Okay, we are now ready for the meat and potatoes.

$stig_vm = Import-Csv 'D:\VMWARE STIG\stig_vm.txt' -Header Name,Value

# Apply the settings to a single VM (MY_VM1)
foreach ($line in $stig_vm) {
    New-AdvancedSetting -Entity MY_VM1 -Name ($line.Name) -Value ($line.Value) -Force -Confirm:$false | Select Entity, Name, Value
}

# Apply the settings to every VM
foreach ($line in $stig_vm) {
    Get-VM | New-AdvancedSetting -Name ($line.Name) -Value ($line.Value) -Force -Confirm:$false | Select Entity, Name, Value
}

# Check one setting on every VM
Get-VM | Get-AdvancedSetting -Name "isolation.tools.autoInstall.disable" | Select Entity, Name, Value

# List every advanced setting on a single VM
Get-VM MY_VM1 | Get-AdvancedSetting | Select Entity, Name, Value

# List every advanced setting on every VM
Get-VM | Get-AdvancedSetting | Select Entity, Name, Value

Unfortunately the script still takes a while to complete, which is a known PowerCLI issue, but it is still 100 times faster than doing it manually. There are more things I could add to the script, but this is just a quick and dirty post. You can also pipe the output to a CSV file, and you can make one for monthly auditing purposes.
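For the monthly audit idea, here is a minimal sketch in Python of how an exported CSV could be compared against the baseline file. It assumes the export was produced with something like `Export-Csv -NoTypeInformation`, so that it has an Entity,Name,Value header, and that stig_vm.txt holds headerless Name,Value lines as above. The function names and the sample data are made up for illustration. Note that a VM with no advanced settings at all would not appear in the export, so it would not be flagged.

```python
import csv
import io


def load_baseline(text):
    """Parse headerless Name,Value lines (the stig_vm.txt layout) into a dict."""
    reader = csv.reader(io.StringIO(text))
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


def find_deviations(baseline, export_text):
    """Return (entity, name, expected, actual) for each missing or wrong setting."""
    actual = {}
    entities = set()
    for row in csv.DictReader(io.StringIO(export_text)):
        entities.add(row["Entity"])
        actual[(row["Entity"], row["Name"])] = row["Value"]

    deviations = []
    for entity in sorted(entities):
        for name, expected in baseline.items():
            found = actual.get((entity, name))
            # A setting that is absent (None) counts as a deviation too.
            if found is None or found.lower() != expected.lower():
                deviations.append((entity, name, expected, found))
    return deviations


# Illustrative sample data only, not the real contents of stig_vm.txt.
BASELINE_TEXT = (
    "isolation.tools.autoInstall.disable,true\n"
    "isolation.tools.copy.disable,true\n"
)
EXPORT_TEXT = (
    "Entity,Name,Value\n"
    "MY_VM1,isolation.tools.autoInstall.disable,true\n"
    "MY_VM1,isolation.tools.copy.disable,true\n"
    "MY_VM2,isolation.tools.autoInstall.disable,false\n"
)

if __name__ == "__main__":
    for entity, name, expected, found in find_deviations(
        load_baseline(BASELINE_TEXT), EXPORT_TEXT
    ):
        print(entity, name, "expected", expected, "found", found)
```

In the sample, MY_VM2 is reported twice: once for a wrong value and once for a setting that is missing entirely.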


VMware View Single Sign On Timeout -1

Note: Please do not confuse VMware View SSO with vCenter SSO; they are not the same.

VMware View single sign-on (SSO) is enabled by default, which is excellent. The bad thing is that the default timeout is infinite (-1), which is very insecure. With no timeout, a bad guy could slip in behind you while you’re taking a bathroom break, back out to the “Desktop Library”, and choose a different VM that you are entitled to, and SSO will take the bad guy in without prompting for credentials.

The fix has been documented since the 4.6 release: http://pubs.vmware.com/view-50/index.jsp?topic=/com.vmware.view.administration.doc/GUID-DB5C245D-AD48-4598-A7C6-C8FC75FC3339.html

Multiple sites show you how to use the Windows built-in tool ADSI Edit, but I can see why a few folks still get lost following the instructions. If you follow the steps below, I can guarantee success.

1. Open ADSI Edit on one of the clustered View Connection Servers.    Tip: c:\windows\system32\adsiedit.msc

Without any spaces, type “cn=common,ou=global,ou=properties,dc=vdi,dc=vmware,dc=int” to connect to “localhost”.


2.  Right-click the “Common” folder, open its properties, and change the value of “pae-SSOCredentialCacheTimeout” from “-1” to whatever value you want in minutes (mine is 5 minutes).  Hit Apply.


This will replicate to all the View Connection Server replicas.  I hope you take the time to close this neglected security hole.  I wish VMware had chosen a more secure default value or exposed this setting in the web interface, but no big deal.

Cisco Archive command vs. AAA accounting for configuration and change management

I have spent the past few weeks configuring Cisco Secure ACS TACACS+ for Active Directory authentication and authorization.  AAA accounting for change management, however, proved to be difficult.  I have set up and used the “Archive” feature for years, but I did not know that I could send its log to a syslog server using “notify syslog”.  I actually prefer “Archive” to AAA accounting; it is so much simpler to set up.

Switch#config term
Switch(config)#archive
Switch(config-archive)#log config
Switch(config-archive-log-cfg)#logging enable
Switch(config-archive-log-cfg)#logging size 500
Switch(config-archive-log-cfg)#notify syslog


Switch#sh archive log config all

The configuration above tracks each user and every configuration command he/she issues, stores it on the local switch, and sends it to syslog.  Maybe in the next article I can cover Cisco Secure ACS, but there’s really nothing special there, although I am using the VM version of ACS v5.3, which is probably worth mentioning.

Powershell DOD CRLAutoCache

Certificate Revocation Lists (CRLs) are important for a smartcard authentication architecture.  I have mine cached locally so that, if our web connection goes down, users will still be able to log in using a smartcard.  This is even more important on DoD networks, since smartcard use is mandated for Windows authentication.  The alternative to CRLs is OCSP, which is the preferred protocol since it uses far less network resources, but it does not offer protection when the internet connection goes down.  I’m not going to do an in-depth explanation of the two since that is not my intention for this post.

DoD has a nifty tool called CRLAutoCache for automated downloading of CRLs.  Unfortunately, there is no place to specify a web proxy.  Modern networks use a web proxy for security purposes, so I created a PowerShell script that downloads ALLCRLZIP.ZIP and unzips it to an IIS server.  FYI, this IIS server is also my WSUS server, so I did not introduce another IIS instance.  You can just copy and paste the code and give it a .ps1 extension (mine is CRL_DOWNLOAD.ps1).

Here it is:

#Created by Totie Bash
#This part will download the zip through the web proxy
$wc = new-object System.Net.WebClient
$webproxy = new-object System.Net.WebProxy("http://proxy:80",$true)
$source = "http://crl.disa.mil/getcrlzip?ALL+CRL+ZIP"
$destination = "C:\inetpub\wwwroot\CRL\ALLCRLZIP.ZIP"
$wc.Proxy = $webproxy
$wc.DownloadFile($source, $destination)

#This part unzips all
$shell = new-object -com shell.application
$zip = $shell.NameSpace("C:\inetpub\wwwroot\CRL\ALLCRLZIP.ZIP")
$destination = $shell.NameSpace("C:\inetpub\wwwroot\CRL")
#0x14 = 0x4 (no progress dialog) + 0x10 (answer "Yes to All", so existing files are overwritten)
$destination.CopyHere($zip.Items(), 0x14)


– I then created a scheduled task that runs every day at 4 in the morning to execute the CRL_DOWNLOAD.ps1 script.  Note: I had to change the user that runs this task to “System” and check “Run with highest privileges”.

POWERSHELL -ExecutionPolicy Bypass -File "C:\WINDOWS\CRL_DOWNLOAD.ps1"

– After that, I pointed my Tumbleweed Desktop Validator (or any DV software) to the internal CRL address as my primary validation source and pushed the config through GPO.

Network Teaming Gone Wild. Can’t ping VM

If you want to save yourself from the sob story and just learn the fix: it’s just vMotion.

This year I had a few problems with my network teaming.  I don’t know if it affects the other load balancing algorithms, but I am using “Route based on IP hash” to be exact.  This issue is either not mentioned or the fix is hard to find on the web.  This is not a complaint against network teaming but rather a good Samaritan contribution to the VMware community.  I have been using teaming for four years now and I am very happy with my NIC teaming performance, and neither of the two outages I describe here is fair to blame on vSphere.  The first incident was on ESXi 5.1 and the second on version 5.1U1.

First Incident:  happened at the beginning of 2013.  The UPS feeding the Cisco 3750 switch, the same switch where the ESXi NIC teaming terminates, went down while the ESXi hosts, which are on a different UPS, remained up.

Second Incident:  happened on just one VM, a print server.  I know one of the guys in my group was messing with this switch earlier that day, so I am 70% sure that whatever he did was the root cause of that one VM’s NIC going down.  This particular one I fixed over the phone.  One of our admins called to let me know that he was going to restore this VM from VDP after two hours of battle.  He sounded defeated, but I recognized the symptoms and was able to advise him to try the fix I describe further down.

Symptoms:  The symptoms for these two events were identical.  The NIC appears to be up but you can’t ping the VM, and even when you log in locally on the VM you cannot ping anywhere either.  The first incident affected 90% of the VMs.  What made it unique and somewhat hard to troubleshoot is that it affected only some VMs and not all.  The second incident affected only one VM.  A reboot does not fix the issue.  Removing the NIC and assigning a new one does not fix it either, as one would think it should.

The Fix:  vMotion to a different ESXi host.  vMotion seems to jump-start the gears and get the juice going again.  I wish I could say more about the fix but I really can’t; if you notice, I am just typing this to make the fix seem complicated when it’s not.  One might ask, how did I know that vMotion would fix the issue?  My technical answer is “my gut told me so” =).  Honestly, I remember just stumbling on that fix.  On this issue it’s a shame that “jump-start” and “gut” are the closest I get to being technical.  Really, it’s just vMotion.

Notes:  Some fixes that I was not able to test: issuing a “reset” instead of a “reboot”; briefly unplugging the Ethernet cable and plugging it back in; and rebooting the ESXi host.

Reading material: If you want a more detailed explanation of how to troubleshoot “IP hash”, you should read Mike Da Costa’s article on it; my comments are at the bottom.  http://blogs.vmware.com/kb/2013/03/troubleshooting-network-teaming-problems-with-ip-hash.html#comment-8463

Summary:  There it is.  When you see the symptoms I described above and you are using IP hash NIC teaming, remember to include vMotion as the 4th or 5th option on your tool belt.

Fragmentation issue when DF=1, Route-map to the rescue

Probably a bad title, but it is the closest I can come up with.

I had a weird issue at work last week where all of the roughly 100 workstations at a remote office could not reach one specific website, yet there was no issue hitting the same website from the main headquarters site. The symptom: a user would type the URL and the browser would just sit there and time out after a minute or two.

The remote site and HQ are separated by an IPsec VPN device (KG175D), but the method works with any VPN device. A quick look at the Wireshark sniffer installed on a local machine at the remote site revealed that DNS resolution was perfectly good and the initial TCP SYN/ACK was good. So the first 4 to 6 packets were good, but when the client tried to do “GET HTTP.ASP”, no reply packet came back from the destination. I then put another sniffer at the main HQ site to get a second set of eyes on the packets, and I noticed that the destination actually WAS sending and resending the packets to the remote site. So why were they not getting through the VPN?

Upon further investigation of the missing packets, here is what was wrong with them: MSS=1380; Bytes=1502; DF=1; and the extra VPN device means extra headers on the packets. The big offender is the 1502 bytes. The quickest fix, which has served me very well in the past, is “IP TCP ADJUST-MSS 1300” on the remote site’s gateway router (1300 being a safe guesstimate). This normally does the trick: the two endpoints negotiate a smaller maximum segment size (MSS) at layer 4, so the application sends smaller packets rather than a huge 1502 bytes. Unfortunately, despite my configuration, and after verifying on the sniffer that the MSS I was advertising was indeed 1300, for reasons unbeknownst to me the destination still insisted on MSS 1380 and still sent me 1502 bytes (follow-up blog on this).

After gathering all this data, I made a diplomatic phone call to the owner of the offending website to get the issue straightened out. I explained that his website was not agreeing to my MSS 1300 request and that 1502 bytes with DF=1 was too big for me. To cut the sob story short, the diplomatic approach fell flat and I was left with only one option.
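To make the MSS reasoning above concrete, here is a rough sketch in Python of the arithmetic behind clamping the MSS. It assumes plain 20-byte IPv4 and TCP headers with no options. The 100-byte tunnel overhead and the 1500-byte link MTU are placeholders I picked for illustration, not measured figures for the VPN device in this story.

```python
IP_HEADER = 20   # IPv4 header without options
TCP_HEADER = 20  # TCP header without options


def ip_packet_size(mss):
    """Size of a full-sized TCP segment once the IP and TCP headers are added."""
    return mss + IP_HEADER + TCP_HEADER


def fits_tunnel(mss, link_mtu, tunnel_overhead):
    """True if a full-sized segment plus the VPN headers fits in one packet."""
    return ip_packet_size(mss) + tunnel_overhead <= link_mtu


def max_safe_mss(link_mtu, tunnel_overhead):
    """Largest MSS that still fits through the tunnel without fragmentation."""
    return link_mtu - tunnel_overhead - IP_HEADER - TCP_HEADER


if __name__ == "__main__":
    LINK_MTU = 1500          # placeholder: standard Ethernet MTU
    ASSUMED_OVERHEAD = 100   # placeholder: NOT a real figure for the VPN device
    for mss in (1380, 1300):
        print(mss, ip_packet_size(mss), fits_tunnel(mss, LINK_MTU, ASSUMED_OVERHEAD))
    print(max_safe_mss(LINK_MTU, ASSUMED_OVERHEAD))
```

The point is only that the MSS has to leave room for every header added along the path; the real overhead has to come from the VPN vendor or from a capture.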

The Band-aid Fix:
Just to restate the problem and the thought process behind the fix: the diplomatic approach and the MSS negotiation are dead ends. I know the issue is that 1502 bytes is too big and that the Don’t Fragment bit is set (DF=1), which forbids fragmentation. My only option is to flip the DF bit to “0” (zero) to allow fragmentation. I could do this on the VPN device, but I decided to do it at the headquarters’ last L3 device before the VPN device. It could be a router or an L3 switch; in my example it is a Cisco 6513 switch. Tune the access list properly so that it only touches traffic from the offending website and leaves the other traffic untouched.

route-map clear-df permit 36
match ip address 136
set ip df 0

::in the access list, the source host is the offending website and the destination is the remote site subnet. You can even add eq 80 if you want.
access-list 136 permit tcp host

::apply on interface:
ip policy route-map clear-df

After applying it to the interface, I verified through both Wireshark sniffers that the DF bit is now “0” (zero), and the webpage now works. Happy ending.

Note: Just to be clear, before I get ridiculed for allowing fragmentation: fragmentation is really not my desired solution for this issue; as a matter of fact, that is why I call this one a band-aid rather than a fix.  A modern, well-written application should not need fragmentation, hence most of them send with DF=1, but in this instance I really didn’t have much of an option other than to allow it.
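For anyone curious what clearing the DF bit changes, here is a small Python sketch of what a router does with an oversized IPv4 packet. It assumes a 20-byte IP header, treats the 1502 bytes as the IP total length, and uses a made-up path MTU of 1400; none of those numbers were measured on my network. The function name is just for illustration.

```python
def forward(total_length, df, path_mtu, ip_header=20):
    """Return the list of packet sizes sent onward, or None if the packet is dropped.

    With DF set, an oversized packet is dropped (the router is supposed to send
    back an ICMP "fragmentation needed" message). With DF clear, the payload is
    split into fragments whose data length is a multiple of 8 bytes, except
    for the last one.
    """
    if total_length <= path_mtu:
        return [total_length]
    if df:
        return None
    payload = total_length - ip_header
    chunk = (path_mtu - ip_header) // 8 * 8  # fragment data must align to 8 bytes
    sizes = []
    while payload > 0:
        part = min(chunk, payload)
        sizes.append(part + ip_header)       # every fragment carries its own IP header
        payload -= part
    return sizes


if __name__ == "__main__":
    ASSUMED_PATH_MTU = 1400  # placeholder value, not measured
    print(forward(1502, True, ASSUMED_PATH_MTU))
    print(forward(1502, False, ASSUMED_PATH_MTU))
```

With DF=1 the oversized packet goes nowhere, which matches what I saw on the sniffers; with DF=0 it is split in two and both pieces get through.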

Wow!! another Java zero day. Virtualized browser anyone?

I’m disabling Java completely just to be on the safe side until they release a patch. We just can’t get this right, huh, Oracle? Do we need Neo’s help? I wish I took the blue pill.

This would be a perfect use case for a virtualized IE or Chrome with the Java extension. Use the non-Java browser for day-to-day browsing and use the virtualized/sandboxed browser with Java only when an application needs it. ThinApp, Cameyo, XenApp, App-V and other application virtualization software can help with this.