Ask the Directory Services Team

The Mouse Will Play


Hey all, Ned here. Mike and I start teaching Windows Server 2012 and Windows 8 DS internals this month in the US and UK and won’t be back until July. Until then, Jonathan is – I can’t believe I’m saying this – in charge of AskDS. He’ll field your questions and publish… stuff. We’ll make sure he takes his medication before replying.

If you’re in Reading, England June 10-22, first round is on me.

image
I didn’t say what the first round was though.

Ned “crikey” Pyle


Important Information about Remote Desktop Licensing and Security Advisory 2718704


Hi folks, Jonathan here. Dave and I wanted to share some important information with you.

By now you’ve all been made aware of the Microsoft Security Advisory that was published this past Sunday.  If you are a Terminal Services or Remote Desktop Services administrator then we have some information of which you should be aware.  These are just some extra administrative steps you’ll need to follow the next time you have to obtain license key packs, transfer license key packs, or any other task that requires your Windows Server license information to be processed by the Microsoft Product Activation Clearinghouse.  Since there’s a high probability that you’ll have to do that at some point in the future we’re doing our part to help spread the word.  Our colleagues over at the Remote Desktop Services (Terminal Services) Team blog have posted all the pertinent information. Take a look.

Follow-up to Microsoft Security Advisory 2718704: Why and How to Reactivate License Servers in Terminal Services and Remote Desktop Services

If you have any questions, feel free to post them over in the Remote Desktop Services forum.

Jonathan Stephens

RSA Key Blocking is Here!


Hello everyone. Jonathan here again with another Public Service Announcement post.

Today, Microsoft has published a new Security Advisory:

Microsoft Security Advisory (2661254): Update For Minimum Certificate Key Length

The Security Advisory and the accompanying KB article have complete information about the software update, but the key takeaway is that this update is now available on the Download Center and the Microsoft Update Catalog. In addition, Microsoft will release this software update through Microsoft Update (aka Windows Update) in October 2012. So all of you enterprise customers have two months to start testing this update to see what impact it has in your environments.

If you want information on finding weak keys in your environment then review the KB article. It describes several methods you can use. Microsoft Support has also created a PowerShell script that has been posted to the TechNet Script Center.
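If you just want a quick look before digging into the official script, a rough sketch like the following enumerates the certificates in the local machine Personal store and flags any whose public key is shorter than 1024 bits. This is only an illustration and not the Microsoft Support script mentioned above; the store path and the 1024-bit cutoff are my assumptions, so adjust them to your own requirements.

# Sketch: list certificates in the local machine Personal store
# whose public key is shorter than 1024 bits (adjust the path and threshold as needed)
Get-ChildItem Cert:\LocalMachine\My |
    Where-Object { $_.PublicKey.Key.KeySize -lt 1024 } |
    Select-Object Subject, Thumbprint, NotAfter, @{ n='KeySize'; e={ $_.PublicKey.Key.KeySize } }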

Finally, a warning for those of you who use makecert.exe to create test certificates. By default, makecert.exe creates certificates that chain up to the Root Agency root CA certificate located in the Intermediate Certification Authorities store. The Root Agency CA certificate has a public key of 512 bits, so once you deploy this update no certificate created with makecert.exe will be considered valid.

You should now consider makecert.exe deprecated. As a replacement, starting with Windows 7 / Windows Server 2008 R2, you can use certreq.exe to create a self-signed certificate. For example, to create a self-signed code signing certificate you can create the following .INF file:

[NewRequest]
Subject = "CN=Self Signed Cert"
KeyLength = 2048
ProviderName = "Microsoft Enhanced Cryptographic Provider v1.0"
KeySpec = "AT_SIGNATURE"
KeyUsage = "CERT_DIGITAL_SIGNATURE_KEY_USAGE"
RequestType = Cert
SMIME = False
ValidityPeriod = Years
ValidityPeriodUnits = 2

[EnhancedKeyUsageExtension]
OID = 1.3.6.1.5.5.7.3.3

The important line above is the RequestType value. That tells certreq.exe to create a self-signed certificate. Along with that value, the ValidityPeriod and ValidityPeriodUnits values allow you to specify the lifetime of the self-signed certificate.

Once you create the .INF file, run the following command:

Certreq -new selfsigned.inf selfsigned.crt

This will take your .INF file and generate a new self-signed certificate that you can use for testing.
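If you want to confirm that the resulting certificate meets the new minimum, one quick way is to load the .crt file with PowerShell and inspect its key size. This is just a sketch; the C:\temp path is an assumption, so point it at wherever you generated the file.

# Load the generated certificate and check its public key length (path is an example)
$cert = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2 "C:\temp\selfsigned.crt"
$cert.PublicKey.Key.KeySize   # should report 2048 for the .INF shown above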

Ok, so this was supposed to be a short post pointing to where you need to go, but it turns out that I had some other related stuff. The important message here is go read the Security Advisory and the KB article.

Go read the Security Advisory and the KB article.

Ex pace.

Jonathan “I am the Key Master” Stephens

Updated Group Policy Search service


Mike here with an important service announcement.  In June of 2010, guest poster Kapil Mehra introduced the Group Policy Search service.  The Group Policy Search (GPS) service is a web application hosted on Windows Azure, which enables you to search for registry-based Group Policy settings used in Windows operating systems.

It’s a "plezz-shzaa" to announce that GPS version 1.1.4 is live at http://gps.cloudapp.net.  Version 1.1.4 includes registry-based policy settings from Windows 8 and Windows Server 2012, performance improvements, bug fixes, and a few little surprises.  It's the easiest way to search for a Group Policy setting. 

So, the next time you need to search for a Group Policy setting, or want to know the registry key and value name that backs a particular policy setting-- don't look for an antiquated settings spreadsheet reference.  Get your Group Policy Search on!!

And, if you act now-- we'll throw in the Group Policy Search Windows Phone 7 application-- for free! That's right, take Group Policy Search with you on the go. What an offer! Group Policy Search and Group Policy Search Windows Phone 7 application -- for one low, low price -- FREE!  Act now and you'll get free shipping.

This is Mike Stephens and "Ned Pyle" approves this message!

Let the Blogging begin…


Hello AskDS Readers. Mike here again. If you notice, Ned posted one of our first Windows Server 2012 RTM blogs a while back (Managing RID Issuance in Windows Server 2012). Yes friends, the gag order has been lifted and we are allowed to spout mountains of technical goodness about Windows Server 2012 and Windows 8.

"So much time and so little to do. Wait a minute. Strike that. Reverse it." Windows Server 2012 has many cool features that Ned and I have been waiting to share with you. Here is a 50,000-foot view of the technologies and features we are going to blog in the next few weeks and months-- in no specific order.

I'll start by highlighting some of the changes with security, PKI, authentication, and authorization. The Windows Server 2012 Certificate Services role has a few feature changes that should delight many of the certificate administrators out there. With new installation, deployment, and improved configuration-- it's probably the easiest certificate authority to configure.

Windows Server 2012 authentication is a healthy technology with a ton of technical goo just seeping at the seams, starting with the mac-daddy of them all-- Kerberos. In a few weeks, we will begin publishing the first of many installments of Kerberos changes in Windows 8/Windows Server 2012. As a teaser, the lineup includes KDC Proxy Server, the latest and greatest way to configure Kerberos Constrained Delegation-- "It really whips the lama's @#%." We'll take some exhaustive time explaining some Kerberos enhancements such as Kerberos Armoring and Compound Identity. We have tons more to share in the area of authentication including Virtual Smartcard Readers and Picture Password logon.

Advanced client security highlights features like Server Name Indication (SNI) for Windows Server 2012, Certificate Lifecycle Notification, Weak Key Protection (most of which is published in Jonathan Stephens' latest blog, RSA Key Blocking is Here!), Implicit binding, which is the infrastructure behind the new Centralized Certificate Store IIS feature, and Client certificate hints. Advanced client security also includes a wicked-cool security enhancement to PFX files and a new PKI module for Windows PowerShell.

At some point in our publishing timeline, we'll launch into the saga of all sagas, Dynamic Access Control. We've hosted guest posts here on AskDS to introduce this radical, amazingly cool new way to perform file-based authorization. This isn't your grandfather's authorization either. Dynamic Access Control or DAC as we’ll call it, requires planning, diligence, and an understanding of many dependencies, such as Active Directory, Kerberos, and effective access. Did I mention there are many knobs you must turn to configure it? No worries though, we'll break DAC down into consumable morsels that should make it easy for everyone to understand.

The concept of claims continues by showing you how to use Windows Server 2012's Active Directory Federation Services role to leverage claims issued by Windows domain controllers. Using AD FS, you can pass-through the Windows authorization claims or transform them into well-known SAML-based claim types.

No, I'm not done yet. I'm going to introduce a well-hidden feature that hasn't received much exposure, but has been labeled "pretty cool" by many training attendees. Access Denied Assistance is a gem of a feature that is locked away within the File Server Resource Manager (FSRM). It enables you to provide a SharePoint-like experience for users in Windows Explorer when they encounter an access denied or file not found error on a shared file or folder. Access Denied Assistance provides the user with a "Request Access" interface that sends an email to the share owner with details on the access requested and guidance the share owner can follow to remediate the problem. It's very slick.

Wait, there is more; this is just my list of topics to cover. Ned has a fun-bag full of Active Directory related material that he'll intermix with these topics to keep things fresh. I'm certain we'll sneak in a few extras that may not be directly related to Directory Services; however, they will help you make your Windows Server 2012 and Windows 8 experience much better. Need to run for now; this blog post just wrote checks my body can't cash.

The line above and below this were intentionally left blank using Microsoft Word 2013 Preview Edition

Mike "There's no earthly way of knowing; which direction they are going... There's no knowing where they're rowing..." Stephens

MaxTokenSize and Windows 8 and Windows Server 2012


Hello AskDS Populace, Mike here, and I want to share with you some of the excellent enhancements we made in Windows 8 and Windows Server 2012 around MaxTokenSize. Let’s review MaxTokenSize and its symptoms before we jump into the wonderful world of Windows 8 (say that three times fast).

Wonderful World of Windows 8
Wonderful World of Windows 8
Wonderful World of Windows 8

What is MaxTokenSize

Kerberos has been the default and preferred authentication protocol since the release of Windows 2000 Server. Over the last few years, Microsoft has made some significant investments in providing extensions to the protocol. One of those extensions to Kerberos is the Privilege Attribute Certificate, or PAC (defined in the Windows Server protocol specification MS-PAC).

Microsoft created the PAC to encapsulate authorization-related information in a manner consistent with RFC 4120. The authorization information included in the PAC includes security identifiers and user profile information such as full name, home directory, and bad password count. Security identifiers (SIDs) included in the PAC represent the user's current SID, any SID history entries, and security group memberships, covering current domain groups, resource domain groups, and universal groups.

Kerberos uses a buffer to store authorization information and reports this size to applications that use Kerberos for authentication. MaxTokenSize is the size of the buffer used to store authorization information. This buffer size is important because some protocols, such as RPC and HTTP, use it when they allocate memory for authentication. If the authorization data for a user attempting to authenticate is larger than the MaxTokenSize, then authentication fails for that connection using that protocol. This explains why authentication failures can occur when authenticating to IIS but not when authenticating to a folder shared on a file server. The default buffer size for Kerberos in Windows 7 and Windows Server 2008 R2 is 12k.

Windows 8 and Windows Server 2012

Let's face the facts of today's IT environment… authentication and authorization is not getting easier; it's becoming more complex. In the world of single sign-on and user claims, the amount of authorization data is increasing. Increasing authorization data in an infrastructure that has already had its experiences with authentication failures because a user was a member of too many groups justifies some concern for the future. Fortunately, Windows 8 and Windows Server 2012 have features to help us take proactive measures to avoid the problem.

Default MaxTokenSize

Windows 8 and Windows Server 2012 benefit from an increased MaxTokenSize of 48k. Therefore, when HTTP relies on the MaxTokenSize value for memory allocation, it allocates 48k of memory for the authentication buffer, which holds substantially more authorization information than in previous versions of Windows, where the default MaxTokenSize was only 12k.

Group Policy settings

Windows 8 and Windows Server 2012 introduce two new computer-based policy settings that help combat large service tickets, which are the cause of the MaxTokenSize dilemma. The first of these policy settings is not exactly new-- it has been in Windows for years, but only as a registry value. Use the policy setting Set maximum Kerberos SSPI context token buffer size to change the MaxTokenSize using group policy. Looking closely at this policy setting in the Group Policy Management Editor, you'll notice the icon for this setting is slightly different from the others around it.

clip_image001

This difference is attributed to the registry location the policy setting modifies when enabled or disabled. This policy setting writes to the actual MaxTokenSize registry key and value name that has been used in earlier versions of Windows:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters\MaxTokenSize

Therefore, you can use this computer-based policy setting to manage Windows 8, Windows Server 2012, and earlier versions of Windows. The catch here is that this registry location is not a managed policy location. Managed policy locations are removed and reapplied during policy refreshes to avoid persistent settings in the registry after the settings in a Group Policy object become out of scope. That behavior does not occur with this key, as the setting applied by this policy setting is not removed during application. Therefore, the policy setting persists even if the Group Policy object providing the setting falls out of scope.
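If you need to push a larger MaxTokenSize to machines by script rather than through the policy setting, a minimal sketch like the one below writes the same registry value the policy manages. The 48000 value mirrors the new Windows 8 default and is only an example; pick the size appropriate for your environment, and test before deploying broadly.

# Sketch: set MaxTokenSize directly in the registry (value in bytes; 48000 shown as an example)
$key = 'HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters'
if (-not (Test-Path $key)) { New-Item -Path $key -Force | Out-Null }
Set-ItemProperty -Path $key -Name MaxTokenSize -Value 48000 -Type DWord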

The second policy setting is very cool and answers the question that customers always asked when they encounter a problem with MaxTokenSize: "How big is the token?" You might be one of those people that went on the crusade of a lifetime using TOKENSZ.EXE and spent countless hours trying to determine the optimal MaxTokenSize for your environment. Those days are gone.

A new KDC policy setting, Warning events for large Kerberos tickets, provides you with a way to monitor the size of Kerberos tickets issued by KDCs. When you enable this policy setting, you must then configure a ticket threshold size. The KDC uses the ticket threshold size to determine if it should write a warning event to the system event log. If the KDC issues a ticket that exceeds the ticket threshold size, then it writes a warning. This policy setting, when enabled, defaults to 12k, which is the default MaxTokenSize of previous versions of Windows.

clip_image003

Ideally, if you use this policy setting, you'd likely want to set the ticket threshold value to approximately 1k less than your current MaxTokenSize. You want it lower than your current MaxTokenSize (unless you are using 12k, which is the minimum value) so you can use the warning events as a proactive measure to avoid an authentication failure due to an incorrectly sized buffer. Setting the threshold too low will just train you to ignore the Event 31 warnings because they'll become noise in the event log. Set it too high and you're likely to be blindsided by authentication failures rather than warning events.

clip_image004

Earlier I said that this policy setting solves your problems with fumbling with TOKENSZ and other utilities to determine MaxTokenSize-- here's how. If you examine the details of the Kerberos-Key-Distribution-Center Warning event ID 31, you'll notice that it gives you all the information you need to determine the optimal MaxTokenSize in your environment. In the following example, the user Ned is a member of over 1000 groups (he's very popular and a big deal on the Internet). When I attempted to log on as Ned using the RUNAS command, the KDC generated an Event ID 31. The event description provides you with the service principal name, the user principal name, the size of the ticket requested, and the size of the threshold. This enables you to aggregate all the Event 31s and identify the maximum ticket size requested. Armed with this information, you can set the optimal MaxTokenSize for your environment.

clip_image006
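To gather those warnings in bulk, something like the following sketch pulls Event ID 31 from the System log so you can review the ticket sizes reported in each description. The provider name used in the filter is my assumption of how the Kerberos-Key-Distribution-Center events are registered, so verify it against an actual event on your KDC before relying on it.

# Sketch: collect Kerberos-Key-Distribution-Center warning events (ID 31) from the System log
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'Microsoft-Windows-Kerberos-Key-Distribution-Center'
    Id           = 31
} | Select-Object TimeCreated, Message | Format-List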

KDC Resource SID Compression

Kerberos authentication inserts the security identifiers (SIDs) of the security principal, its SID history, and all the groups of which the user is a member, including universal groups and groups from the resource domain. Security principals with too many group memberships greatly affect the size of the authentication data. Sometimes the authentication data is larger than the allocated size reported by Kerberos to applications, which can cause authentication failures in some applications. Because SIDs from the resource domain share the same domain portion of the SID, these SIDs can be compressed by providing the resource domain SID only once for all SIDs in the resource domain.

Windows Server 2012 KDCs help reduce the size of the PAC by taking advantage of resource SID compression. By default, a Windows Server 2012 KDC will always compress resource SIDs. To compress resource SIDs, the KDC stores the SID of the resource domain of which the target resource is a member.  Then, it inserts only the RID portion of each resource SID into the ResourceGroupIds portion of the authentication data. 

Resource SID Compression reduces the size of each stored instance of a resource SID because the domain SID is stored once rather than with each instance. Without resource SID Compression, the KDC inserts all the SIDs added by the resource domain in the Extra-SID portion of the PAC structure, which is a list of SIDs.  [MS-KILE]

Interoperability

Other Kerberos implementations may not understand resource group compression and therefore are not compatible. In these scenarios, you may need to disable resource group compression to allow the Windows Server 2012 KDC to interoperate with the third-party Kerberos implementation.

Resource SID compression is on by default; however, you can disable it. You disable resource SID compression on a Windows Server 2012 KDC using the DisableResourceGroupsFields registry value under the HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\Kdc\Parameters registry key. This registry value has a DWORD registry value type. You completely disable resource SID compression when you set the registry value to 1. The KDC reads this configuration when building a service ticket. With this value set to 1, the KDC does not use resource SID compression when building the service ticket.
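For reference, here is a minimal sketch of setting that value from Windows PowerShell on the KDC. Creating the Parameters key first is an assumption on my part, since it may not exist until you need it.

# Sketch: disable resource SID compression on a Windows Server 2012 KDC
$key = 'HKLM:\Software\Microsoft\Windows\CurrentVersion\Policies\System\Kdc\Parameters'
if (-not (Test-Path $key)) { New-Item -Path $key -Force | Out-Null }
Set-ItemProperty -Path $key -Name DisableResourceGroupsFields -Value 1 -Type DWord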

Wrap up

There's the skinny on the Kerberos enhancements included in Windows 8 and Windows Server 2012 that specifically target large service ticket and MaxTokenSize scenarios. To summarize:

· Increased default MaxTokenSize from 12k to 48k

· New Group Policy setting to centrally manage MaxTokenSize

· New Group Policy setting to write warnings to the system event log when a service ticket exceeds a designated threshold

· New Resource SID compression to reduce the storage size of SIDs from the resource domain

Keep an eye out for more Windows 8 and Kerberos needful

Mike "~Mike" Stephens

Windows Server 2012 Shell game


Here's the scenario: you just downloaded the RTM ISO for Windows Server 2012 using your handy, dandy, "wondermus" Microsoft TechNet subscription. Using Hyper-V, you create a new virtual machine, mount the ISO, and breeze through the setup screens until you are mesmerized by the Newton's cradle-like experience of the circular progress indicator.

clip_image002

Click…click…click…click-- installation complete; the computer reboots.

You provide Windows Server with a new administrator password. Bam: done! Windows Server 2012 presents the credential provider screen and you logon using the newly created administrator account, and then…

Holy Shell, Batman! I don't have a desktop!

clip_image004

Hey everyone, Mike here again to bestow some Windows Server 2012 lovin'. The previously described scenario is not hypothetical-- many have experienced it when they installed the pre-release versions of Windows Server 2012. And it is likely to resurface as we move past Windows Server 2012 general availability on September 4. If you are new to Windows Server 2012, then you're likely one of those people staring at a command prompt window on your fresh installation. The reason you are staring at a command prompt is that Windows Server 2012's installation defaults to Server Core, and in your haste to try out our latest bits, you breezed right past the option to change it.

This may be old news for some of you, but it is likely that one or more of your colleagues is going to perform the very actions that I describe here. This is actually a fortunate circumstance as it enables me to introduce a new Windows Server 2012 feature.

clip_image006

There were two server installation types prior to Windows Server 2012: full and core. Core servers provide a low attack surface by removing the Windows Shell and Internet Explorer completely. However, it presented quite a challenge for many Windows administrators, as Windows PowerShell and command line utilities were the only methods available to manage the server and its roles locally (you could use most management consoles remotely).

Those same two server installation types return in Windows Server 2012; however, we have added a third installation type: Minimal Server Interface. Minimal Server Interface enables most local graphical user interface management tasks without requiring you to install the server's user interface or Internet Explorer. Minimal Server Interface is a full installation of Windows that excludes:

  • Internet Explorer
  • The Desktop
  • Windows Explorer
  • Windows 8-style application support
  • Multimedia support
  • Desktop Experience

Minimal Server Interface gives Windows administrators - who are not comfortable using Windows PowerShell as their only option - the benefit of a reduced attack surface and fewer reboot requirements (i.e., on Patch Tuesday), while keeping GUI management available as they ramp up their Windows PowerShell skills.

clip_image008

"Okay, Minimal Server Interface seems cool Mike, but I'm stuck at the command prompt and I want graphical tools. Now what?" If you were running an earlier version of Windows Server, my answer would be reinstall. However, you're running Windows Server 2012; therefore, my answer is "Install the Server Graphical Shell or Install Minimal Server Interface."

Windows Server 2012 enables you to change the shell installation option after you've completed the installation. This solves the problem if you are staring at a command prompt. However, it also solves the problem if you want to keep your attack surface low, but simply are a Windows PowerShell guru in waiting. You can choose Minimal Server Interface, or you can decide to add the Server Graphical Shell for a specific task and then remove it when you have completed that management task (understand, however, that adding or removing the Windows Shell requires you to restart the server).

Another scenario solved by the ability to add the Server Graphical Shell is that not all server-based applications work correctly on Server Core, or you cannot manage them on Server Core. Windows Server 2012 enables you to try the application on Minimal Server Interface, and if that does not work, you can change the server installation to include the Graphical Shell, which is the equivalent of the Server GUI installation option during setup (the one you breezed by during the initial setup).

Removing the Server Graphical Shell and Graphical Management Tools and Infrastructure

Removing the Server shell from a GUI installation of Windows is amazingly easy. Start Server Manager, click Manage, and click Remove Roles and Features. Select the target server and then click Features. Expand User Interfaces and Infrastructure.

To reduce a Windows Server 2012 GUI installation to a Minimal Server Interface installation, clear the Server Graphical Shell checkbox and complete the wizard. To reduce a Windows Server GUI installation to a Server Core installation, clear the Server Graphical Shell and Graphical Management Tools and Infrastructure check boxes and complete the wizard.

clip_image010

Alternatively, you can perform these same actions using the Server Manager module for Windows PowerShell, and it is probably a good idea to learn how to do this. I'll give you two reasons why: It's wicked fast to install and remove features and roles using Windows PowerShell and you need to learn it in order to add the Server Shell on a Windows Core or Minimal Server Interface installation.

Use the following command to view a list of the Server GUI components:

clip_image011

Get-WindowsFeature server-gui*

Give your attention to the Name column. You use this value with the Remove-WindowsFeature and Install-WindowsFeature PowerShell cmdlets.

To remove the server graphical shell, which reduces the GUI server installation to a Minimal Server Interface installation, run:

Remove-WindowsFeature Server-Gui-Shell

To remove the Graphical Management Tools and Infrastructure, which further reduces a Minimal Server Interface installation to a Server Core installation, run:

Remove-WindowsFeature Server-Gui-Mgmt-Infra

To remove the Graphical Management Tools and Infrastructure and the Server Graphical Shell, run:

Remove-WindowsFeature Server-Gui-Shell,Server-Gui-Mgmt-Infra

Adding Server Graphical Shell and Graphical Management Tools and Infrastructure

Adding Server Shell components to a Windows Server 2012 Core installation is a tad more involved than removing them. The first thing to understand with a Server Core installation is that the actual binaries for the Server Shell do not reside on the computer. This is how a Server Core installation achieves a smaller footprint. You can determine if the binaries are present by using the Get-WindowsFeature Windows PowerShell cmdlet and viewing the Install State column. The Removed value indicates the binaries that represent the feature do not reside on the hard drive. Therefore, you need to add the binaries to the installation before you can install them. Another indicator that the binaries do not exist in the installation is the error you receive when you try to install a feature that is removed. The Install-WindowsFeature cmdlet will proceed along as if it is working and then spend a lot of time around 63-68 percent before returning an error stating that it could not add the feature.

clip_image015
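A quick way to see which features are in that Removed state before you attempt an install is a small sketch like this; the Where-Object filter is just one way to slice the output.

# Sketch: list features whose binaries have been removed from the local installation
Get-WindowsFeature | Where-Object { $_.InstallState -eq 'Removed' } |
    Select-Object Name, DisplayName, InstallState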

To stage Server Shell features to a Windows Core Installation

You need to get out your handy, dandy media (or ISO) to stage the binaries into the installation. Windows installation files are stored in WIM files that are located in the \sources folder of your media. There are two .WIM files on the media. The WIM you want to use for this process is INSTALL.WIM.

clip_image017

You use DISM.EXE to display the installation images and their indexes that are included in the WIM file. There are four images in the INSTALL.WIM file. Images with the index of 1 and 3 are Server Core installation images for Standard and Datacenter, respectively. Images with the indexes 2 and 4 are GUI installations of Standard and Datacenter, respectively. Two of these images contain the GUI binaries and two do not. To stage these binaries to the current installation, you need to use indexes 2 and 4 because these images contain the Server GUI binaries. An attempt to stage the binaries using indexes 1 or 3 will fail.
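If you want to see those image names and indexes for yourself, the standard DISM query below lists them; adjust the drive letter to wherever your media is mounted.

# List the images (and their index numbers) contained in install.wim
dism.exe /Get-WimInfo /WimFile:D:\sources\install.wim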

You still use the Install-WindowsFeature cmdlet to stage the binaries to the computer; however, we are going to use the -source argument to inform Install-WindowsFeature of the image and index it should use to stage the Server Shell binaries. To do this, we use a special path syntax that indicates the binaries reside in a WIM file. The Windows PowerShell command should look like this:

Install-WindowsFeature server-gui-mgmt-infra,server-gui-shell -source:wim:d:\sources\install.wim:4

Pay particular attention to the path supplied to the -source argument. You need to prefix the path to your installation media's install.wim file with the keyword wim:. You need to suffix the path with :4, which represents the image index to use for the installation. You must always use an index of 2 or 4 to install the Server Shell components. The command should exhibit the same behavior as the previous one and proceed up to about 68 percent, at which point it will stay at 68 percent for quite a bit (if it is working). Typically, if there is a problem with the syntax or the command, it will error within two minutes of spinning at 68 percent. This process stages all the graphical user interface binaries that were not installed during the initial setup; so, give it a bit of time. When the command completes successfully, it should instruct you to restart the server. You can do this using Windows PowerShell by running the Restart-Computer cmdlet.

clip_image019

Give the next reboot more time. It is actually updating the current Windows installation, making all the other components aware the GUI is available. The server should reboot and inform you that it is configuring Windows features and is likely to spend some time at 15 percent. Be patient and give it time to complete. Windows should reach about 30 percent and then will restart.

clip_image021

It should return to the Configuring Windows features screen with the progress around 45 to 50 percent (these are estimates). The process should continue until 100 percent and then show you the Press Ctrl+Alt+Delete to sign in screen.

clip_image023

Done

That's it. Consider yourself informed. The next time one of your colleagues gazes at their accidental Windows Server 2012 Server Core installation with that deer-in-the-headlights look, you can whip out your mad Windows PowerShell skills and turn that Server Core installation into a Minimal Server Interface or Server GUI installation in no time.

Mike

"Voilà! In view, a humble vaudevillian veteran, cast vicariously as both victim and villain by the vicissitudes of Fate. This visage, no mere veneer of vanity, is a vestige of the vox populi, now vacant, vanished. However, this valorous visitation of a by-gone vexation, stands vivified and has vowed to vanquish these venal and virulent vermin van-guarding vice and vouchsafing the violently vicious and voracious violation of volition. The only verdict is vengeance; a vendetta, held as a votive, not in vain, for the value and veracity of such shall one day vindicate the vigilant and the virtuous. Verily, this vichyssoise of verbiage veers most verbose, so let me simply add that it's my very good honor to meet you and you may call me V."

Stephens

....And knowing is half the battle!


Revenge of Y2K and Other News


Hello sports fans!

So this has been a bit of a hectic time for us, as I'm sure you can imagine. Here's just some of the things that have been going on around here.

Last week, thanks to a failure on the time servers at USNO.NAVY.MIL, many customers experienced a time rollback to CY 2000 on their Active Directory domain controllers. Our team worked closely with the folks over at Premier Field Engineering to explain the problem, document resolutions for the various issues that might arise, and describe how to inoculate your DCs against a similar problem in the future. If you were affected by this problem then you need to read this post. If you weren't affected, and want to know why, then you need to read this post. Basically, we think you need to read this post. So...here's the link to the AskPFEPlat blog.

In other news, Ned Pyle has successfully infiltrated the Product Group and has started blogging on The Storage Team blog. His first post is up, and I'm sure there will be many more to follow. If you've missed Ned's rare blend of technical savvy and sausage-like prose, and you have an interest in Microsoft's DFSR and other storage technologies, then go check him out.

Finally...you've probably noticed the lack of activity here on the AskDS blog. Truthfully, that's been the result of a confluence of events -- Ned's departure, the Holiday season here in the US, and the intense interest in Windows 8 and Windows Server 2012 (and subsequent support calls). Never fear, however! I'm pleased to say that your questions to the blog have been coming in quite steadily, so this week I'll be posting an omnibus edition of the Mail Sack. We also have one or two more posts that will go up between now and the end of the year, so there's that to look forward to. Starting with the new calendar year, we'll get back to a semi-regular posting schedule as we get settled and build our queue of posts back up.

In the mean time, if you have questions about anything you see on the blog, don't hesitate to contact us.

Jonathan "time to make the donuts" Stephens

Intermittent Mail Sack: Must Remember to Write 2013 Edition


Hi all, Jonathan here again with the latest edition of the Intermittent Mail Sack. We've had some great questions over the last few weeks so I've got a lot of material to cover. This sack, we answer questions on:

Before we get started, however, I wanted to share information about a new service available to Premier customers through Microsoft Services Premier Support. Many Premier customers will be familiar with the Risk Assessment Program (RAP). Premier Support is now rolling out an online offering called the RAP as a Service (or RaaS for short). Our colleagues over on the Premier Field Engineering (PFE) blog have just posted a description of the new offering, and I encourage you to check it out. I've been working on the Active Directory RaaS offering since the early beta, and we've gotten really good feedback. Unfortunately, the offering is not yet available to non-Premier customers; look at RaaS as yet one more benefit to a Premier Support contract.

 

Now on to the Mail Sack!

Question

I'm considering upgrading my DFSR hub servers to Server 2012. Is there anything I should know before I hit the easy button and do an upgrade?

Answer

The most important thing to note is that Microsoft strongly discourages mixing Windows Server 2012 and legacy operating system DFSR. You just mentioned upgrading your hub servers, and make no mention of any branch servers. If you're going to upgrade your DFSR servers then you should upgrade all of them.

Check out Ned's post over on the FileCab blog: DFS Replication Improvements in Windows Server. Specifically, review the section that discusses Dynamic Access Control Support.

Also, there is a minor issue that has been found that we are still tracking. When you upgrade from Windows Server 2008 R2 to Windows Server 2012 the DFS Management snap-in stops working. The workaround is to just uninstall and then reinstall the DFS Management tools:

You can also do this with PowerShell:

Uninstall-WindowsFeature -name RSAT-DFS-Mgmt-Con
Install-WindowsFeature -name RSAT-DFS-Mgmt-Con

 

Question

From our SharePoint site, when users click on log-off then they get sent to this page: https://your_sts_server/adfs/ls/?wa=wsignout1.0.

We configured the FedAuth cookie to be session based after we did this:

$sts = Get-SPSecurityTokenServiceConfig 
$sts.UseSessionCookies = $true 
$sts.Update() 

 

The problem is, unless the user closes all their browsers, when they go to the log-in page the browser remembers their credentials. This is not acceptable because some PCs are shared by multiple people. Also, closing all browsers is not acceptable as users run multiple web applications.

Answer

(Courtesy of Adam Conkle)

Great question! I hope the following details help you in your deployment:

Moving from a persistent cookie to a session cookie with SharePoint 2010 was the right move in this scenario in order to guarantee that closing the browser window would terminate the session with SharePoint 2010.

When you sign out via SharePoint 2010 and are redirected to the STS URL containing the query string: wa=wsignout1.0, this is what we call a WS-Federation sign-out request. This call is sufficient for signing out of the STS as well as all relying parties signed into during the session.

However, what you are experiencing is expected behavior for how Integrated Windows Authentication (IWA) works with web browsers. If your web browser client experienced either a no-prompt sign-in (using Kerberos authentication for the currently signed in user), or NTLM, prompted sign-in (provided credentials in a Windows Authentication "401" credential prompt), then the browser will remember the Windows credentials for that host for the duration of the browser session.

If you were to collect a HTTP headers trace (Fiddler, HTTPWatch, etc.) of the current scenario, you will see that the wa=wsignout1.0 request is actually causing AD FS and SharePoint 2010 (and any other RPs involved) to clean up their session cookies (MSISAuth and FedAuth) as expected. The session is technically ending the way it should during sign-out. However, if the client keeps the current browser session open, browsing back to the SharePoint site will cause a new WS-Federation sign-in request to be sent to AD FS (wa=wsignin1.0). When the sign-in request is sent to AD FS, AD FS will attempt to collect credentials with a HTTP 401, but, this time, the browser has a set of Windows credentials ready to provide to that host.

The browser provides those Windows credentials without a prompt shown to the user, and the user is signed back into AD FS, and, thus, is signed back into SharePoint 2010. To the naked eye, it appears that sign-out is not working properly, while, in reality, the user is signing out and then signing back in again.

To conclude, this is by-design behavior for web browser clients. There are two workarounds available:

Workaround 1

Switch to forms-based authentication (FBA) for the AD FS Federation Service. The following article details this quick and easy process: AD FS 2.0: How to Change the Local Authentication Type

Workaround 2

Instruct your user base to always close their web browser when they have finished their session

Question

Are the attributes for files and folders used by Dynamic Access Control replicated with the object? That is, using DFSR, if I replicate the file to another server which uses the same policy, will the file have the same effective permissions on it?

Answer

(Courtesy of Mike Stephens)

Let me clarify some aspects of your question as I answer each part

When enabling Dynamic Access Control on files and folders there are multiple aspects to consider that are stored on the files and folders.

Resource Properties

Resource Properties are defined in AD and used as a template to stamp additional metadata on a file or folder that can be used during an authorization decision. That information is stored in an alternate data stream on the file or folder. This would replicate with the file, the same as the security descriptor.

Security Descriptor

The security descriptor replicates with the file or folder. Therefore, any conditional expression would replicate in the security descriptor.

All of this occurs outside of Dynamic Access Control -- it is a result of replicating the file throughout the topology, for example, if using DFSR. Central Access Policy has nothing to do with these results.

Central Access Policy

Central Access Policy is a way to distribute permissions without writing them directly to the DACL of a security descriptor. So, when a Central Access Policy is deployed to a server, the administrator must then link the policy to a folder on the file system. This linking is accomplished by inserting a special ACE in the auditing portion of the security descriptor that informs Windows that the file/folder is protected by a Central Access Policy. The permissions in the Central Access Policy are then combined with Share and NTFS permissions to create an effective permission.

If a file/folder is replicated to a server that does not have the Central Access Policy deployed to it, then the Central Access Policy is not valid on that server. The permissions would not apply.

Question

I read the post located here regarding the machine account password change in Active Directory.

Based on what I read, if I understand this correctly, the machine password change is generated by the client machine and not AD. I have been told (according to this post, inaccurately) that AD requires this password reset or the machine will be dropped from the domain.

I am a Macintosh systems administrator, and as you probably know, this issue does indeed occur on Mac systems.

I have reset the password reset interval to be various durations from fourteen days which is the default, to one day.

I have found that if I disjoin and rejoin the machine to the domain it will generate a new password and work just fine for 30 days. At that time, it will be dropped from the domain and have to be rejoined. This does not happen 100% of the time; however, it is often enough to be a problem for us, as we are a higher education institution which, in addition to our many PCs, also utilizes a substantial number of Macs. Additionally, we have a script which runs every 60 days to delete machine accounts from AD to keep it clean, so if the machine has been turned off for more than 60 days, the account no longer exists.

I know your forte is AD/Microsoft support, however I was hoping that you might be able to offer some input as to why this might fail on the Macs and if there is any solution which we could implement.

Other Mac admins have found workarounds like eliminating the need for the pw reset or exempting the macs from the script, but our security team does not want to do this.

Answer

(Courtesy of Mike Stephens)

Windows has a security policy feature named Domain member: Disable machine account password change, which determines whether the domain member periodically changes its computer account password. Typically, a Mac, Linux, or UNIX operating system uses some version of Samba to accomplish domain interoperability. I'm not familiar with the details on the Mac; however, on Linux, you would use the command:

net ads changetrustpw 

 

By default, Windows machines initiate a computer password change every 30 days. You could schedule this command to run every 30 days once it completes successfully. Beyond that, basically we can only tell you how to disable the domain controller from accepting computer password changes, which we do not encourage.
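On the Windows side, if you ever need to verify or manually refresh the secure channel password for comparison, Windows PowerShell (on Windows 7 and later) includes cmdlets for it. Here is a small sketch, run on the member computer, showing both the health check and an on-demand password reset.

# Verify the computer's secure channel with the domain
Test-ComputerSecureChannel -Verbose

# Force the computer to change its machine account password immediately
Reset-ComputerMachinePassword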

Question

I recently installed a new server running Windows 2008 R2 (as a DC) and a handful of client computers running Windows 7 Pro. On a client, which is shared by two users (userA and userB), I see the following event on the Event Viewer after userA logged on.

Event ID: 45058 
Source: LsaSrv 
Level: Information 
Description: 
A logon cache entry for user userB@domain.local was the oldest entry and was removed. The timestamp of this entry was 12/14/2012 08:49:02. 

 

All is working fine. Both userA and userB are able to log on on the domain by using this computer. Do you think I have to worry about this message or can I just safely ignore it?

Fyi, our users never work offline, only online.

Answer

By default, a Windows operating system will cache 10 domain user credentials locally. When the maximum number of credentials is cached and a new domain user logs onto the system, the oldest credential is purged from its slot in order to store the newest credential. This LsaSrv informational event simply records when this activity takes place. Once the cached credential is removed, it does not imply the account cannot be authenticated by a domain controller and cached again.

The number of "slots" available to store credentials is controlled by:

Registry path: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon
Setting Name: CachedLogonsCount
Data Type: REG_SZ
Value: Default value = 10 decimal, max value = 50 decimal, minimum value = 1
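To check what a given workstation is actually configured for, a one-liner sketch like this reads the value (remember it is stored as a string, REG_SZ, despite holding a number):

# Read the number of cached logon slots configured on this machine
(Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon').CachedLogonsCount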

Cached credentials can also be managed with group policy by configuring:

Group Policy Setting path: Computer Configuration\Policies\Windows Settings\Security Settings\Local Policies\Security Options.
Group Policy Setting: Interactive logon: Number of previous logons to cache (in case domain controller is not available)

The workstation must have connectivity with the domain, and the user must authenticate with a domain controller, in order to cache their credentials again once they have been purged from the system.

I suspect that your CachedLogonsCount value has been set to 1 on these clients, meaning that the workstation can only cache one user credential at a time.

Question

In Windows 7 and Server 2008 Kerberos DES encryption is disabled by default.

At what point will support for DES Kerberos encryption be removed? Does this happen in Windows 8 or Windows Server 2012, or will it happen in a future version of Windows?

Answer

DES is still available as an option on Windows 8 and Windows Server 2012, though it is disabled by default. It is too early to discuss the availability of DES in future versions of Windows right now.

There was an Advisory Memorandum published in 2005 by the Committee on National Security Systems (CNSS) stating that DES and all DES-based systems (3DES, DES-X) would be retired for all US Government uses by 2015. That memorandum, however, is not necessarily a binding document. It is expected that 3DES/DES-X will continue to be used in the private sector for the foreseeable future.

I'm afraid that we can't completely eliminate DES right now. All we can do is push it to the back burner in favor of newer and better algorithms like AES.

Question

I have two issuing certification authorities in our corporate network. All our approved certificate templates are published on both issuing CAs. We would like to enable certificate renewals from the Internet with our Internet-facing CEP/CES configured for certificate authentication in Certificate Renewal Mode Only. What we understand from the whitepaper is that this is not going to work because the CA that issued the certificate must be the same CA used for certificate renewal.

Answer

First, I need to correct an assumption made based on your reading of the whitepaper. There is no requirement that, when a certificate is renewed, the renewal request be sent to the same CA that issued the original certificate. This means that your clients can go to either enrollment server to renew the certificate. Here is the process for renewal:

  1. When the user attempts to renew their certificate via the MMC, Windows sends a request to the Certificate Enrollment Policy (CEP) server URL configured on the workstation. This request includes the template name of the certificate to be renewed.
  2. The CEP server queries Active Directory for a list of CAs capable of issuing certificates based on that template. This list will include the Certificate Enrollment Web Service (CES) URL associated with that CA. Each CA in your environment should have one or more instances of CES associated with it.
  3. The list of CES URLs is returned to the client. This list is unordered.
  4. The client randomly selects a URL from the list returned by the CEP server. This random selection ensures that renewal requests are spread across all returned CAs. In your case, if both CAs are configured to support the same template and the certificate is renewed 100 times, either with or without the same key, that should result in a nearly 50/50 distribution between the two CAs.

The behavior is slightly different if one of your CAs goes down for some reason. In that case, should clients encounter an error when trying to renew a certificate against one of the CES URIs, the client will fail over and use the next CES URI in the list. By having multiple CAs and CES servers, you gain high availability for certificate renewal.
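As a side note, if you want to drive a renewal from the command line instead of the MMC for testing, certreq has an enroll/renew verb on Windows 7 and later. The sketch below is an assumption about how that verb behaves in a CEP/CES environment; substitute the serial number or thumbprint of the certificate you want to renew, and verify the syntax on your client version before depending on it.

# Sketch: renew an existing certificate from the command line, reusing its key pair
certreq -enroll -user -cert <SerialNumber or Thumbprint> renew reusekeys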

Other Stuff

I'm very sad that I didn't see this until after the holidays. It definitely would have been on my Christmas list. A little pricey, but totally geek-tastic.

This was also on my list, this year. Go Science!

Please do keep those questions coming. We have another post in the hopper going up later in the week, and soon I hope to have some Windows Server 2012 goodness to share with you. From all of us on the Directory Services team, have a happy and prosperous New Year!

Jonathan "13th baktun" Stephens

 

 

Speaking in Ciphers and other Enigmatic tongues…update!


Hi! Jim Tierney here again to talk to you about Cryptographic Algorithms, SCHANNEL and other bits of wonderment. My original post on the topic has gone through a rewrite to bring you up to date on recent changes in this space.

So, your company purchases this new super awesome vulnerability and compliance management software suite, and they just ran a scan on your Windows Server 2008 domain controllers and lo! The software reports back that you have weak ciphers enabled, highlighted in RED, flashing, with that “you have failed” font, and including a link to the following Microsoft documentation –

KB245030 How to Restrict the Use of Certain Cryptographic Algorithms and Protocols in Schannel.dll:

http://support.microsoft.com/kb/245030/en-us

The report may look similar to this:

SSL Server Has SSLv2 Enabled Vulnerability port 3269/tcp over SSL

THREAT:


The Secure Socket Layer (SSL) protocol allows for secure communication between a client and a server.


There are known flaws in the SSLv2 protocol. A man-in-the-middle attacker can force the communication to a less secure level and then attempt to break the weak encryption. The attacker can also truncate encrypted messages.

SOLUTION:


Disable SSLv2.

Upon hearing this information, you fire up your browser, read the aforementioned KB 245030 top to bottom, RDP into your DCs, and begin checking the locations specified by the article. Much to your dismay, you notice the locations specified in the article are not correct for your Windows 2008 R2 DCs. On your 2008 R2 DCs you see the following at this registry location

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL:

clip_image001

“Darn you Microsoft documentation!!!!!!” you scream aloud as you shake your fist in the general direction of Redmond, WA….

This is how it looks on a Windows 2003 Server:

clip_image002

Easy now…

The registry keys and their content in Windows Server 2008, Windows 7, Windows Server 2008 R2, Windows Server 2012 and 2012 R2 look different from those in Windows Server 2003 and prior.

Here is the registry location on Windows 7 – 2012 R2 and its default contents:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel]
“EventLogging”=dword:00000001

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Ciphers]


[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\CipherSuites]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Hashes]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\KeyExchangeAlgorithms]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Client]
“DisabledByDefault”=dword:00000001

Allow me to explain the above content that is displayed in standard REGEDIT export format:

  • The Ciphers key should contain no values or subkeys
  • The CipherSuites key should contain no values or subkeys
  • The Hashes key should contain no values or subkeys
  • The KeyExchangeAlgorithms key should contain no values or subkeys

The Protocols key should contain the following sub-keys and value:

Protocols
      SSL 2.0
         Client
             DisabledByDefault REG_DWORD 0x00000001 (value)

The following table lists the Windows SCHANNEL protocols and whether or not they are enabled or disabled by default in each operating system listed:

[Table image: Windows SCHANNEL protocols and their default enabled/disabled state per operating system]

*Remember to install the following update if you plan on or are currently using SHA512 certificates:
SHA512 is disabled in Windows when you use TLS 1.2
http://support.microsoft.com/kb/2973337/EN-US

Similar to Windows Server 2003, these protocols can be disabled for the server or client architecture, meaning that either the protocol can be omitted from the list of supported protocols included in the Client Hello when initiating an SSL connection, or it can be disabled on the server so that even if a client requests SSL 2.0 in a Client Hello, the server will not respond with that protocol.

The Client and Server subkeys designate each side of the conversation for each protocol. You can disable a protocol for either the client or the server, but disabling Ciphers, Hashes, or CipherSuites affects BOTH client and server sides. You have to create the necessary subkeys beneath the Protocols key yourself to achieve this, since most of them do not exist by default.

For example:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Client]

“DisabledByDefault”=dword:00000001

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Server]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 3.0]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 3.0\Client]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 3.0\Server]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.0]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.0\Client]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.0\Server]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.1]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.1\Client]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.1\Server]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2\Client]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2\Server]

This is how it looks in the registry after they have been created:

clip_image005

Client SSL 2.0 is disabled by default on Windows Server 2008, 2008 R2, 2012 and 2012 R2. This means the computer will not use SSL 2.0 to initiate a Client Hello.

So it looks like this in the registry:

clip_image006

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Client]

DisabledByDefault =dword:00000001

Just like Ciphers and KeyExchangeAlgorithms, Protocols can be enabled or disabled.

To disable other protocols, select which side of the conversation you want to disable the protocol for, and add the “Enabled”=dword:00000000 value. The example below disables SSL 2.0 for the server in addition to SSL 2.0 for the client.

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Client]

DisabledByDefault =dword:00000001 <Default client disabled as I said earlier>

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\SSL 2.0\Server]

Enabled =dword:00000000 <disables SSL 2.0 server side>

clip_image007

After this, you will need to reboot the server. You probably do not want to disable TLS settings. I just added them here for a visual reference.

***For Windows Server 2008 R2, if you want to enable server-side TLS 1.1 and 1.2, you MUST create the registry entries as follows:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.1\Server]

DisabledByDefault =dword:00000000

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2\Server]

DisabledByDefault =dword:00000000
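
If you would rather script these entries than import a .reg file, a minimal PowerShell sketch like the one below does the same thing. Run it elevated on the Windows Server 2008 R2 machine and reboot afterward; the Enabled value is not part of the entries above and is included only to make the intent explicit, since the protocols are enabled by default.

# Sketch: create the TLS 1.1 and TLS 1.2 Server subkeys and values on Windows Server 2008 R2
$protoRoot = 'HKLM:\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols'
foreach ($proto in 'TLS 1.1','TLS 1.2') {
    $serverKey = Join-Path $protoRoot "$proto\Server"
    New-Item -Path $serverKey -Force | Out-Null    # creates any missing intermediate keys
    New-ItemProperty -Path $serverKey -Name 'DisabledByDefault' -Value 0 -PropertyType DWord -Force | Out-Null
    New-ItemProperty -Path $serverKey -Name 'Enabled' -Value 1 -PropertyType DWord -Force | Out-Null
}
# Reboot the server so SCHANNEL picks up the change.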

So why would you go through all this trouble to disable protocols and such, anyway? Well, there may be a regulatory requirement that your company’s web servers should only support Federal Information Processing Standards (FIPS) 140-1/2 certified cryptographic algorithms and protocols. Currently, TLS is the only protocol that satisfies such a requirement. Luckily, enforcing this compliant behavior does not require you to manually modify registry settings as described above. You can enforce FIPS compliance via group policy as explained by the following:

The effects of enabling the “System cryptography: Use FIPS compliant algorithms for encryption, hashing, and signing” security setting in Windows XP and in later versions of Windows
http://support.microsoft.com/kb/811833

The 811833 article talks specifically about the group policy setting below, which by default is NOT defined –

Computer Configuration\ Windows Settings \Security Settings \Local Policies\ Security Options

clip_image008

When applied, the policy above modifies the following registry locations and their value content.

Be advised that this FipsAlgorithmPolicy information is stored differently depending on the operating system version:

Windows 7/2008

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\FipsAlgorithmPolicy]

“Enabled”=dword:00000000 <Default is disabled>



Windows 2003/XP


Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa]

Fipsalgorithmpolicy =dword:00000000 <Default is disabled>

Enabling this group policy setting effectively disables everything except TLS.
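
If you just want to confirm whether FIPS enforcement is in effect on a given Windows 7 / Windows Server 2008 or later computer, a quick read of the registry value above is enough. A minimal PowerShell sketch:

# Check the FIPS policy value (0 or missing = not enforced, 1 = enforced)
$fips = Get-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\FipsAlgorithmPolicy' -Name 'Enabled' -ErrorAction SilentlyContinue
if ($fips -and $fips.Enabled -eq 1) {
    'FIPS-compliant algorithms are enforced; SCHANNEL will only negotiate TLS.'
} else {
    'The FIPS policy is not enforced on this computer.'
}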

More Examples

Let’s continue with more examples. A vulnerability report may also indicate the presence of other Ciphers it deems to be “weak”.

Below I have built a .reg file that when imported will disable the following Ciphers:

56-bit DES

40-bit RC4

Behold!

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\AES 128]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\AES 256]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\DES 56]

“Enabled”=dword:00000000

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\NULL]

“Enabled”=dword:00000000

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\RC4 128/128]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\RC4 40/128]

“Enabled”=dword:00000000

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\RC4 56/128]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Ciphers\Triple DES 168]

After importing these registry settings, you must reboot the server.

The vulnerability report might also mention that 40-bit DES is enabled, but that would be a false positive because Windows Server 2008 doesn’t support 40-bit DES at all. For example, you might see this in a vulnerability report:

Here is the list of weak SSL ciphers supported by the remote server:

Low Strength Ciphers (< 56-bit key)

SSLv3

EXP-ADH-DES-CBC-SHA Kx=DH(512) Au=None Enc=DES(40) Mac=SHA1 export

TLSv1


EXP-ADH-DES-CBC-SHA Kx=DH(512) Au=None Enc=DES(40) Mac=SHA1 export

If this is reported and it is necessary to get rid of these entries, you can also disable the Diffie-Hellman key exchange algorithm (another component of the two cipher suites described above, designated with Kx=DH(512)).

To do this, make the following registry changes:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\KeyExchangeAlgorithms\Diffie-Hellman]


“Enabled”=dword:00000000

You have to create the sub-key Diffie-Hellman yourself. Make this change and reboot the server.

This step is NOT advised or required. I am simply offering it as an option to make the vulnerability scanning tool pass the test.

Keep in mind, also, that this will disable any cipher suite that relies upon Diffie-Hellman for key exchange.

You will probably not want to disable ANY cipher suites that rely on Diffie-Hellman. Secure communications such as IPSec and SSL both use Diffie-Hellman for key exchange. If you are running OpenVPN on a Linux/Unix server you are probably using Diffie-Hellman for key exchange. The point I am trying to make here is you should not have to disable the Diffie-Hellman Key Exchange algorithm to satisfy a vulnerability scan.

Advanced Ciphers have arrived!!!

Advanced ciphers were added to Windows 8.1 / Windows Server 2012 R2 computers by KB 2929781, released in April 2014, and again by monthly rollup KB 2919355, released in May 2014.

Updated cipher suites were released as part of two fixes.

KB 2919355 for Windows 8.1 and Windows Server 2012 R2 computers

MS14-066 for Windows 7 and Windows 8 clients and Windows Server 2008 R2 and Windows Server 2012 Servers.

While these updates shipped new cipher suites, the cipher suite priority ordering could not be updated correctly.

KB 3042058, released in March 2015, is a follow-up package that corrects that issue. It is NOT applicable to Windows Server 2008 (non-R2).

You can set a preference list for which cipher suites the server will negotiate first with a client that supports them.

You can review this MSDN article on how to set the cipher suite prioritization list via GPO: http://msdn.microsoft.com/en-us/library/windows/desktop/bb870930(v=vs.85).aspx#adding__removing__and_prioritizing_cipher_suites

Default location and ordering of Cipher Suites:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\SSL\00010002

clip_image010

Location of Cipher Suite ordering that is modified by setting this group policy –

Computer Configuration\Administrative Templates\Network\SSL Configuration Settings\SSL Cipher Suite Order

clip_image012

When the SSL Cipher Suite Order group policy is modified and applied successfully it modifies the following location in the registry:

HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Cryptography\Configuration\SSL\00010002

The Group Policy would dictate the effective cipher suites. Once this policy is applied, the settings here take precedence over what is in the default location. The GPO should override anything else configured on the computer. The Microsoft Schannel team does not support directly manipulating the registry.

Group Policy settings are domain settings configured by a domain administrator and should always have precedence over local settings configured by local administrators.
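
If you want to see which ordering is actually in effect on a given computer, you can read both locations from PowerShell. This is only a sketch; the value name Functions is an assumption based on the default location shown above, so verify it in regedit before relying on it.

# Show the effective cipher suite order on this computer
$defaultKey = 'HKLM:\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\SSL\00010002'
$policyKey  = 'HKLM:\SOFTWARE\Policies\Microsoft\Cryptography\Configuration\SSL\00010002'
$policy = (Get-ItemProperty -Path $policyKey -ErrorAction SilentlyContinue).Functions
if ($policy) {
    'Cipher suite order enforced by group policy:'
    $policy -split ','          # the policy value is a single comma-separated string
} else {
    'No group policy order found; the default order applies:'
    (Get-ItemProperty -Path $defaultKey).Functions
}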

Below are two cipher suites that were introduced through the June 2016 rollup – https://support.microsoft.com/en-us/kb/3161639
These were added to help with interoperability for older applications, since RC4 will soon be deprecated.
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
TLS_DHE_RSA_WITH_AES_256_CBC_SHA

Since these additional cipher suites are now available on clients initiating an SSL connection, any server that has a weak DHE key length under 1024 bits will be rejected by Windows clients.
Below is an explanation of this behavior from the KB that updated Windows 7 clients (Windows 10 has always acted in this manner). https://support.microsoft.com/en-us/kb/3061518
“This security update resolves a vulnerability in Windows. The vulnerability could allow information disclosure when Secure Channel (Schannel) allows the use of a weak Diffie-Hellman ephemeral (DHE) key length of 512 bits in an encrypted Transport Layer Security (TLS) session. Allowing 512-bit DHE keys makes DHE key exchanges weak and vulnerable to various attacks. For an attack to be successful, a server has to support 512-bit DHE key lengths. Windows TLS servers send a default DHE key length of 1,024 bits.”

Being secure is a good thing, and depending on your environment, it may be necessary to restrict certain cryptographic algorithms from use. Just make sure you do your due diligence in testing these settings. It is also well worth your time to really understand how the security vulnerability software your company just purchased does its testing. A double-sided network trace will reveal both sides of the client-server hello and which cryptographic algorithms are being offered from each side over the wire.

Jim “Insert cryptic witticism here” Tierney

Updates:

8/29/16: Added information about June 2016 rollup

Does your logon hang after a password change on win 8.1 /2012 R2/win10?


Hi, Linda Taylor here, Senior Escalation Engineer from the Directory Services team in the UK.

I have been working on an issue that seems to be affecting many of you globally on Windows 8.1, 2012 R2 and Windows 10, so I thought it would be a good idea to explain the issue and workarounds while we continue to work on a proper fix.

The symptoms are such that after a password change, logon hangs forever on the welcome screen:

clip_image002

How annoying….

The underlying issue is a deadlock between several components including DPAPI and the redirector.

For full details or the issue, workarounds and related fixes check out my post on the ASKPFEPLAT blog here http://blogs.technet.com/b/askpfeplat/archive/2016/01/11/does-your-win-8-1-2012-r2-win10-logon-hang-after-a-password-change.aspx

This is now fixed in the following updates:

Windows 8.1, 2012 R2, 2012 install:

For Windows 10 TH2 build 1511 install:

I hope this helps,

Linda

Previewing Server 2016 TP4: Temporary Group Memberships


Disclaimer: Windows Server 2016 is still in a Technical Preview state – the information contained in this post may become inaccurate in the future as the product continues to evolve. More specifically, there are still issues being ironed out in other parts of Privileged Access Management in Technical Preview 4 for multi-forest deployments.   Watch for more updates as we get closer to general availability!

Hello, Ryan Ries here again with some juicy new Active Directory hotness. Windows Server 2016 is right around the corner, and it’s bringing a ton of new features and improvements with it. Today we’re going to talk about one of the new things you’ll be seeing in Active Directory, which you might see referred to as “expiring links,” or what I like to call “temporary group memberships.”

One of the challenges that every security-conscious Active Directory administrator has faced is how to deal with contractors, vendors, temporary employees and anyone else who needs temporary access to resources within your Active Directory environment. Let’s pretend that your Information Security team wants to perform an automated vulnerability scan of all the devices on your network, and to do this, they will need a service account with Domain Administrator privileges for 5 business days. Because you are a wise AD administrator, you don’t like the idea of this service account that will be authenticating against every device on the network having Domain Administrator privileges, but the CTO of the company says that you have to give the InfoSec team what they want.

(Trust me, this stuff really happens.)

So you strike a compromise, claiming that you will grant this service account temporary membership in the Domain Admins group for 5 days while the InfoSec team conducts their vulnerability scan. Now you could just manually remove the service account from the group after 5 days, but you are a busy admin and you know you’re going to forget to do that. You could also set up a scheduled task to run after 5 days that runs a script that removes the service account from the Domain Admins group, but let’s explore a couple of more interesting options.
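
For reference, that scheduled-task approach might look something like the sketch below on Windows 8 / Windows Server 2012 or later. The task name is arbitrary, the account name is just the service account used later in this post, and the RSAT Active Directory module is assumed to be installed.

# One-shot cleanup task that removes the service account from Domain Admins in 5 days
$removal = '-NoProfile -Command "Remove-ADGroupMember -Identity ''Domain Admins'' -Members ''InfoSecSvcAcct'' -Confirm:$false"'
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument $removal
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date).AddDays(5)
Register-ScheduledTask -TaskName 'Remove temporary Domain Admin' -Action $action -Trigger $trigger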

The Old Way

One old-school way of accomplishing this is through the use of dynamic objects in 2003 and later. Dynamic objects are automatically deleted (leaving no tombstone behind) after their entryTTL expires. Using this knowledge, our plan is to create a security group called “Temp DA for InfoSec” as a dynamic object with a TTL (time-to-live) of 5 days. Then we’re going to put the service account into the temporary security group. Then we are going to add the temporary security group to the Domain Admins group. The service account is now a member of Domain Admins because of the nested group membership, and once the temporary security group automatically disappears in 5 days, the nested group membership will be broken and the service account will no longer be a member of Domain Admins.

Creating dynamic objects is not as simple as just right-clicking in AD Users & Computers and selecting “New > Dynamic Object,” but it’s still pretty easy if you use ldifde.exe and a simple text file. Below is an example:

clip_image002
Figure 1: Creating a Dynamic Object with ldifde.exe.

dn: cn=Temp DA For InfoSec,ou=Information Security,dc=adatum,dc=com
changeType: add
objectClass: group
objectClass: dynamicObject
entryTTL: 432000
sAMAccountName: Temp DA For InfoSec

In the text file, just supply the distinguished name of the security group you want to create, and make sure it has both the group objectClass and the dynamicObject objectClass. I set the entryTTL to 432000 in the screen shot above, which is 5 days in seconds. Import the object into AD using the following command:
  ldifde -i -f dynamicGroup.txt

Now if you go look at the newly-created group in AD Users & Computers, you’ll see that it has an entryTTL attribute that is steadily counting down to 0:

clip_image004
Figure 2: Dynamic Security Group with an expiry date.

You can create all sorts of objects as Dynamic Objects by the way, not just groups. But enough about that. We came here to see how the situation has improved in Windows Server 2016. I think you’ll like it better than the somewhat convoluted Dynamic Objects solution I just described.

The New Hotness (Windows Server 2016 Technical Preview 4, version 1511.10586.122)

For our next trick, we’ll need to enable the Privileged Access Management Feature in our Windows Server 2016 forest. Another example of an optional feature is the AD Recycle Bin. Keep in mind that just like the AD Recycle Bin, once you enable the Privileged Access Management feature in your forest, you can’t turn it off. This feature also requires a Windows Server 2016 or “Windows Threshold” forest functional level:

clip_image006
Figure 3: This AD Optional Feature requires a Windows Server 2016 or “Windows Threshold” Forest Functional Level.

It’s easy to enable with PowerShell:
Enable-ADOptionalFeature ‘Privileged Access Management Feature’ -Scope ForestOrConfigurationSet -Target adatum.com

Now that you’ve done this, you can start setting time limits on group memberships directly. It’s so easy:
Add-ADGroupMember -Identity ‘Domain Admins’ -Members ‘InfoSecSvcAcct’ -MemberTimeToLive (New-TimeSpan -Days 5)

Now isn’t that a little easier and more straightforward? Our InfoSec service account now has temporary membership in the Domain Admins group for 5 days. And if you want to view the time remaining in a temporary group membership in real time:
Get-ADGroup ‘Domain Admins’ -Property member -ShowMemberTimeToLive

clip_image008
Figure 4: Viewing the time-to-live on a temporary group membership.

So that’s cool, but in addition to convenience, there is a real security benefit to this feature that we’ve never had before. I’d be remiss not to mention that with the new Privileged Access Management feature, when you add a temporary group membership like this, the domain controller will actually constrain the Kerberos TGT lifetime to the shortest TTL that the user currently has. What that means is that if a user account only has 5 minutes left in its Domain Admins membership when it logs on, the domain controller will give that account a TGT that’s only good for 5 more minutes before it has to be renewed, and when it is renewed, the PAC (privilege attribute certificate) will no longer contain that group membership! You can see this in action using klist.exe:

clip_image010
Figure 5: My Kerberos ticket is only good for about 8 minutes because of my soon-to-expire group membership.

Awesome.

Lastly, it’s worth noting that this is just one small aspect of the upcoming Privileged Access Management feature in Windows Server 2016. There’s much more to it, like shadow security principals, bastion forests, new integrations with Microsoft Identity Manager, and more. Read more about what’s new in Windows Server 2016 here.

Until next time,

Ryan “Domain Admin for a Minute” Ries


Updated 3/21/16 with additional text in Disclaimer – “Disclaimer: Server 2016 is still in a Technical Preview state – the information contained in this post may become inaccurate in the future as the product continues to evolve.  More specifically, there are still issues being ironed out in other parts of Privileged Access Management in Technical Preview 4 for multi-forest deployments.   Watch for more updates as we get closer to general availability!”

Are your DCs too busy to be monitored?: AD Data Collector Set solutions for long report compile times or report data deletion


Hi all, Herbert Mauerer here. In this post we’re back to talk about the built-in AD Diagnostics Data collector set available for Active Directory Performance (ADPERF) issues and how to ensure a useful report is generated when your DCs are under heavy load.

Why are my domain controllers so busy, you ask? Consider this: Active Directory stands at the center of identity management for many customers. It stores the configuration information for many critical line-of-business applications. It houses certificate templates, is used to distribute group policy, and is the account database, among many other things. All sorts of network-based services use Active Directory for authentication and other services.

As mentioned there are many applications which store their configuration in Active Directory, including the details of the user context relative to the application, plus objects specifically created for the use of these applications.

There are also applications that use Active Directory as a store to synchronize directory data. There are products like Forefront Identity Manager (and now Microsoft Identity Manager) where synchronizing data is the only purpose. I will not discuss whether these applications are meta-directories or virtual directories, or what class our Office 365 DirSync belongs to…

One way or the other, the volume and complexity of Active Directory queries has a constant trend of increasing, and there is no end in sight.

So what are my Domain Controllers doing all day?

We get this question a lot from our customers. It often seems as if the AD admins are the last to know what kind of load is put onto the domain controllers by scripts, applications and synchronization engines, and they are often not made aware of even significant application changes.

But even small changes can have a drastic effect on the DC performance. DCs are resilient, but even the strongest warrior may fall against an overwhelming force.  Think along the lines of “death by a thousand cuts”.  Consider applications or scripts that run non-optimized or excessive queries on many, many clients during or right after logon and it will feel like a distributed DoS. In this scenario, the domain controller may get bogged down due to the enormous workload issued by the clients. This is one of the classic scenarios when it comes to Domain Controller performance problems.

What resources exist today to help you troubleshoot AD Performance scenarios?

We have already discussed the overall topic in this blog, and today many customer requests start with the complaint that the response times are bad and the LSASS CPU time is high. There also is a blog post specifically on the toolset we’ve had since Windows Server 2008. We also updated and brought back the Server Performance Advisor toolset. This toolset is now more targeted at trend analysis and base-lining.  If a video is more your style, Justin Turner revealed our troubleshooting process at Ignite.

The reports generated by this data collection are hugely useful for understanding what is burdening the domain controllers. There are also (less frequent) cases where DCs are responding slowly but no significant utilization is seen. We released a blog on that scenario and also gave you a simple method to troubleshoot long-running LDAP queries at our sister site. So what’s new with this post?

The AD Diagnostic Data Collector set report “report.html” is missing or compile time is very slow

In recent months, we have seen an increasing number of customers with incomplete Data Collector Set reports. Most of the time, the “report.html” file is missing:

This is a folder where the creation of the report.html file was successful:

image

This folder has exceeded the limits for reporting:

image

Notice the report.html file is missing in the second folder example. Also take note that the ETL and BLG files are bigger. What’s the reason for this?

Here is what we uncovered about the Data Collector Set report generation process:

  • When the data collection ends, the process “tracerpt.exe” is launched to create a report for the folder where the data was collected.
  • “tracerpt.exe” runs with “below normal” priority so it does not get full CPU attention especially if LSASS is busy as well.
  • “tracerpt.exe” runs with one worker thread only, so it cannot take advantage of more than one CPU core.
  • “tracerpt.exe” accumulates RAM usage as it runs.
  • “tracerpt.exe” has six hours to complete a report. If it is not done within this time, the report is terminated.
  • The default settings of the built-in AD data collector set delete the biggest data files first once a collection exceeds the 1 gigabyte limit. The biggest single file in the reports is typically “Active Directory.etl”. The report.html file will not get created if this file does not exist.

I worked with a customer recently with a pretty well-equipped Domain Controller (24 server-class CPUs, 256 GB RAM). The customer was kind enough to run a few tests for various report sizes, and found the following metrics:

  • Until the time-out of six hours is hit, “tracerpt.exe” consumes up to 12 GB of RAM.
  • During this time, one CPU core was allocated 100%. If a DC is in a high-load condition, you may want to increase the base priority of “tracerpt.exe” to get the report to complete. This comes at the expense of CPU time, potentially impacting the server’s primary workload and, in turn, its clients.
  • The biggest data set that could be completed within the six hours had an “Active Directory.etl” of 3 GB.

If you have lower-spec and busier machines, you shouldn’t expect the same results as this example (On a lower spec machine with a 3 GB ETL file, the report.html file would likely fail to compile within the 6-hour window).

What a bummer, how do you get Performance Logging done then?

Fortunately, there are a number of parameters for a Data Collector Set that come to the rescue. Before you can use any of them, you first need to create a custom Data Collector Set. You can then play with a variety of settings, based on the purpose of the collection.

In Performance Monitor you can create a custom set on the “User Defined” folder by right-clicking it, to bring up the New -> Data Collector Set option in the context menu:

image

This launches a wizard that prompts you for a number of parameters for the new set.

The first thing it wants is a name for the new set:

image

The next step is to select a template. It may be one of the built-in templates or one exported from another computer as an XML file you select through the “Browse” button. In our case, we want to create a clone of “Active Directory Diagnostics”:

image

The next step is optional; it specifies the storage location for the reports. You may want to select a volume with more space or lower IO load than the default volume:

image

There is one more page in the wizard, but there is no reason to make any more changes here. You can click “Finish” on this page.

The default settings are fine for an idle DC, but if you find your ETL files are too large, your reports are not generated, or it takes too long to process the data, you will likely want to make the following configuration changes.

For a real “Big Data Collector Set,” we first want to make some important changes to the storage strategy of the set; these settings are available in the “Data Manager” dialog:

image

The most relevant settings are “Resource Policy” and “Maximum Root Path Size”. I recommend starting with the settings as shown below:

image

Notice, I’ve changed the Resource policy from “Delete largest” to “Delete oldest”. I’ve also increased the Maximum root path size from 1024 to 2048 MB.  You can run some reports to learn what the best size settings are for you. You might very well end up using 10 GB or more for your reports.

The second crucial parameter for your custom sets is the run interval for the data collection. It is five minutes by default. You can adjust that in the properties of the collector in the “Stop Condition” tab. In many cases shortening the data collection is a viable step if you see continuous high load:

image

You should avoid going shorter than two minutes, as this is the maximum LDAP query duration by default. (If you have LDAP queries that reach this threshold, they would not show up in a report that is less than two minutes in length.) In fact, I would suggest the minimum interval be set to three minutes.

One very attractive option is automatically restarting the data collection once it exceeds a certain size. You then need to use common sense when you look at the multiple resulting reports, e.g. the ratio of long-running queries shown in each log. But it is definitely better than no report.

If you expect to exceed the 1 GB limit often, you certainly should adjust the total size of collections (Maximum root path size) in the “Data Manager”.

So how do I know how big the collection is while running it?

You can take a look at the data collection folder in Explorer, but you will notice Explorer is pretty lazy about updating the current size of the collection:

image

Explorer only updates the folder if you are doing something with the files. It sounds strange, but attempting to delete a file will trigger an update:

image

Now that makes more sense…

If you see the log is growing beyond your expectations, you can manually stop it before the stop condition hits the threshold you have configured:

image

Of course, you can also start and stop the reporting from a command line using the logman instructions in this post.
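
As a rough sketch, assuming you named your custom set “AD Diagnostics - Custom” in the wizard, the start and stop commands look like this from an elevated PowerShell prompt:

logman start "AD Diagnostics - Custom"
# ...let it run for your chosen interval, then:
logman stop "AD Diagnostics - Custom"
# 'logman query' lists all collector sets and shows whether they are currently running
logman query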

Room for improvement

We are aware there is room for improvement to get bigger data sets reported in a shorter time. The good news is that many of these special configuration changes won’t be needed once your DCs are running Windows Server 2016. We will talk about that in a future post.

Thanks for reading.

Herbert

Setting up Virtual Smart card logon using Virtual TPM for Windows 10 Hyper-V VM Guests

Hello Everyone, my name is Raghav and I’m a Technical Advisor for one of the Microsoft Active Directory support teams. This is my first blog post, and today I’ll share with you how to configure a Hyper-V environment in order to enable virtual smart card logon to VM guests by leveraging a new Windows 10 feature: virtual Trusted Platform Module (TPM).

Here’s a quick overview of the terminology discussed in this post:
  • Smart cards are physical authentication devices, which improve on the concept of a password by requiring that users actually have their smart card device with them to access the system, in addition to knowing the PIN, which provides access to the smart card.
  • Virtual smart cards (VSCs) emulate the functionality of traditional smart cards, but instead of requiring the purchase of additional hardware, they utilize technology that users already own and are more likely to have with them at all times. Theoretically, any device that can provide the three key properties of smart cards (non-exportability, isolated cryptography, and anti-hammering) can be commissioned as a VSC, though the Microsoft virtual smart card platform is currently limited to the use of the Trusted Platform Module (TPM) chip onboard most modern computers. This blog will mostly concern TPM virtual smart cards.
    For more information, read Understanding and Evaluating Virtual Smart Cards.
  • Trusted Platform Module – (As Christopher Delay explains in his blog) TPM is a cryptographic device that is attached at the chip level to a PC, Laptop, Tablet, or Mobile Phone. The TPM securely stores measurements of various states of the computer, OS, and applications. These measurements are used to ensure the integrity of the system and software running on that system. The TPM can also be used to generate and store cryptographic keys. Additionally, cryptographic operations using these keys take place on the TPM preventing the private keys of certificates from being accessed outside the TPM.
  • Virtualization-based security – The following Information is taken directly from https://technet.microsoft.com/en-us/itpro/windows/keep-secure/windows-10-security-guide
    • One of the most powerful changes to Windows 10 is virtual-based security. Virtual-based security (VBS) takes advantage of advances in PC virtualization to change the game when it comes to protecting system components from compromise. VBS is able to isolate some of the most sensitive security components of Windows 10. These security components aren’t just isolated through application programming interface (API) restrictions or a middle-layer: They actually run in a different virtual environment and are isolated from the Windows 10 operating system itself.
    • VBS and the isolation it provides is accomplished through the novel use of the Hyper-V hypervisor. In this case, instead of running other operating systems on top of the hypervisor as virtual guests, the hypervisor supports running the VBS environment in parallel with Windows and enforces a tightly limited set of interactions and access between the environments. Think of the VBS environment as a miniature operating system: It has its own kernel and processes. Unlike Windows, however, the VBS environment runs a micro-kernel and only two processes, called trustlets:
  • Local Security Authority (LSA) enforces Windows authentication and authorization policies. LSA is a well-known security component that has been part of Windows since 1993. Sensitive portions of LSA are isolated within the VBS environment and are protected by a new feature called Credential Guard.
  • Hypervisor-enforced code integrity verifies the integrity of kernel-mode code prior to execution. This is a part of the Device Guard feature.
VBS provides two major improvements in Windows 10 security: a new trust boundary between key Windows system components and a secure execution environment within which they run. A trust boundary between key Windows system components is enabled though the VBS environment’s use of platform virtualization to isolate the VBS environment from the Windows operating system. Running the VBS environment and Windows operating system as guests on top of Hyper-V and the processor’s virtualization extensions inherently prevents the guests from interacting with each other outside the limited and highly structured communication channels between the trustlets within the VBS environment and Windows operating system.
VBS acts as a secure execution environment because the architecture inherently prevents processes that run within the Windows environment – even those that have full system privileges – from accessing the kernel, trustlets, or any allocated memory within the VBS environment. In addition, the VBS environment uses TPM 2.0 to protect any data that is persisted to disk. Similarly, a user who has access to the physical disk is unable to access the data in an unencrypted form.
clip_image002[4]
VBS requires a system that includes:
  • Windows 10 Enterprise Edition
  • A 64-bit processor
  • UEFI with Secure Boot
  • Second-Level Address Translation (SLAT) technologies (for example, Intel Extended Page Tables [EPT], AMD Rapid Virtualization Indexing [RVI])
  • Virtualization extensions (for example, Intel VT-x, AMD RVI)
  • I/O memory management unit (IOMMU) chipset virtualization (Intel VT-d or AMD-Vi)
  • TPM 2.0
Note: TPM 1.2 and 2.0 provide protection for encryption keys that are stored in the firmware. TPM 1.2 is not supported on Windows 10 RTM (Build 10240); however, it is supported in Windows 10, Version 1511 (Build 10586) and later.
Among other functions, Windows 10 uses the TPM to protect the encryption keys for BitLocker volumes, virtual smart cards, certificates, and the many other keys that the TPM is used to generate. Windows 10 also uses the TPM to securely record and protect integrity-related measurements of select hardware.



Now that we have the terminology clarified, let’s talk about how to set this up.


Setting up Virtual TPM
First we will ensure we meet the basic requirements on the Hyper-V host.
On the Hyper-V host, launch msinfo32 and confirm the following values:

The BIOS Mode should state “UEFI”.

clip_image001
Secure Boot State should be On.
clip_image002

Next, we will enable VBS on the Hyper-V host.
  1. Open up the Local Group Policy Editor by running gpedit.msc.
  2. Navigate to the following settings: Computer Configuration, Administrative Templates, System, Device Guard. Double-click Turn On Virtualization Based Security. Set the policy to Enabled, click OK,
clip_image004

Now we will enable Isolated User Mode on the Hyper-V host.
1. To do that, open Run, type appwiz.cpl, and press Enter. Then, in the left pane, click Turn Windows features on or off.
2. Check Isolated User Mode, click OK, and then reboot when prompted. (A PowerShell alternative is shown after the screenshot below.)
clip_image006
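
If you prefer to script this step, the Windows optional feature can also be enabled with PowerShell. The feature name IsolatedUserMode used below is an assumption, so confirm it with the query first.

# Confirm the exact feature name on this build first
Get-WindowsOptionalFeature -Online | Where-Object FeatureName -like '*IsolatedUserMode*'
# Then enable it and reboot when prompted, just as with the Control Panel method
Enable-WindowsOptionalFeature -Online -FeatureName 'IsolatedUserMode'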

This completes the initial steps needed for the Hyper-V host.


Now we will enable support for virtual TPM on your Hyper-V VM guest
Note: Support for Virtual TPM is only included in Generation 2 VMs running Windows 10.
To enable this on your Windows 10 generation 2 VM, open the VM settings and review the configuration under the Hardware, Security section. Enable Secure Boot and Enable Trusted Platform Module should both be selected. (A PowerShell equivalent is shown after the screenshot below.)
clip_image008
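
The same settings can also be applied from the host with the Hyper-V PowerShell module, which is handy when you have more than one VM to configure. This is only a sketch; the VM name is a placeholder, and the VM should be powered off while you change it. The local key protector is what allows the virtual TPM to be enabled on a standalone host that is not using a Host Guardian Service.

# Enable Secure Boot and the virtual TPM on a Generation 2 VM named 'Win10-Gen2'
Set-VMFirmware -VMName 'Win10-Gen2' -EnableSecureBoot On
Set-VMKeyProtector -VMName 'Win10-Gen2' -NewLocalKeyProtector   # required before the vTPM can be enabled
Enable-VMTPM -VMName 'Win10-Gen2'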

That completes the Virtual TPM part of the configuration. We will now move on to the virtual smart card configuration.

Setting up Virtual Smart Card
In the next section, we create a certificate template so that we can request a certificate that has the required parameters needed for Virtual Smart Card logon.
These steps are adapted from the following TechNet article: https://technet.microsoft.com/en-us/library/dn579260.aspx

Prerequisites and Configuration for Certificate Authority (CA) and domain controllers
  • Active Directory Domain Services
  • Domain controllers must be configured with a domain controller certificate to authenticate smartcard users. The following article covers Guidelines for enabling smart card logon: http://support.microsoft.com/kb/281245
  • An Enterprise Certification Authority running on Windows Server 2012 or Windows Server 2012 R2. Again, Chris’s blog neatly covers how to set up a PKI environment.
  • Active Directory must have the issuing CA in the NTAuth store to authenticate users to Active Directory.
Create the certificate template
1. On the CA console (certsrv.msc), right-click Certificate Templates and select Manage
clip_image010

2. Right-click the Smartcard Logon template and then click Duplicate Template
clip_image012

3. On the Compatibility tab, set the compatibility settings as below
clip_image014

4. On the Request Handling tab, in the Purpose section, select Signature and smartcard logon from the drop down menu
clip_image016

5. On the Cryptography tab, select the Requests must use one of the following providers radio button and then select the Microsoft Base Smart Card Crypto Provider option.
clip_image018

Optionally, you can use a Key Storage Provider (KSP) instead. To do so, under Provider Category select Key Storage Provider, then select the Requests must use one of the following providers radio button and select the Microsoft Smart Card Key Storage Provider option.
clip_image020

6. On the General tab: Specify a name, such as TPM Virtual Smart Card Logon. Set the validity period to the desired value and choose OK


7. Back in the CA console, right-click Certificate Templates and select New, then Certificate Template to Issue. Select the new template you created in the prior steps.


clip_image022
Note that it usually takes some time for this certificate to become available for issuance.


Create the TPM virtual smart card

Next we’ll create a virtual Smart Card on the Virtual Machine by using the Tpmvscmgr.exe command-line tool.

1. On the Windows 10 Gen 2 Hyper-V VM guest, open an Administrative Command Prompt and run the following command:
tpmvscmgr.exe create /name myVSC /pin default /adminkey random /generate
clip_image024
You will be prompted for a PIN. Enter at least eight characters and confirm the entry. (You will need this PIN in later steps.)


Enroll for the certificate on the virtual smart card on the virtual machine.
1. In certmgr.msc, right-click Certificates, click All Tasks, then Request New Certificate.
clip_image025

2. In the certificate enrollment wizard, select the new template you created earlier.
clip_image027

3. It will prompt for the PIN associated with the Virtual Smart Card. Enter the PIN and click OK.
clip_image029

4. If the request completes successfully, the Certificate Installation Results page is displayed.
clip_image031

5. On the virtual machine, at the logon screen choose Sign-in options, select the security device, and enter the PIN.
clip_image033

That completes the steps on how to deploy Virtual Smart Cards using a virtual TPM on virtual machines.  Thanks for reading!

Raghav Mahajan


The Version Store Called, and They’re All Out of Buckets


Hello, Ryan Ries back at it again with another exciting installment of esoteric Active Directory and ESE database details!

I think we need to have another little chat about something called the version store.

The version store is an inherent mechanism of the Extensible Storage Engine and a commonly seen concept among databases in general. (ESE is sometimes referred to as Jet Blue. Sometimes old codenames are so catchy that they just won’t die.) Therefore, the following information should be relevant to any application or service that uses an ESE database (such as Exchange,) but today I’m specifically focused on its usage as it pertains to Active Directory.

The version store is one of those details that the majority of customers will never need to think about. The stock configuration of the version store for Active Directory will be sufficient to handle any situation encountered by 99% of AD administrators. But for that 1% out there with exceptionally large and/or busy Active Directory deployments, (or for those who make “interesting” administrative choices,) the monitoring and tuning of the version store can become a very important topic. And quite suddenly too, as replication throughout your environment grinds to a halt because of version store exhaustion and you scramble to figure out why.

The purpose of this blog post is to provide up-to-date (as of the year 2016) information and guidance on the version store, and to do it in a format that may be more palatable to many readers than sifting through reams of old MSDN and TechNet documentation that may or may not be accurate or up to date. I can also offer more practical examples than you would probably get from straight technical documentation. There has been quite an uptick lately in the number of cases we’re seeing here in Support that center around version store exhaustion. While the job security for us is nice, knowing this stuff ahead of time can save you from having to call us and spend lots of costly support hours.

Version Store: What is it?

As mentioned earlier, the version store is an integral part of the ESE database engine. It’s an area of temporary storage in memory that holds copies of objects that are in the process of being modified, for the sake of providing atomic transactions. This allows the database to roll back transactions in case it can’t commit them, and it allows other threads to read from a copy of the data while it’s in the process of being modified. All applications and services that utilize an ESE database use version store to some extent. The article “How the Data Store Works” describes it well:

“ESE provides transactional views of the database. The cost of providing these views is that any object that is modified in a transaction has to be temporarily copied so that two views of the object can be provided: one to the thread inside that transaction and one to threads in other transactions. This copy must remain as long as any two transactions in the process have different views of the object. The repository that holds these temporary copies is called the version store. Because the version store requires contiguous virtual address space, it has a size limit. If a transaction is open for a long time while changes are being made (either in that transaction or in others), eventually the version store can be exhausted. At this point, no further database updates are possible.”

When Active Directory was first introduced, it was deployed on machines with a single x86 processor with less than 4 GB of RAM supporting NTDS.DIT files that ranged between 2MB and a few hundred MB. Most of the documentation you’ll find on the internet regarding the version store still has its roots in that era and was written with the aforementioned hardware in mind. Today, things like hardware refreshes, OS version upgrades, cloud adoption and an improved understanding of AD architecture are driving massive consolidation in the number of forests, domains and domain controllers in them, DIT sizes are getting bigger… all while still relying on default configuration values from the Windows 2000 era.

The number-one killer of version store is long-running transactions. Transactions that tend to be long-running include, but are not limited to:

– Deleting a group with 100,000 members
– Deleting any object, not just a group, with 100,000 or more forward/back links to clean
– Modifying ACLs in Active Directory on a parent container that propagate down to many thousands of inheriting child objects
– Creating new database indices
– Having underpowered or overtaxed domain controllers, causing transactions to take longer in general
– Anything that requires boat-loads of database modification
– Large SDProp and garbage collection tasks
– Any combination thereof

I will show some examples of the errors that you would see in your event logs when you experience version store exhaustion in the next section.

Monitoring Version Store Usage

To monitor version store usage, leverage the Performance Monitor (perfmon) counter:

‘\\dc01\Database ==> Instances(lsass/NTDSA)\Version buckets allocated’

image
(Figure 1: The ‘Version buckets allocated’ perfmon counter.)
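
If you prefer to sample the counter from PowerShell instead of the perfmon UI, Get-Counter can collect it on a schedule. A minimal sketch (replace DC01 with your domain controller):

# Sample 'Version buckets allocated' every 15 seconds for one hour on DC01
Get-Counter -ComputerName 'DC01' `
    -Counter '\Database ==> Instances(lsass/NTDSA)\Version buckets allocated' `
    -SampleInterval 15 -MaxSamples 240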

The version store divides the amount of memory that it has been given into “buckets,” or “pages.” Version store pages need not (and in AD, they do not) equal the size of database pages elsewhere in the database. We’ll get into the exact size of these buckets in a minute.

During typical operation, when the database is not busy, this counter will be low. It may even be zero if the database really just isn’t doing anything. But when you perform one of those actions that I mentioned above that qualify as “long-running transactions,” you will trigger a spike in the version store usage. Here is an example of me deleting a group that contains 200,000 members, on a DC running 2012 R2 with 1 64bit CPU:

image(Figure 2: Deleting a group containing 200k members on a 2012 R2 DC with 1 64bit CPU.)

The version store spikes to 5332 buckets allocated here, seconds after I deleted the group, but as long as the DC recovers and falls back down to nominal levels, you’ll be alright. If it stays high or even maxed out for extended periods of time, then no more database transactions for you. This includes no more replication. This is just an example using the common member/memberOf relationship, but any linked-value attribute relationship can cause this behavior. (I’ve talked a little about linked value attributes before here.) There are plenty of other types of objects that may invoke this same kind of behavior, such as deleting an RODC computer object, and then its msDs-RevealedUsers links must be processed, etc..

I’m not saying that deleting a group with fewer than 200K members couldn’t also trigger version store exhaustion if there are other transactions taking place on your domain controller simultaneously or other extenuating circumstances. I’ve seen transactions involving as few as 70K linked values cause major problems.

After you delete an object in AD, and the domain controller turns it into a tombstone, each domain controller has to process the linked-value attributes of that object to maintain the referential integrity of the database. It does this in “batches,” usually 1000 or 10,000 depending on Windows version and configuration. This was only very recently documented here. Since each “batch” of 1000 or 10,000 is considered a single transaction, a smaller batch size will tend to complete faster and thus require less version store usage. (But the overall job will take longer.)

An interesting curveball here is that having the AD Recycle Bin enabled will defer this action by an msDs-DeletedObjectLifetime number of days after an object is deleted, since that’s the appeal behind the AD Recycle Bin – it allows you to easily restore deleted objects with all their links intact. (More detail on the AD Recycle Bin here.)

When you run out of version storage, no other database transactions can be committed until the transaction or transactions that are causing the version store exhaustion are completed or rolled back. At this point, most people start rebooting their domain controllers, and this may or may not resolve the immediate issue for them depending on exactly what’s going on. Another thing that may alleviate this issue is offline defragmentation of the database. (Or reducing the links batch size, or increasing the version store size – more on that later.) Again, we’re usually looking at 100+ gigabyte DITs when we see this kind of issue, so we’re essentially talking about pushing the limits of AD. And we’re also talking about hours of downtime for a domain controller while we do that offline defrag and semantic database analysis.

Here, Active Directory is completely tapping out the version store. Notice the plateau once it has reached its max:

image(Figure 3: Version store being maxed out at 13078 buckets on a 2012 R2 DC with 1 64bit CPU.)

So it has maxed out at 13,078 buckets.

When you hit this wall, you will see events such as these in your event logs:

Log Name: Directory Service
Source: Microsoft-Windows-ActiveDirectory_DomainService
Date: 5/16/2016 5:54:52 PM
Event ID: 1519
Task Category: Internal Processing
Level: Error
Keywords: Classic
User: S-1-5-21-4276753195-2149800008-4148487879-500
Computer: DC01.contoso.com
Description:
Internal Error: Active Directory Domain Services could not perform an operation because the database has run out of version storage.

And also:

Log Name: Directory Service
Source: NTDS ISAM
Date: 5/16/2016 5:54:52 PM
Event ID: 623
Task Category: (14)
Level: Error
Keywords: Classic
User: N/A
Computer: DC01.contoso.com
Description:
NTDS (480) NTDSA: The version store for this instance (0) has reached its maximum size of 408Mb. It is likely that a long-running transaction is preventing cleanup of the version store and causing it to build up in size. Updates will be rejected until the long-running transaction has been completely committed or rolled back.

The peculiar “408Mb” figure that comes along with that last event leads us into the next section…

How big is the Version Store by default?

The “How the Data Store Works” article that I linked to earlier says:

“The version store has a size limit that is the lesser of the following: one-fourth of total random access memory (RAM) or 100 MB. Because most domain controllers have more than 400 MB of RAM, the most common version store size is the maximum size of 100 MB.”

Incorrect.

And then you have other articles that have even gone to print, such as this one, that say:

“Typically, the version store is 25 percent of the physical RAM.”

Extremely incorrect.

What about my earlier question about the bucket size? Well if you consulted this KB article you would read:

The value for the setting is the number of 16KB memory chunks that will be reserved.”

Nope, that’s wrong.

Or if I go to the MSDN documentation for ESE:

“JET_paramMaxVerPages
This parameter reserves the requested number of version store pages for use by an instance.

Each version store page as configured by this parameter is 16KB in size.”

Not true.

The pages are not 16KB anymore on 64bit DCs. And the only time that the “100MB” figure was ever even close to accurate was when domain controllers were 32bit and had 1 CPU. But today, domain controllers are 64bit and have lots of CPUs. Both the version store bucket size and the number of version store buckets allocated by default double based on whether your domain controller is 32bit or 64bit. The figure also scales a little bit based on how many CPUs are in your domain controller.

So without further ado, here is how to calculate the actual number of buckets that Active Directory will allocate by default:

(2 * (3 * (15 + 4 + 4 * #CPUs)) + 6400) * PointerSize / 4

Pointer size is 4 if you’re using a 32bit processor, and 8 if you’re using a 64bit processor.

And secondly, version store pages are 16KB if you’re on a 32bit processor, and 32KB if you’re on a 64bit processor. So using a 64bit processor effectively quadruples the default size of your AD version store. To convert number of buckets allocated into bytes for a 32bit processor:

(((2 * (3 * (15 + 4 + 4 * 1)) + 6400) * 4 / 4) * 16KB) / 1MB

And for a 64bit processor:

(((2 * (3 * (15 + 4 + 4 * 1)) + 6400) * 8 / 4) * 32KB) / 1MB

So using the above formulae, the version store size for a single-core, 64bit DC would be ~408MB, which matches that event ID 623 we got from ESE earlier. It also conveniently matches 13078 * 32KB buckets, which is where we plateaued with our perfmon counter earlier.

If you had a 4-core, 64bit domain controller, the formula would come out to ~412MB, and you will see this line up with the event log event ID 623 on that machine. When a 4-core, Windows 2008 R2 domain controller with default configuration runs out of version store:

Log Name:      Directory Service
Source:        NTDS ISAM
Date:          5/15/2016 1:18:25 PM
Event ID:      623
Task Category: (14)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      dc02.fabrikam.com
Description:
NTDS (476) NTDSA: The version store for this instance (0) has reached its maximum size of 412Mb. It is likely that a long-running transaction is preventing cleanup of the version store and causing it to build up in size. Updates will be rejected until the long-running transaction has been completely committed or rolled back.

The version store size for a single-core, 32bit DC is ~102MB. This must be where the original “100MB” adage came from. But as you can see now, that information is woefully outdated.

The 6400 number in the equation comes from the fact that 6400 is the absolute, hard-coded minimum number of version store pages/buckets that AD will give you. Turns out that’s about 100MB, if you assumed 16KB pages, or 200MB if you assume 32KB pages. The interesting side-effect from this is that the documented “EDB max ver pages (increment over the minimum)” registry entry, which is the supported way of increasing your version store size, doesn’t actually have any effect unless you set it to some value greater than 6400 decimal. If you set that registry key to something less than 6400, then it will just get overridden to 6400 when AD starts. But if you set that registry entry to, say, 9600 decimal, then your version store size calculation will be:

(((2 *(3 * (15 + 4 + 4 * 1)) + 9600) * 8 / 4) * 32KB) / 1MB = 608.6MB

For a 64bit, 1-core domain controller.
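
If you want to play with these numbers without a calculator, here is a small PowerShell function that simply implements the arithmetic above; it does not query the live configuration of a DC, and the parameter names are mine.

function Get-ADVersionStoreSizeMB {
    param(
        [int]$CpuCount = 1,
        [ValidateSet(32,64)][int]$Architecture = 64,
        [int]$ExtraBuckets = 6400   # "EDB max ver pages (increment over the minimum)"; values below 6400 are overridden to 6400
    )
    $pointerSize = if ($Architecture -eq 64) { 8 } else { 4 }
    $bucketKB    = if ($Architecture -eq 64) { 32 } else { 16 }
    $buckets     = (2 * (3 * (15 + 4 + 4 * $CpuCount)) + [Math]::Max($ExtraBuckets, 6400)) * $pointerSize / 4
    [PSCustomObject]@{
        Buckets = [int]$buckets
        SizeMB  = [Math]::Round($buckets * $bucketKB / 1024, 1)
    }
}
Get-ADVersionStoreSizeMB -CpuCount 1 -Architecture 64                     # ~408 MB with the defaults
Get-ADVersionStoreSizeMB -CpuCount 1 -Architecture 64 -ExtraBuckets 9600  # ~608 MB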

So let’s set those values on a DC, then run up the version store, and let’s get empirical up in here:

image(Figure 4: Version store exhaustion at 19478 buckets on a 2012 R2 DC with 1 64bit CPU.)

(19478 * 32KB) / 1MB = 608.7MB

And wouldn’t you know it, the event log now reads:

image(Figure 5: The event log from the previous version store exhaustion, showing the effect of setting the “EDB max ver pages (increment over the minimum)” registry value to 9600.)

Here’s a table that shows version store sizes based on the “EDB max ver pages (increment over the minimum)” value and common CPU counts:

| Buckets | 1 CPU | 2 CPUs | 4 CPUs | 8 CPUs | 16 CPUs |
|---|---|---|---|---|---|
| 6400 (the default) | x64: 410 MB / x86: 103 MB | x64: 412 MB / x86: 103 MB | x64: 415 MB / x86: 104 MB | x64: 421 MB / x86: 105 MB | x64: 433 MB / x86: 108 MB |
| 9600 | x64: 608 MB / x86: 152 MB | x64: 610 MB / x86: 153 MB | x64: 613 MB / x86: 153 MB | x64: 619 MB / x86: 155 MB | x64: 631 MB / x86: 158 MB |
| 12800 | x64: 808 MB / x86: 202 MB | x64: 810 MB / x86: 203 MB | x64: 813 MB / x86: 203 MB | x64: 819 MB / x86: 205 MB | x64: 831 MB / x86: 208 MB |
| 16000 | x64: 1008 MB / x86: 252 MB | x64: 1010 MB / x86: 253 MB | x64: 1013 MB / x86: 253 MB | x64: 1019 MB / x86: 255 MB | x64: 1031 MB / x86: 258 MB |
| 19200 | x64: 1208 MB / x86: 302 MB | x64: 1210 MB / x86: 303 MB | x64: 1213 MB / x86: 303 MB | x64: 1219 MB / x86: 305 MB | x64: 1231 MB / x86: 308 MB |

Sorry for the slight rounding errors – I just didn’t want to deal with decimals. As you can see, the number of CPUs in your domain controller only has a slight effect on the version store size. The processor architecture, however, makes all the difference. Good thing absolutely no one uses x86 DCs anymore, right?

Now I want to add a final word of caution.

I want to make it clear that we recommend changing “EDB max ver pages (increment over the minimum)” only when necessary; that is, when the event ID 623s start appearing. (If it ain’t broke, don’t fix it.) I also want to reiterate the warnings that appear in the support KB: you must not set this value arbitrarily high, you should increase this setting in small (50 MB or 100 MB) increments, and if setting the value to 19200 buckets still does not resolve your issue, you should contact Microsoft Support. If you are going to change this value, it is advisable to change it consistently across all domain controllers, but you must also carefully consider the processor architecture and available memory on each DC before you change this setting. The version store requires a contiguous allocation of memory – precious real estate – and raising the value too high can prevent lsass from being able to perform other work. Once the problem has subsided, you should return this setting to its default value.

In my next post on this topic, I plan on going into more detail on how one might actually troubleshoot the issue and track down the reason behind why the version store exhaustion is happening.

Conclusions

There is a lot of old documentation out there that has misled many an AD administrator on this topic. It was essentially accurate at the time it was written, but AD has evolved since then. I hope that with this post I was able to shed more light on the topic than you probably ever thought was necessary. It’s an undeniable truth that more and more of our customers continue to push the limits of AD beyond that which was originally conceived. I also want to remind the reader that the majority of the information in this article is AD-specific. If you’re thinking about Exchange or Certificate Services or Windows Update or DFSR or anything else that uses an ESE database, then you need to go figure out your own application-specific details, because we don’t use the same page sizes or algorithms as those guys.

I hope this will be valuable to those who find themselves asking questions about the ESE version store in Active Directory.

With love,

Ryan “Buckets of Fun” Ries


Deploying Group Policy Security Update MS16-072 \ KB3163622


My name is Ajay Sarkaria & I work with the Windows Supportability team at Microsoft. There have been many questions on deploying the newly released security update MS16-072.

This post was written to provide guidance and answer questions needed by administrators to deploy the newly released security update, MS16-072 that addresses a vulnerability. The vulnerability could allow elevation of privilege if an attacker launches a man-in-the-middle (MiTM) attack against the traffic passing between a domain controller and the target machine on domain-joined Windows computers.

The table below summarizes the KB article number for the relevant Operating System:

Article # | Title | Context / Synopsis
MSKB 3163622 | MS16-072: Security Updates for Group Policy: June 14, 2016 | Main article for MS16-072
MSKB 3159398 | MS16-072: Description of the security update for Group Policy: June 14, 2016 | MS16-072 for Windows Vista / Windows Server 2008, Windows 7 / Windows Server 2008 R2, Windows Server 2012, Windows 8.1 / Windows Server 2012 R2
MSKB 3163017 | Cumulative update for Windows 10: June 14, 2016 | MS16-072 for Windows 10 RTM
MSKB 3163018 | Cumulative update for Windows 10 Version 1511 and Windows Server 2016 Technical Preview 4: June 14, 2016 | MS16-072 for Windows 10 1511 + Windows Server 2016 TP4
MSKB 3163016 | Cumulative Update for Windows Server 2016 Technical Preview 5: June 14, 2016 | MS16-072 for Windows Server 2016 TP5
TN: MS16-072 | Microsoft Security Bulletin MS16-072 – Important | Overview of changes in MS16-072
What does this security update change?

The most important aspect of this security update is to understand the behavior changes affecting the way User Group Policy is applied on a Windows computer. MS16-072 changes the security context with which user group policies are retrieved. Traditionally, when a user group policy is retrieved, it is processed using the user’s security context.

After MS16-072 is installed, user group policies are retrieved by using the computer’s security context. This by-design behavior change protects domain joined computers from a security vulnerability.

When a user group policy is retrieved using the computer’s security context, the computer account will now need “read” access to retrieve the group policy objects (GPOs) needed to apply to the user.

Traditionally, all group policies were read as long as the user had read access, either directly or through membership in a domain group such as Authenticated Users.

What do we need to check before deploying this security update?

As discussed above, by default “Authenticated Users” have “Read” and “Apply Group Policy” on all Group Policy Objects in an Active Directory Domain.

Below is a screenshot from the Default Domain Policy:

If permissions on the Group Policy Objects in your Active Directory domain have not been modified and are still using the defaults, and as long as Kerberos authentication is working fine in your Active Directory forest (i.e. there are no Kerberos errors visible in the System event log on client computers while they access domain resources), there is nothing else you need to verify before you deploy the security update.

In some deployments, administrators may have removed the “Authenticated Users” group from some or all Group Policy Objects (Security filtering, etc.)

In such cases, you will need to make sure of the following before you deploy the security update:

  1. Check if “Authenticated Users” group read permissions were removed intentionally by the admins. If not, then you should probably add those back. For example, if you do not use any security filtering to target specific group policies to a set of users, you could add “Authenticated Users” back with the default permissions as shown in the example screenshot above.
  2. If the “Authenticated Users” permissions were removed intentionally (security filtering, etc), then as a result of the by-design change in this security update (i.e. to now use the computer’s security context to retrieve user policies), you will need to add the computer account retrieving the group policy object (GPO) to “Read” Group Policy (and not “Apply group policy“).

    Example Screenshot:

In the above example screenshot, let’s say an Administrator wants the GPO named “User-Policy” to apply only to the user “MSFT Ajay” and to no other user; this is how the Group Policy would have been filtered for other users. “Authenticated Users” has been removed intentionally in this example scenario.

Notice that no other user or group is included with “Read” or “Apply Group Policy” permissions other than the default Domain Admins and Enterprise Admins. These groups do not have “Apply Group Policy” by default, so the GPO does not apply to the members of these groups and applies only to the user “MSFT Ajay”.

What will happen if there are Group Policy Objects (GPOs) in an Active Directory domain that are using security filtering as discussed in the example scenario above?

Symptoms when you have security filtering Group Policy Objects (GPOs) like the above example and you install the security update MS16-072:

  • Printers or mapped drives assigned through Group Policy Preferences disappear.
  • Shortcuts to applications on users’ desktops are missing.
  • Group policies that use security filtering are no longer processed.
  • You may see the following change in gpresult: Filtering: Not Applied (Unknown Reason).
  • If you are using Folder Redirection and the Folder Redirection group policy removal option is set to “Redirect the folder back to the user profile location when policy is removed,” the redirected folders are moved back to the client machine after installing this security update.
What is the Resolution?

Simply adding the “Authenticated Users” group with “Read” permissions on the Group Policy Objects (GPOs) should be sufficient. Domain Computers are part of the “Authenticated Users” group, and “Authenticated Users” have these permissions on any new Group Policy Objects (GPOs) by default. Again, the guidance is to add just “Read” permissions and not “Apply Group Policy” for “Authenticated Users”.
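If you have the RSAT GroupPolicy module available, one way to add the Read permission is a quick PowerShell call along these lines. This is only a sketch; “User-Policy” is the example GPO name used above.

    Import-Module GroupPolicy
    # Grant read (but not apply) on a single GPO
    Set-GPPermission -Name 'User-Policy' -TargetName 'Authenticated Users' -TargetType Group -PermissionLevel GpoRead
    # Or grant Domain Computers read instead, if adding Authenticated Users is not an option:
    # Set-GPPermission -Name 'User-Policy' -TargetName 'Domain Computers' -TargetType Group -PermissionLevel GpoRead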

What if adding Authenticated Users with Read permissions is not an option?

If adding “Authenticated Users” with just “Read” permissions is not an option in your environment, then you will need to add the “Domain Computers” group with “Read” permissions. If you want to scope this more narrowly than the Domain Computers group, you can create a new domain group, add the relevant computer accounts to it, and grant that group “Read” access on the Group Policy Object (GPO). However, computers will not pick up membership in the new group until they are rebooted. Also keep in mind that, with this security update installed, this additional step is only required if the default “Authenticated Users” group has been removed from a policy where user settings are applied.

In the above scenario, after you install the security update, the user group policy is retrieved using the computer’s security context. Because a domain-joined computer is a member of the “Domain Computers” security group by default, the client computer can retrieve the user policies that need to be applied to the user, and they are processed successfully.

How to identify GPOs with issues:

In case you have already installed the security update and need to identify Group Policy Objects (GPOs) that are affected, the easiest way is to run gpupdate /force on a Windows client computer and then run gpresult /h new-report.html. Open new-report.html and look for errors such as: “Reason Denied: Inaccessible, Empty or Disabled”.
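If you prefer PowerShell, a rough sketch along the same lines (assuming the RSAT GroupPolicy module is installed; the script linked in the next section is the supported route) is:

    Import-Module GroupPolicy
    # List GPOs where "Authenticated Users" does not appear in the ACL at all
    Get-GPO -All | Where-Object {
        -not (Get-GPPermission -Guid $_.Id -All |
              Where-Object { $_.Trustee.Name -eq 'Authenticated Users' })
    } | Select-Object DisplayName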

What if there are a lot of GPOs?

A script is available that can detect all Group Policy Objects (GPOs) in your domain that are missing “Read” permissions for “Authenticated Users”.
You can get the script from here: https://gallery.technet.microsoft.com/Powershell-script-to-cc281476

Pre-Reqs:

  • The script can run only on Windows 7 and above Operating Systems which have the RSAT or GPMC installed or Domain Controllers running Windows Server 2008 R2 and above
  • The script works in a single domain scenario.
  • The script will detect all GPOs in your domain (Not Forest) which are missing “Authenticated Users” permissions & give the option to add “Authenticated Users” with “Read” Permissions (Not Apply Group Policy). If you have multiple domains in your Active Directory Forest, you will need to run this for each domain.
    • Domain Computers are part of the Authenticated Users group
  • The script can only add permissions to the Group Policy Objects (GPOs) in the same domain as the context of the current user running the script. In a multi domain forest, you must run it in the context of the Domain Admin of the other domain in your forest.

Sample Screenshots when you run the script:

In the first sample screenshot below, running the script detects all Group Policy Objects (GPOs) in your domain that are missing Read permissions for “Authenticated Users”.

image

If you hit “Y”, you will see the below message:

image

What if there are AGPM managed Group Policy Objects (GPOs)?

To change the permissions for all AGPM-managed GPOs and add “Authenticated Users” with Read permissions, follow these steps:

Re-import all Group Policy Objects (GPOs) from production into the AGPM database. This ensures AGPM has the latest copy of the production GPOs.


Add either “Authenticated Users” or “Domain Computers” with the READ permission using the Production Delegation tab: select the security principal, grant the “Read” role, and then click “OK”.


Grant the selected security principal the “Read” role.


Delegation tab depicting Authenticated Users having the READ permissions.


Select and deploy the GPOs again.
Note: To modify permissions on multiple AGPM-managed GPOs, use Shift+click or Ctrl+click to select multiple GPOs at a time, then deploy them in a single operation.
Ctrl+A does not select all policies.


The targeted GPOs now have the new permissions when viewed in AD.


Below are some Frequently asked Questions we have seen:

Frequently Asked Questions (FAQs):

Q1) Do I need to install the fix on only client OS? OR do I also need to install it on the Server OS?

A1) It is recommended that you patch all Windows and Windows Server computers running Windows Vista, Windows Server 2008 and newer operating systems (OS), regardless of SKU or role, in your entire domain environment. These updates only change behavior from a client standpoint (as in “client-server distributed system architecture”), but all computers in a domain are “clients” to SYSVOL and Group Policy – even the domain controllers (DCs) themselves.

Q2) Do I need to enable any registry settings to enable the security update?

A2) No, this security update will be enabled when you install the MS16-072 security update, however you need to check the permissions on your Group Policy Objects (GPOs) as explained above

Q3) What will change in regard to how group policy processing works after the security update is installed?

A3) To retrieve user policy, the connection to the Windows domain controller (DC) prior to the installation of MS16-072 is done under the user’s security context. With this security update installed, instead of user’s security context, Windows group policy clients will now force local system’s security context, therefore forcing Kerberos authentication

Q4) We already have the security update MS15-011 & MS15-014 installed which hardens the UNC paths for SYSVOL & NETLOGON & have the following registry keys being pushed using group policy:

  • RequirePrivacy=1
  • RequireMutualAuthentication=1
  • RequireIntegrity=1

Should the UNC Hardening security update with the above registry settings not take care of this vulnerability when processing group policy from the SYSVOL?

A4) No. UNC Hardening alone will not protect against this vulnerability. In order to protect against this vulnerability, one of the following scenarios must apply: UNC Hardened access is enabled for SYSVOL/NETLOGON as suggested, and the client computer is configured to require Kerberos FAST Armoring

– OR –

UNC Hardened Access is enabled for SYSVOL/NETLOGON, and this particular security update (MS16-072 \ KB3163622) is installed

Q5) If we have security filtering on Computer objects, what change may be needed after we install the security update?

A5) Nothing will change in regard to how Computer Group Policy retrieval and processing works

Q6) We are using security filtering for user objects and after installing the update, group policy processing is not working anymore

A6) As noted above, the security update changes the way user group policy settings are retrieved. The reason for group policy processing failing after the update is installed is because you may have removed the default “Authenticated Users” group from the Group Policy Object (GPO). The computer account will now need “read” permissions on the Group Policy Object (GPO). You can add “Domain Computers” group with “Read” permissions on the Group Policy Object (GPO) to be able to retrieve the list of GPOs to download for the user


Q7) Will installing this security update impact cross forest user group policy processing?

A7) No, this security update will not impact cross forest user group policy processing. When a user from one forest logs onto a computer in another forest and the group policy setting “Allow Cross-Forest User Policy and Roaming User Profiles” is enabled, the user group policy during the cross forest logon will be retrieved using the user’s security context.

Q8) Is there a need to specifically add “Domain Computers” to make user group policy processing work or adding “Authenticated Users” with just read permissions should suffice?

A8) Adding “Authenticated Users” with Read permissions should suffice. If you already have “Authenticated Users” added with at least Read permissions on a GPO, there is no further action required. “Domain Computers” are by default part of the “Authenticated Users” group, so user group policy processing will continue to work. You only need to add “Domain Computers” to the GPO with Read permissions if you do not want “Authenticated Users” to have “Read”.

Thanks,

Ajay Sarkaria

Supportability Program Manager – Windows

Edits:
6/29/16 – added script link and prereqs
7/11/16 – added information about AGPM
8/16/16 – added note about folder redirection

Access-Based Enumeration (ABE) Concepts (part 1 of 2)


Hello everyone, Hubert from the German Networking Team here.  Today I want to revisit a topic that I wrote about in 2009: Access-Based Enumeration (ABE)

This is the first part of a 2-part Series. This first part will explain some conceptual things around ABE.  The second part will focus on diagnostic and troubleshooting of ABE related problems.  The second post is here.

Access-Based Enumeration has existed since Windows Server 2003 SP1 and has not changed in any significant way since my blog post in 2009. However, what has significantly changed is its popularity.

With its integration into V2 (2008 Mode) DFS Namespaces and the increasing demand for data privacy, it became a tool of choice for many architects. However, the same strict limitations and performance impact it had in Windows Server 2003 still apply today. With this post, I hope to shed some more light here as these limitations and the performance impact are either unknown or often ignored. Read on to gain a little insight and background on ABE so that you:

  1. Understand its capabilities and limitations
  2. Gain the background knowledge needed for my next post on how to troubleshoot ABE

Two things to keep in mind:

  • ABE is not a security feature (it’s more of a convenience feature)
  • There is no guarantee that ABE will perform well under all circumstances. If performance issues come up in your deployment, disabling ABE is a valid solution (a quick way to check or flip the setting on a share is sketched after this list).
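On a plain file server share running Windows Server 2012 or later, the SmbShare module gives you that quick check. A sketch; “Data” is a placeholder share name:

    # See whether ABE is currently on for the share
    Get-SmbShare -Name 'Data' | Select-Object Name, FolderEnumerationMode
    # Turn ABE off (Unrestricted) or back on (AccessBased)
    Set-SmbShare -Name 'Data' -FolderEnumerationMode Unrestricted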

So without any further ado let’s jump right in:

What is ABE and what can I do with it?

From the TechNet topic:

“Access-based enumeration displays only the files and folders that a user has permissions to access. If a user does not have Read (or equivalent) permissions for a folder, Windows hides the folder from the user’s view. This feature is active only when viewing files and folders in a shared folder; it is not active when viewing files and folders in the local file system.”

Note that ABE has to check the user’s permissions at the time of enumeration and filter out files and folders they don’t have Read permissions to. Also note that this filtering only applies if the user is attempting to access the share via SMB versus simply browsing the same folder structure in the local file system.

For example, let’s assume you have an ABE-enabled file server share with 500 files and folders, but a certain user only has read permissions to 5 of those folders. The user is only able to view those 5 folders when accessing the share over the network. If the user logs on to this server and browses the local file system, they will see all of the files and folders.

In addition to file server shares, ABE can also be used to filter the links in DFS Namespaces.

With V2 Namespaces DFSN got the capability to store permissions for each DFSN link, and apply those permissions to the local file system of each DFSN Server.

Those NTFS permissions are then used by ABE to filter directory enumerations against the DFSN root share thus removing DFSN links from the results sent to the client.

Therefore, ABE can be used to either hide sensitive information in the link/folder names, or to increase usability by hiding hundreds of links/folders the user does not have access to.
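For a DFS Namespace on Windows Server 2012 or later, the DFSN module exposes the same switch. A sketch; the namespace path is a placeholder:

    # Enable ABE on a namespace root
    Set-DfsnRoot -Path '\\contoso.com\Public' -EnableAccessBasedEnumeration $true
    # Check the current state
    Get-DfsnRoot -Path '\\contoso.com\Public' | Format-List Path, Flags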

How does it work?

The filtering happens on the file server at the time of the request.

Any Object (File / Folder / Shortcut / Reparse Point / etc.) where the user has less than generic read permissions is omitted in the response by the server.

Generic Read means:

  • List Folder / Read Data
  • Read Attributes
  • Read Extended Attributes
  • Read Permissions

If you take any of these permissions away, ABE will hide the object.

So you could create a scenario (for example, by removing only the Read Permissions entry) where the object is hidden from the user, but he or she could still open and read the file or folder if they know its name.

That brings us to the next important conceptual point we need to understand:

ABE does not do access control.

It only filters the response to a Directory Enumeration. The access control is still done through NTFS.

Aside from that ABE only works when the access happens through the Server Service (aka the Fileserver). Any access locally to the file system is not affected by ABE. Restated:

“Access-based enumeration does not prevent users from obtaining a referral to a folder target if they already know the DFS path of the folder with targets. Permissions set using Windows Explorer or the Icacls command on namespace roots or folders without targets control whether users can access the DFS folder or namespace root. However, they do not prevent users from directly accessing a folder with targets. Only the share permissions or the NTFS file system permissions of the shared folder itself can prevent users from accessing folder targets.” Recall what I said earlier, “ABE is not a security feature”. TechNet

ABE does not do any caching.

Every request causes a filter calculation. There is no cache. ABE will repeat the exact same work for identical directory enumerations by the same user.

ABE cannot predict the permissions or the result.

It has to do the calculations for each object in every level of your folder hierarchy every time it is accessed.

If you use inheritance on the folder structure, a user will have the same permissions and thus the same filter result from ABE throughout the entire folder structure. Still, ABE has to calculate this result, consuming CPU cycles in the process.

If you enable ABE on such a folder structure you are just wasting CPU cycles without any gain.

With those basics out of the way, let’s dive into the mechanics behind the scenes:

How the filtering calculation works

  1. When a QUERY_DIRECTORY request (https://msdn.microsoft.com/en-us/library/cc246551.aspx) or its SMB1 equivalent arrives at the server, the server will get a list of objects within that directory from the filesystem.
  2. With ABE enabled, this list is not immediately sent out to the client, but instead passed over to the ABE for processing.
  3. ABE will iterate through EVERY object in this list and compare the user’s permissions with each object’s ACL.
  4. The objects where the user does not have generic read access are removed from the list.
  5. After ABE has completed its processing, the client receives the filtered list.

This yields two effects:

  • This comparison is an active operation and thus consumes CPU Cycles.
  • This comparison takes time, and this time is passed down to the User as the results will only be sent, when the comparisons for the entire directory are completed.
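To make the mechanics above a little more tangible, here is a purely conceptual PowerShell sketch of the same enumerate/check/filter idea. This is not how the file server (srv.sys/srv2.sys) implements ABE internally; it is only an illustration, and the share path is a placeholder:

    # Illustration only: list a folder, keep only the items whose ACL the
    # current user is allowed to read, and hide the rest.
    $items = Get-ChildItem -Path 'D:\Share'
    $visible = foreach ($item in $items) {
        try {
            # Reading the ACL roughly corresponds to the "Read Permissions" right
            $null = Get-Acl -Path $item.FullName -ErrorAction Stop
            $item
        }
        catch { }   # access denied -> omit the object, as ABE would
    }
    $visible | Select-Object -ExpandProperty Name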

This brings us directly to the core point of this Blog:

In order to successfully use ABE in your environment you have to manage both effects.

If you don’t, ABE can cause a widespread outage of your file services.

The first effect can cause a complete saturation of your CPUs (all cores at 100%).

This not only increases the response times of the file server to the point where the server no longer accepts new connections, or clients kill their connections after not getting a response for several minutes; it can also prevent you from establishing a Remote Desktop connection to the server to make any changes (like disabling ABE, for instance).

The second effect can increase the response times of your file server (even if it is otherwise idle) to a magnitude that users will no longer accept.

The comparison for a single directory enumeration by a single user can keep one CPU in your server busy for quite some time, thus making it more likely for new incoming requests to overlap with already running ABE calculations. This eventually results in a Backlog adding further to the delays experienced by your clients.

To illustrate this let’s roll some numbers:

A little disclaimer:

The following calculation is what I’ve seen; your results may differ, as there are many moving pieces in play here. In other words, your mileage may vary. That aside, the numbers seen here are not entirely off – they stem from real production environments. Disk and CPU performance and other workloads play into these numbers as well.

Thus the calculation and numbers are for illustration purposes only. Don’t use it to calculate your server’s performance capabilities.

Let’s assume you have a DFS Namespace with 10,000 links that is hosted on DFS Servers that have 4 CPUs with 3.5 GHz (also assuming RSS is configured correctly and all 4 CPUs are used by the File service: https://blogs.technet.microsoft.com/networking/2015/07/24/receive-side-scaling-for-the-file-servers/ ).

We usually expect single digit millisecond response times measured at the fileserver to achieve good performance (network latency obviously adds to the numbers seen on the client).

In our scenario above (10,000 links, ABE, 3.5 GHz CPU) it is not unusual for a single enumeration of the namespace to take 500 ms.

CPU cores and speed | DFS Namespace links | RSS configured per recommendations | ABE enabled? | Response time
4 @ 3.5 GHz | 10,000 | Yes | No | <10 ms
4 @ 3.5 GHz | 10,000 | Yes | Yes | 300 – 500 ms

That means a single CPU can handle up to 2 Directory Enumerations per Second. Multiplied by 4 CPUs the server can handle 8 User Requests per Second. Any more than those 8 requests and we push the Server into a backlog.
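As a quick sanity check, here is the same back-of-the-envelope math in PowerShell (illustration only, using the example numbers above):

    $msPerEnumeration = 500                               # observed ABE filter time per enumeration
    $cpuCount         = 4
    $perCpuPerSec     = 1000 / $msPerEnumeration          # 2 enumerations per second per CPU
    $serverPerSec     = $perCpuPerSec * $cpuCount         # 8 enumerations per second for the whole box
    "Per CPU: $perCpuPerSec/s, whole server: $serverPerSec/s before a backlog starts building"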

Backlog in this case means new requests are stuck in the Processor Queue behind other requests, therefore multiplying the wait time.

This can reach dimensions where the client (and the user) is waiting for minutes and the client eventually decides to kill the TCP connection, and in case of DFSN, fail over to another server.

Anyone remotely familiar with Fileserver Scalability probably instantly recognizes how bad and frightening those numbers are.  Please keep in mind, that not every request sent to the server is a QUERY_DIRECTORY request, and all other requests such as Write, Read, Open, Close etc. do not cause an ABE calculation (however they suffer from an ABE-induced lack of CPU resources in the same way).

Furthermore, the Windows File Service Client caches the directory enumeration results if SMB2 or SMB3 is used (https://technet.microsoft.com/en-us/library/ff686200(v=ws.10).aspx ).

There is no such Cache for SMB1. Thus SMB1 Clients will send more Directory Enumeration Requests than SMB2 or SMB3 Clients (particularly if you keep the F5 key pressed).

It should now be obvious that you should use SMB2/3 versus SMB1 and ensure you leave the caches enabled if you use ABE on your servers.

As you might have realized by now, there is no easy or reliable way to predict the CPU demand of ABE. If you are developing a completely new environment you usually cannot forecast the proportion of QUERY_DIRECTORY requests in relation to the other requests or the frequency of the same.

Recommendations!

The most important recommendation I can give you is:

Do not enable ABE unless you really need to.

Let’s take the Users Home shares as an example:

Usually no user browses manually through this structure; instead, users get a mapped drive pointing to their own folder, so the usability aspect does not count.  Additionally, most users know (or can find out from the Office address book) the names or aliases of their colleagues, so there is no sensitive information to hide here.  For ease of management, most home shares live in big namespaces or server shares, which makes them very unfit for use with ABE.  In many cases the user has Full Control (or at least Write permissions) inside his own home share.  Why should I waste my CPU cycles to filter the requests inside someone’s home share?

Considering all those points, I would be intrigued to hear a compelling argument for enabling ABE on user home shares or roaming profile shares.  Please sound off in the comments.

If you have a data structure where you really need to enable ABE, your file service concept needs to facilitate these four requirements:

You need Scalability.

You need the ability to increase the number of CPUs doing the ABE calculations in order to react to increasing numbers (directory sizes, number of clients, usage frequency) and thus performance demand.

The easiest way to achieve this is to do ABE Filtering exclusively in DFS Domain Namespaces and not on the Fileservers.

That way you can easily add more CPUs by simply adding further namespace servers in the sites where they are required.

Also keep in mind, that you should have some redundancy and that another server might not be able to take the full additional load of a failing server on top of its own load.

You need small chunks

The number of objects that ABE needs to check for each calculation is the single most important factor for the performance requirement.

Instead of having a single big 10,000-link namespace (the same applies to directories on file servers), build 10 smaller 1,000-link namespaces and combine them into a DFS cascade.

That way ABE only needs to filter 1,000 objects for every request.

Just re-do the example calculation above with 250 ms, 100 ms, 50 ms or even less.

You will notice that you are suddenly able to reach very decent numbers in terms of requests per second.

The other nice side effect is that you will do fewer calculations, as the user will usually follow only one branch in the directory tree and thus does not cause ABE calculations for the other branches.

You need Separation of Workloads.

Having your SQL Server run on the same machine as your ABE Server can cause a lack of Performance for both workloads.

Having ABE run on your Domain Controller exposes the Domain Controller role to the risk of being starved of CPU cycles and thus no longer servicing domain logons.

You need to test and monitor your performance

In many cases you are deploying a new file service concept into an existing environment.

Thus you can get some numbers regarding QUERY_DIRECTORY requests from the existing DFS / file servers.

Build up your Namespace / Shares as you envisioned and use the File Server Capacity Tool (https://msdn.microsoft.com/en-us/library/windows/hardware/dn567658(v=vs.85).aspx ) to simulate the expected load against it.

Monitor the SMB Service Response Times, the Processor utilization and Queue length and the feel on the client while browsing through the structures.

This should give you an idea on how many servers you will need, and if it is required to go for a slimmer design of the data structures.

Keep monitoring those values through the lifecycle of your file server deployment in order to scale up in time.

Any deployment of new software, clients or the normal increase in data structure size could throw off your initial calculations and test results.

In my opinion, this point should be outlined very clearly in any concept documentation.

This concludes the first part of this Blog Series.

I hope you found it worthwhile and got an understanding how to successfully design a File service with ABE.

Now to round off your knowledge, or if you need to troubleshoot a Performance Issue on an ABE-enabled Server, I strongly encourage you to read the second part of this Blog Series. This post will be updated as soon as it’s live.

With best regards,

Hubert

Access-Based Enumeration (ABE) Troubleshooting (part 2 of 2)


Hello everyone! Hubert from the German Networking Team here again with part two of my little Blog Post Series about Access-Based Enumeration (ABE). In the first part I covered some of the basic concepts of ABE. In this second part I will focus on monitoring and troubleshooting Access-based enumeration.
We will begin with a quick overview of Windows Explorer’s directory change notification mechanism (Change Notify), and how that mechanism can lead to performance issues before moving on to monitoring your environment for performance issues.

Change Notify and its impact on DFSN servers with ABE

Let’s say you are viewing the contents of a network share while a file or folder is added to the share remotely by someone else. Your view of this share will be updated automatically with the new contents of the share without you having to manually refresh (press F5) your view.
Change Notify is the mechanism that makes this work in all SMB Protocols (1,2 and 3).
The way it works is quite simple:

  1. The client sends a CHANGE_NOTIFY request to the server indicating the directory or file it is interested in. Windows Explorer (as an application on the client) does this by default for the directory that is currently in focus.
  2. Once there is a change to the file or directory in question, the server will respond with a CHANGE_NOTIFY Response, indicating that a change happened.
  3. This causes the client to send a QUERY_DIRECTORY request (in case it was a directory or DFS Namespace) to the server to find out what has changed.

    QUERY_DIRECTORY is the thing we discussed in the first post that causes ABE filter calculations. Recall that it’s these filter calculation that result in CPU load and client-side delays.
    Let’s look at a common scenario:
  4. During login, your users get a mapped drive pointing at a share in a DFS Namespace.
  5. This mapped drive causes the clients to connect to your DFSN Servers
  6. The client sends a Change Notification (even if the user hasn’t tried to open the mapped drive in Windows Explorer yet) for the DFS Root.

    Nothing more happens until there is a change on the server-side. Administrative work, such as adding and removing links, typically happens during business hours, whenever the administrators find the time, or the script that does it, runs.

    Back to our scenario. Let’s have a server-side change to illustrate what happens next:
  7. We add a Link to the DFS Namespace.
  8. Once the DFSN Server picks up the new link in the namespace from Active directory, it will create the corresponding reparse point in its local file system.
    If you do not use Root Scalability Mode (RSM) this will happen almost at the same time on all of the DFS Servers in that namespace. With RSM the changes will usually be applied by the different DFS servers over the next hour (or whatever your SyncInterval is set to).
  9. These changes trigger CHANGE_NOTIFY responses to be sent out to any client that indicated interest in changes to the DFS Root on that server. This usually applies to hundreds of clients per DFS server.
  10. This causes hundreds of Clients to send QUERY_DIRECTORY requests simultaneously.

What happens next strongly depends on the size of your namespace (larger namespaces lead to longer duration per ABE calculation) and the number of Clients (aka Requests) per CPU of the DFSN Server (remember the calculation from the first part?)

As your Server does not have hundreds of CPUs there will definitely be some backlog. The numbers above decide how big this backlog will be, and how long it takes for the server to work its way back to normal. Keep in mind that while pedaling out of the backlog situation, your server still has to answer other, ongoing requests that are unrelated to our Change Notify Event.
Suffice it to say, this backlog and the CPU demand associated with it can also have negative impact to other jobs.  For example, if you use this DFSN server to make a bunch of changes to your namespace, these changes will appear to take forever, simply because the executing server is starved of CPU Cycles. The same holds true if you run other workloads on the same server or want to RDP into the box.

So! What can you do about it?
As is common with an overloaded server, there are a few different approaches you could take:

  • Distribute the load across more servers (and CPU cores)
  • Make changes outside of business hours
  • Disable Change Notify in Windows Explorer

Approach: Distribute the load / scale up
Method: An expensive way to handle the excessive load is to throw more servers/CPU cores at the DFS infrastructure. In theory, you could increase the number of servers and the number of CPUs to a level where you can handle such peak loads without any issues, but that can be a very expensive approach.

Approach: Make changes outside business hours
Method: Depending on your organization’s structure, your business needs, SLAs and other requirements, you could simply make planned administrative changes to your namespaces outside the main business hours, when fewer clients are connected to your DFSN servers.

Approach: Disable Change Notify in Windows Explorer
Method: You can set the NoRemoteChangeNotify and NoRemoteRecursiveEvents registry values (see https://support.microsoft.com/en-us/kb/831129) to prevent Windows Explorer from sending Change Notification requests. This is, however, a client-side setting that disables Change Notify not just for DFS shares but for any file server the client works with. Users then have to actively press F5 to see changes to a folder or a share in Windows Explorer. This might or might not be a big deal for your users.

Monitoring ABE

As you may have realized by now, ABE is not a fire and forget technology—it needs constant oversight and occasional tuning. We’ve mainly discussed the design and “tuning” aspect so far. Let’s look into the monitoring aspect.

Using Task Manager / Process Explorer

This is a bit tricky, unfortunately, as any load caused by ABE shows up in Task Manager inside the System process (as do many other things on the server). In order to correlate high CPU utilization in the System process to ABE load, you need to use a tool such as Process Explorer and configure it to use public symbols. With this configured properly, you can drill deeper inside the System process and see the different threads and the component names. Note that ABE and the file server both use functions in srv.sys and srv2.sys, so strictly speaking it’s not possible to differentiate between them just by the component names. However, if you are troubleshooting a performance problem on an ABE-enabled server where most of the threads in the System process are sitting in functions from srv.sys and srv2.sys, then it’s very likely due to expensive ABE filter calculations. This is, aside from disabling ABE, the best approach to reliably prove your problem is caused by ABE.

Using Network trace analysis

Looking at CPU utilization shows us the server-side problem. To determine the client-side impact, one approach is to take a network trace and analyze the SMB/SMB2 service response times. You may, however, end up having to capture the trace on a mirrored switch port. To make analysis a bit easier, Message Analyzer has an SMB Service Performance chart you can use.

clip_image002

You get there by using a New Viewer, like below.

smbserviceperf

Wireshark also has a feature that provides you with statistics under Statistics -> Service Response Times -> SMB2. Ignore the values for ChangeNotify (it’s normal for them to be several seconds or even minutes). All other response times translate into delays for the clients. If you see values over a second, you can consider your file service not just slow but outright broken.
While you have that trace in front of you, you can also look for SMB/TCP Connections that are terminated abnormally by the Client as the server failed to respond to the SMB Requests in time. If you have any of those, then you have clients unable to connect to your file service, likely throwing error messages.

Using Performance Monitor

If your server is running Windows Server 2012 or newer, the following performance counters are available:

All of the following counters are under the SMB Server Shares object, using the instance for the share that has ABE enabled:

  • Avg. sec/Data Request
  • Avg. sec/Read
  • Avg. sec/Request
  • Avg. sec/Write
  • Avg. Data Queue Length
  • Avg. Read Queue Length
  • Avg. Write Queue Length
  • Current Pending Requests


Most noticeable here is the Avg. sec/Request counter, as it contains the response time for QUERY_DIRECTORY requests (Wireshark displays them as Find Requests). The other values will suffer from a lack of CPU cycles in varying ways, but all indicate delays for the clients. As mentioned in the first part, we expect single-digit millisecond response times from non-ABE file servers that are performing well. For ABE-enabled servers (more precisely, shares) the values for QUERY_DIRECTORY / Find Requests will always be higher due to the inevitable length of the ABE calculation.

When you reach a state where all SMB requests other than QUERY_DIRECTORY are consistently answered in less than 10 ms, and QUERY_DIRECTORY in less than 50 ms, you have a very well performing ABE server.
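If you want to eyeball these counters from PowerShell rather than Performance Monitor, something along these lines works on Windows Server 2012 and later (a sketch; adjust the instance filter and sampling to taste):

    Get-Counter -Counter '\SMB Server Shares(*)\Avg. sec/Request' -SampleInterval 5 -MaxSamples 12 |
        ForEach-Object { $_.CounterSamples | Select-Object InstanceName, CookedValue }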

Other Symptoms

There are other symptoms of ABE problems that you may observe; however, none of them on its own is very telling without the information from the points above.

At first glance, high CPU utilization and a high processor queue length are indicators of an ABE problem; however, they are also indicators of other CPU-related performance issues. Not to mention there are cases where you encounter ABE performance problems without saturating all your CPUs.

The Server Work Queues\Active Threads (NonBlocking) counter will usually rise to its maximum allowed limit (MaxThreadsPerQueue), and the Server Work Queues\Queue Length will increase as well. Both indicate that the file server is busy, but on their own they don’t tell you how bad the situation is. There are also scenarios where the file server will not use up all allowed worker threads because of a bottleneck somewhere else, such as the disk subsystem or the CPU cycles available to it.

Should you choose to set up long-term monitoring (which you should), collect the following in order to spot trends:

  • Number of objects per directory or number of DFS links
  • Number of peak user requests (performance counter: Requests/sec)
  • Peak server response time to Find requests (performance counter: Avg. sec/Request)
  • Peak CPU utilization and peak processor queue length

If you collect those values every day (or a shorter interval), you can get a pretty good picture how much head-room you have left with your servers at the moment and if there are trends that you need to react to.

Feel free to add more information to your monitoring to get a better picture of the situation.  For example, gather information on how many DFS servers were active on any given day for a certain site, so you can tell whether unusually high numbers of user requests on the other servers come from a server downtime.

ABELevel

Some of you might have heard about the registry key ABELevel. The ABELevel value specifies the maximum level of the folders on which the ABE feature is enabled. While the title of the KB sounds very promising, and the hotfix is presented as a “Resolution”, the hotfix and registry value have very little practical application.  Here’s why:
ABELevel is a system-wide setting and does not differentiate between different shares on the same server. If you host several shares, you are unable to filter to different depths as the setting forces you to go for the deepest folder hierarchy. This results in unnecessary filter calculations for shares.

Usually the widest directories are on the upper levels—those levels that you need to filter.  Disabling the filtering for the lower level directories doesn’t yield much of a performance gain, as those small directories don’t have much impact on server performance, while the big top-level directories do.  Furthermore, the registry value doesn’t make any sense for DFS Namespaces as you have only one folder level there and you should avoid filtering on your fileservers anyway.

While we are talking about Updates

Here is one that you should install:
High CPU usage and performance issues occur when access-based enumeration is enabled in Windows 8.1 or Windows 7 – https://support.microsoft.com/en-us/kb/2920591

Furthermore you should definitely review the Lists of recommended updates for your server components:
DFS
https://support.microsoft.com/en-us/kb/968429 (2008 / 2008 R2)
https://support.microsoft.com/en-us/kb/2951262 (2012 / 2012 R2)

File Services
https://support.microsoft.com/en-us/kb/2473205 (2008 / 2008 R2)
https://support.microsoft.com/en-us/kb/2899011 (2012 / 2012 R2)

Well then, this concludes this small (my first) blog series.
I hope you found reading it worthwhile and got some input for your infrastructures out there.

With best regards
Hubert

Troubleshooting failed password changes after installing MS16-101


Hi!

Linda Taylor here, Senior Escalation Engineer in the Directory Services space.

I have spent the last month working with customers worldwide who experienced password change failures after installing the updates under the MS16-101 security bulletin KBs (listed below), as well as working with the product group to get those addressed and documented in the public KB articles under the known issues section. It has been busy!

In this post I will aim to provide you with a quick “cheat sheet” of known issues and needed actions as well as ideas and troubleshooting techniques to get there.

Let’s start by understanding the changes.

The following 6 articles describe the changes in MS16-101 as well as a list of Known issues. If you have not yet applied MS16-101 I would strongly recommend reading these and understanding how they may affect you.

        3176492 Cumulative update for Windows 10: August 9, 2016
        3176493 Cumulative update for Windows 10 Version 1511: August 9, 2016
        3176495 Cumulative update for Windows 10 Version 1607: August 9, 2016
        3178465 MS16-101: Security update for Windows authentication methods: August 9, 2016
        3167679 MS16-101: Description of the security update for Windows authentication methods: August 9, 2016
        3177108 MS16-101: Description of the security update for Windows authentication methods: August 9, 2016

The good news is that this month’s updates address some of the known issues with MS16-101.

The bad news is that not all the issues are caused by a code defect in MS16-101; in some cases the right solution is to make your environment more secure by ensuring that the password change can happen over Kerberos and does not need to fall back to NTLM. That may include opening the TCP ports used by Kerberos, fixing other Kerberos problems like missing SPNs, or changing your application code to pass in a valid domain name.

Let’s start with the basics…

Symptoms:

After applying MS16-101 fixes listed above, password changes may fail with the error code

“The system detected a possible attempt to compromise security. Please make sure that you can contact the server that authenticated you.”
Or
“The system cannot contact a domain controller to service the authentication request. Please try again later.”

This text maps to the error codes below:

Hexadecimal | Decimal | Symbolic | Friendly
0xc0000388 | 1073740920 | STATUS_DOWNGRADE_DETECTED | The system detected a possible attempt to compromise security. Please make sure that you can contact the server that authenticated you.
0x80074f1 | 1265 | ERROR_DOWNGRADE_DETECTED | The system detected a possible attempt to compromise security. Please make sure that you can contact the server that authenticated you.

Question: What does MS16-101 do and why would password changes fail after installing it?

Answer: As documented in the listed KB articles, the security updates that are provided in MS16-101 disable the ability of the Microsoft Negotiate SSP to fall back to NTLM for password change operations in the case where Kerberos fails with the STATUS_NO_LOGON_SERVERS (0xc000005e) error code.
In this situation, the password change will now fail (post MS16-101) with the above mentioned error codes (ERROR_DOWNGRADE_DETECTED / STATUS_DOWNGRADE_DETECTED).
Important: Password RESET is not affected by MS16-101 at all in any scenario. Only password change using the Negotiate package is affected.

So, now you understand the change, let’s look at the known issues and learn how to best identify and resolve those.

Summary and Cheat Sheet

To make it easier to follow I have matched the ordering of known issues in this post with the public KB articles above.

First, when troubleshooting a failed password change post MS16-101 you will need to understand HOW and WHERE the password change is happening and if it is for a domain account or a local account. Here is a cheat sheet.

Summary of scenarios and a quick reference of the actions needed:

Scenario / known issue 1: Domain password change fails via CTRL+ALT+DEL and shows an error like this: “System detected a possible attempt to compromise security. Please ensure that you can contact the server that authenticated you.”
Action needed: Troubleshoot using this guide and fix Kerberos.

Scenario / known issue 2: Domain password change fails via application code with an INCORRECT/UNEXPECTED error code when a password that does not meet password complexity is entered. For example, before installing MS16-101 such a password change may have returned a status like STATUS_PASSWORD_RESTRICTION; after installing MS16-101 it returns STATUS_DOWNGRADE_DETECTED, causing your application to behave in an unexpected way or even crash. Note: in these cases the password change works when a correct new password is entered that complies with the password policy.
Action needed: Install the October fixes in the table below.

Scenario / known issue 3: Local user account password change fails via CTRL+ALT+DEL or application code.
Action needed: Install the October fixes in the table below.

Scenario / known issue 4: Passwords for disabled and locked-out user accounts cannot be changed using the Negotiate method.
Action needed: None. By design.

Scenario / known issue 5: Domain password change fails via application code when a good password is entered. This is the case where, if you pass a server name to NetUserChangePassword, the password change fails post MS16-101, because it previously worked only by relying on NTLM. NTLM is insecure and Kerberos is always preferred, so passing a domain name here is the way forward. Note that most of the ADSI and C#/.NET ChangePassword APIs end up calling NetUserChangePassword under the hood, so passing invalid domain names to these APIs will also fail. I have provided a detailed walkthrough example in this post with log snippets, and there is a short sketch right after this summary.
Action needed: Troubleshoot using this guide and fix the code to use Kerberos.

Scenario / known issue 6: After you install the MS16-101 update, you may encounter 0xC0000022 NTLM authentication errors.
Action needed: See KB3195799, NTLM authentication fails with 0xC0000022 error for Windows Server 2012, Windows 8.1, and Windows Server 2012 R2 after update is applied.

Scenario / known issue 7: After you install the security updates that are described in MS16-101, remote programmatic changes of a local user account password, and password changes across untrusted forests, fail with the STATUS_DOWNGRADE_DETECTED error as documented in this post. This happens because the operation relies on NTLM fallback, and there is no Kerberos without a trust. NTLM fallback is forbidden by MS16-101.
Action needed: Install the October fixes in the table below and set the registry key NegoAllowNtlmPwdChangeFallback documented in the KBs below, which allows the NTLM fallback to happen again and unblocks this scenario.

http://support.microsoft.com/kb/3178465
http://support.microsoft.com/kb/3167679
http://support.microsoft.com/kb/3177108
http://support.microsoft.com/kb/3176492
http://support.microsoft.com/kb/3176495
http://support.microsoft.com/kb/3176493

Note: you may also consider using this registry key in an emergency for known issue 5 when it takes time to update the application code. However, please read the above articles carefully and only consider this a short-term solution for scenario 5.

Table of fixes for the known issues above (released 2016.10.11), taken from the MS16-101 Security Bulletin:

OS | Fix needed
Vista / W2K8 | Re-install 3167679, re-released 2016.10.11
Win7 / W2K8 R2 | Install 3192391 (security only) or 3185330 (monthly rollup that includes security fixes)
WS12 | Install 3192393 (security only) or 3185332 (monthly rollup that includes security fixes)
Win8.1 / WS12 R2 | Install 3192392 (security only) or 3185331 (monthly rollup that includes security fixes)
Windows 10 | For 1511: 3192441 Cumulative update for Windows 10 Version 1511: October 11, 2016; for 1607: 3194798 Cumulative update for Windows 10 Version 1607 and Windows Server 2016: October 11, 2016

Troubleshooting

As I mentioned, this post is intended to support the documentation of the known issues in the Ms16-101 KB articles and provide help and guidance for troubleshooting. It should help you identify which known issue you are experiencing as well as provide resolution suggestions for each case.

I have also included a troubleshooting walkthrough of some of the more complex example cases. We will start with the problem definition, and then look at the available logs and tools to identify a suitable resolution. The idea is to teach “how to fish”, because there can be many different scenarios, and hopefully you can apply these techniques and use the log files documented here to help resolve the issues when needed.

Once you know the scenario that applies to your password change, the next step is usually to collect some data on the server or client where the password change is occurring. For example, if you have a web server running a password change application and doing password changes on behalf of users, you will need to collect the logs there. If in doubt, collect the logs from all involved machines and then look for the one actually doing the password change, using the snippets in the examples. Here are the helpful logs.

DATA COLLECTION

The same logs will help in all of the scenarios.

LOGS

1. SPNEGO debug log / LSASS.log

To enable this log run the following commands from an elevated admin CMD prompt to set the below registry keys:

reg add HKLM\SYSTEM\CurrentControlSet\Control\LSA /v SPMInfoLevel /t REG_DWORD /d 0xC03E3F /f
reg add HKLM\SYSTEM\CurrentControlSet\Control\LSA /v LogToFile /t REG_DWORD /d 1 /f
reg add HKLM\SYSTEM\CurrentControlSet\Control\LSA /v NegEventMask /t REG_DWORD /d 0xF /f


  • This will log Negotiate debug output to the %windir%\system32\lsass.log.
  • There is no need for reboot. The log is effective immediately.
  • Lsass.log is a text file that is easy to read with a text editor such as Wordpad.

2. Netlogon.log:

This log has been around for many years and is useful for troubleshooting DC LOCATOR traffic. It can be used together with a network trace to understand why the STATUS_NO_LOGON_SERVERS is being returned for the Kerberos password change attempt.

· To enable Netlogon debug logging run the following command from an elevated CMD prompt:

            nltest /dbflag:0x26FFFFFF

· The resulting log is found in %windir%\debug\netlogon.log & netlogon.bak

· There is no need for reboot. The log is effective immediately. See also 109626 Enabling debug logging for the Net Logon service

· The Netlogon.log (and Netlogon.bak) is a text file.

           Open the log with any text editor (I like good old Notepad.exe)

3. Collect a Network trace during the password change issue using the tool of your choice.

Scenarios, Explanations and Walkthroughs:

When reading this you should keep in mind that you may be seeing more than one scenario. The best thing to do is to start with one, fix that and see if there are any other problems left.

1. Domain password change fails via CTRL+ALT+DEL

This is most likely a Kerberos DC locator failure of some kind where the password changes were relying on NTLM before installing MS16-101 and are now failing. This is the simplest and easiest case to resolve using basic Kerberos troubleshooting methods.

Solution: Fix Kerberos.

Some tips from cases which we saw:

1. Use the network trace to identify whether the necessary communication ports are open. This was quite a common issue, so start by checking this. (A quick ad-hoc port check is also sketched at the end of this section.)

         In order for Kerberos password changes to work, communication on TCP port 464 needs to be open between the client doing the
         password change and the domain controller.

Note on RODC: Read-only domain controllers (RODCs) can service password changes if the user is allowed by the RODCs password replication policy. Users who are not allowed by the RODC password policy require network connectivity to a read/write domain controller (RWDC) in the user account domain to be able to change the password.

           To check whether TCP port 464 is open, follow these steps (also documented in KB3167679):

             a. Create an equivalent display filter for your network monitor parser. For example:

                            ipv4.address== <ip address of client> && tcp.port==464

             b. In the results, look for the “TCP:[SynReTransmit” frame.

If you find these, then investigate firewall and open ports. It is often useful to take a simultaneous trace from the client and the domain controller and check if the packets are arriving at the other end.

2. Make sure that the target Kerberos names are valid.

  • IP addresses are not valid Kerberos names
  • Kerberos supports short names and fully qualified domain names. Like CONTOSO or Contoso.com

3. Make sure that service principal names (SPNs) are registered correctly.

For more information on troubleshooting Kerberos see https://blogs.technet.microsoft.com/askds/2008/05/14/troubleshooting-kerberos-authentication-problems-name-resolution-issues/ or https://technet.microsoft.com/en-us/library/cc728430(v=ws.10).aspx
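To quickly confirm from the client that TCP port 464 is reachable without taking a trace, a one-liner works on Windows 8.1 / Windows Server 2012 R2 and later (the DC name below is a placeholder):

    Test-NetConnection -ComputerName dc01.contoso.com -Port 464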

2. Domain password change fails via application code with an INCORRECT/UNEXPECTED error code when a password that does not meet password complexity requirements is entered.

For example, before installing MS16-101, such a password change may have returned a status like STATUS_PASSWORD_RESTRICTION. After installing MS16-101 it returns STATUS_DOWNGRADE_DETECTED, causing your application to behave in an unexpected way or even crash.

Note: In this scenario, the password change succeeds when a correct new password that complies with the password policy is entered.

Cause:

This issue is caused by a code defect in ADSI whereby the status returned from Kerberos was not passed back to the caller correctly.
Here is a more detailed explanation for the geek in you:

Before MS16-101 behavior:

           1. An application calls the ChangePassword method using the ADSI LDAP provider.
           Setting and changing passwords with the ADSI LDAP Provider is documented here.
           Under the hood this calls Negotiate/Kerberos to change the password using a valid realm name.
           Kerberos returns STATUS_PASSWORD_RESTRICTION or another failure code.

           2. A 2nd ChangePassword call is made via the NetUserChangePassword API with the <dcname> passed as the realm name. This again uses
           Negotiate and retries Kerberos. Kerberos fails with STATUS_NO_LOGON_SERVERS because a DC name is not a valid realm name.

           3. Negotiate then retries over NTLM, which succeeds or returns the same previous failure status.

The password change fails if a bad password was entered, and the NTLM error code is returned back to the application. If a valid password was entered, everything works: the 1st call passes in a good name, so if Kerberos works the password change succeeds and you never get to step 3.
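For reference, a password change through the ADSI LDAP provider from script looks roughly like this (a minimal sketch; the user DN and passwords are placeholders). Under the hood it invokes IADsUser::ChangePassword, which is the code path described above:

<snip>
# Bind to the user object with the ADSI LDAP provider (placeholder DN)
$user = [ADSI]"LDAP://CN=TestUser,OU=Accounts,DC=contoso,DC=com"

# Invoke IADsUser::ChangePassword with the old and new passwords
$user.psbase.Invoke("ChangePassword", "oldPassword!123", "newPassword!123")
<snip>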

Post-MS16-101 behavior / why it fails with MS16-101 installed:

         1. An application calls the ChangePassword method using the ADSI LDAP provider. This calls Negotiate for the password change with
          a valid realm name.
         Kerberos returns STATUS_PASSWORD_RESTRICTION or another failure code.

         2. A 2nd ChangePassword call is made via NetUserChangePassword with the <dcname> as the realm name, which fails over Kerberos with
         STATUS_NO_LOGON_SERVERS and triggers NTLM fallback.

          3. Because NTLM fallback is blocked by MS16-101, the error STATUS_DOWNGRADE_DETECTED is returned to the calling application.

Solution: Easy. Install the October update, which fixes this issue. The fix lies in adsmsext.dll, included in the October updates.

Again, here are the updates you need to install, taken from the MS16-101 Security Bulletin:

OS                   Fix needed
Vista / W2K8         Re-install 3167679, re-released 2016.10.11
Win7 / W2K8 R2       Install 3192391 (security only) or 3185330 (monthly rollup that includes security fixes)
WS12                 Install 3192393 (security only) or 3185332 (monthly rollup that includes security fixes)
Win8.1 / WS12 R2     Install 3192392 (security only) or 3185331 (monthly rollup that includes security fixes)
Windows 10           For 1511: 3192441 Cumulative update for Windows 10 Version 1511: October 11, 2016
                     For 1607: 3194798 Cumulative update for Windows 10 Version 1607 and Windows Server 2016: October 11, 2016

3. Local user account password change fails via CTRL+ALT+DEL or application code.

Installing the October updates above should also resolve this.

MS16-101 had a defect where Negotiate did not correctly determine that the password change was local and would try to find a DC, using the local machine name as the domain name.

This failed, and because NTLM fallback was no longer allowed post-MS16-101, the password change failed with STATUS_DOWNGRADE_DETECTED.

Example:

One such scenario I saw was where password changes of local user accounts via CTRL+ALT+DEL failed with the message "The system detected a possible attempt to compromise security. Please ensure that you can contact the server that authenticated you." This happened when the following group policy was set and a password change was attempted for a local account:

Policy:     Computer Configuration \ Administrative Templates \ System \ Logon \ "Assign a default domain for logon"
Path:       HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\DefaultLogonDomain
Setting:    DefaultLogonDomain
Data Type:  REG_SZ
Value:      "." (without quotes). The period or "dot" designates the local machine name.
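To check whether this policy value is present on an affected machine, you can read it straight from the registry (a minimal sketch; the value only exists if the policy has been applied):

            Get-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System' -Name DefaultLogonDomain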

Cause: In this case, post-MS16-101, Negotiate incorrectly determined that the account was not local and tried to discover a DC using \\<machinename> as the domain name, which failed. This caused the password change to fail with the STATUS_DOWNGRADE_DETECTED error.

Solution: Install October fixes listed in the table at the top of this post.

4. Passwords for disabled and locked-out user accounts cannot be changed using the Negotiate method.

MS16-101 intentionally disabled changing the passwords of locked-out or disabled user accounts via Negotiate; this is by design.

Important: Password reset is not affected by MS16-101 in any scenario; only password change is. Therefore, any application that performs a password reset is unaffected by MS16-101.

Another important thing to note is that MS16-101 only affects applications using Negotiate. Therefore, it is still possible to change locked-out and disabled account passwords using other methods, such as LDAPS.

For example, the PowerShell cmdlet Set-ADAccountPassword will continue to work for locked-out and disabled account password changes, because it does not use Negotiate; see the sketch below.
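A minimal sketch of such a password change (it assumes the ActiveDirectory module from RSAT is available; TestUser and the passwords are placeholders):

<snip>
Import-Module ActiveDirectory

# Change (not reset) the user's password; this path does not use Negotiate
$old = ConvertTo-SecureString "oldPassword!123" -AsPlainText -Force
$new = ConvertTo-SecureString "newPassword!123" -AsPlainText -Force
Set-ADAccountPassword -Identity TestUser -OldPassword $old -NewPassword $new
<snip>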

5. Troubleshooting Domain password change failure via application code when a good password is entered.

This is one of the most difficult scenarios to identify and troubleshoot, so I have provided a more detailed example here, including sample code, the cause, and the solution.

In summary, the solution in these cases is almost always to correct the application code, which may be passing in an invalid domain name such that Kerberos fails with STATUS_NO_LOGON_SERVERS.

Scenario:

An application is using the System.DirectoryServices.AccountManagement namespace to change a user's password.
https://msdn.microsoft.com/en-us/library/system.directoryservices.accountmanagement(v=vs.110).aspx

After installing MS16-101, password changes fail with STATUS_DOWNGRADE_DETECTED. Example failing .NET code snippet (run from PowerShell) that worked before MS16-101:

<snip>

Add-Type -AssemblyName System.DirectoryServices.AccountManagement
$ct = [System.DirectoryServices.AccountManagement.ContextType]::Domain
$ctoptions = [System.DirectoryServices.AccountManagement.ContextOptions]::SimpleBind -bor [System.DirectoryServices.AccountManagement.ContextOptions]::ServerBind
$pc = New-Object System.DirectoryServices.AccountManagement.PrincipalContext($ct, "contoso.com", "OU=Accounts,DC=Contoso,DC=Com", $ctoptions)
$idType = [System.DirectoryServices.AccountManagement.IdentityType]::SamAccountName
$up = [System.DirectoryServices.AccountManagement.UserPrincipal]::FindByIdentity($pc, $idType, "TestUser")
$up.ChangePassword("oldPassword!123", "newPassword!123")

<snip>

Data Analysis

There are two possibilities here:
(a) The application code is passing an incorrect domain name parameter, causing the Kerberos password change to fail to locate a DC.
(b) The application code is good, and the Kerberos password change fails for another reason, such as a blocked port, a DNS issue, or a missing SPN.

Let's start with (a): the application code is passing an incorrect domain name/parameter, causing the Kerberos password change to fail to locate a DC.

(a) Data Analysis Walkthrough Example based on a real case:

1. Start with Lsass.log (SPNEGO trace)

If you are troubleshooting a password change failure after MS16-101, look for the following text in Lsass.log, which indicates that Kerberos failed and NTLM fallback was forbidden by MS16-101:

Failing Example:

[ 9/13 10:23:36] 492.2448> SPM-WAPI: [11b0.1014] Dispatching API (Message 0)
[ 9/13 10:23:36] 492.2448> SPM-Trace: [11b0] LpcDispatch: dispatching ChangeAccountPassword (1a)
[ 9/13 10:23:36] 492.2448> SPM-Trace: [11b0] LpcChangeAccountPassword()
[ 9/13 10:23:36] 492.2448> SPM-Helpers: [11b0] LsapCopyFromClient(0000005EAB78C9D8, 000000DA664CE5E0, 16) = 0
[ 9/13 10:23:36] 492.2448> SPM-Neg: NegChangeAccountPassword:
[ 9/13 10:23:36] 492.2448> SPM-Neg: NegChangeAccountPassword, attempting: NegoExtender
[ 9/13 10:23:36] 492.2448> SPM-Neg: NegChangeAccountPassword, attempting: Kerberos
[ 9/13 10:23:36] 492.2448> SPM-Warning: Failed to change password for account Test: 0xc000005e
[ 9/13 10:23:36] 492.2448> SPM-Neg: NegChangeAccountPassword, attempting: NTLM
[ 9/13 10:23:36] 492.2448> SPM-Neg: NegChangeAccountPassword, NTLM failed: not allowed to change domain passwords
[ 9/13 10:23:36] 492.2448> SPM-Neg: NegChangeAccountPassword, returning: 0xc0000388

  • 0xc000005E is STATUS_NO_LOGON_SERVERS
  • 0xc0000388 is STATUS_DOWNGRADE_DETECTED

If you see this, it means Kerberos failed to locate a domain controller in the domain, and fallback to NTLM is not allowed by MS16-101. Next, look at the Netlogon.log and the network trace to understand why.

2. Network trace

Look at the network trace and filter the traffic based on the client IP, DNS, and any authentication-related traffic.
You may see the client requesting a Kerberos ticket using an invalid SPN, for example:


Source    Destination    Description
Client    DC1            KerberosV5:TGS Request Realm: CONTOSO.COM Sname: ldap/contoso.com   {TCP:45, IPv4:7}
DC1       Client         KerberosV5:KRB_ERROR – KDC_ERR_S_PRINCIPAL_UNKNOWN (7)   {TCP:45, IPv4:7}

So here the client tried to get a ticket for the ldap/contoso.com SPN and failed with KDC_ERR_S_PRINCIPAL_UNKNOWN because this SPN is not registered anywhere.

  • This is expected. A valid LDAP SPN looks like ldap/DC1.contoso.com

Next let’s check the Netlogon.log

3. Netlogon.log:

Open the log with any text editor (I like good old Notepad.exe) and check the following:

  • Is a valid domain name being passed to DC locator?

Invalid names such as \\servername.contoso.com or an IP address such as \\x.y.z.w will cause DC Locator to fail, and thus the Kerberos password change returns STATUS_NO_LOGON_SERVERS. Once that happens, NTLM fallback is not allowed and the password change fails.

If you find this issue, examine the application code and make the necessary changes to ensure a correctly formatted domain name is passed to whichever ChangePassword API is being used.

Example of failure in Netlogon.log:

[MISC] [PID] DsGetDcName function called: client PID=1234, Dom:\\contoso.com Acct:(null) Flags: IP KDC
[MISC] [PID] DsGetDcName function returns 1212 (client PID=1234): Dom:\\contoso.com Acct:(null) Flags: IP KDC

\\contoso.com is not a valid domain name. (contoso.com is a valid domain name)

This error translates to:

            1212 (0x4BC)    ERROR_INVALID_DOMAINNAME    "The format of the specified domain name is invalid."    (winerror.h)

So what happened here?

The application code passed an invalid TargetName to Kerberos. It used the domain name as a server name, which is why we see the SPN ldap/contoso.com.

The client tried to get a ticket for this SPN and failed with KDC_ERR_S_PRINCIPAL_UNKNOWN because the SPN is not registered anywhere. As noted above, this is expected; a valid LDAP SPN looks like ldap/DC1.contoso.com.

The application code then tried the password change again and passed in \\contoso.com as the domain name. Anything beginning with \\ is not a valid domain name, and neither is an IP address, so DC Locator will fail to locate a DC when given such a name. We can see this in the Netlogon.log and the network trace.
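As a sanity check, you can confirm that DC Locator succeeds when it is given a correctly formatted domain name by running it manually on the affected client (contoso.com is a placeholder for your domain). If this also fails, the problem is with DC discovery itself rather than with the name the application passes in; see (b) below:

            nltest /dsgetdc:contoso.com /kdc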

Conclusion and Solution

If the domain name is invalid here, examine the code snippet that is doing the password change to understand why the wrong name is being passed in.

The fix in these cases is to change the code so that a valid domain name is passed to Kerberos, allowing the password change to happen successfully over Kerberos rather than NTLM. NTLM is not secure; if Kerberos is possible, it should be the protocol used.

SOLUTION

The solution here was to remove "ContextOptions.ServerBind | ContextOptions.SimpleBind" and allow the code to use the default (Negotiate). Note that the combination of a Domain context with ServerBind is what caused the issue; Negotiate with a Domain context is the option that works and is able to use Kerberos successfully.

Working code:

<snip>
Add-Type -AssemblyName System.DirectoryServices.AccountManagement
$ct = [System.DirectoryServices.AccountManagement.ContextType]::Domain
$pc = New-Object System.DirectoryServices.AccountManagement.PrincipalContext($ct, "contoso.com", "OU=Accounts,DC=Contoso,DC=Com")
$idType = [System.DirectoryServices.AccountManagement.IdentityType]::SamAccountName
$up = [System.DirectoryServices.AccountManagement.UserPrincipal]::FindByIdentity($pc, $idType, "TestUser")
$up.ChangePassword("oldPassword!123", "newPassword!123")

<snip>

Why does this code work before MS16-101 and fail after?

ContextOptions are documented here: https://msdn.microsoft.com/en-us/library/system.directoryservices.accountmanagement.contextoptions(v=vs.110).aspx

Specifically: “This parameter specifies the options that are used for binding to the server. The application can set multiple options that are linked with a bitwise OR operation. “

Passing in a domain name such as contoso.com with the ContextOptions ServerBind or SimpleBind causes the client to attempt to use an SPN like ldap/contoso.com, because it expects the name that is passed in to be a server name.

This is not a valid SPN and does not exist, so Kerberos fails with STATUS_NO_LOGON_SERVERS.
Before MS16-101, in this scenario, the Negotiate package would fall back to NTLM, attempt the password change using NTLM, and succeed.
Post-MS16-101 this fallback is not allowed and Kerberos is enforced.

(b) If the application code is good but Kerberos fails to locate a DC for another reason

If you see a correct domain name and SPNs in the above logs, then the issue is that Kerberos fails for some other reason, such as blocked TCP ports. In this case, go back to Scenario 1 to troubleshoot why Kerberos failed to locate a domain controller.

There is a chance that you may have both (a) and (b). Traces and logs are the best tools to identify which.

Scenario 6: After you install the MS16-101 update, you may encounter 0xC0000022 NTLM authentication errors.

I will not go into detail on this scenario as it is well described in KB article 3195799, NTLM authentication fails with 0xC0000022 error for Windows Server 2012, Windows 8.1, and Windows Server 2012 R2 after update is applied.

That’s all for today! I hope you find this useful. I will update this post if any new information arises.

Linda Taylor | Senior Escalation Engineer | Windows Directory Services
(A well established member of the content police.)
