Register-PSSessionConfiguration causes the WinRM service to hang in the 'Stopping' state
Hi, I use DSC to deploy JEA configuration on many Windows Server 2012 R2 hosts:
PS > $psversiontable
Name Value
---- -----
PSVersion 5.1.14409.1012
PSEdition Desktop
PSCompatibleVersions {1.0, 2.0, 3.0, 4.0...}
BuildVersion 10.0.14409.1012
CLRVersion 4.0.30319.42000
WSManStackVersion 3.0
PSRemotingProtocolVersion 2.3
SerializationVersion 1.1.0.1
About 3 times out of 4, when Register-PSSessionConfiguration is triggered by the DSC module, WinRM service is restarted but hangs on Stopping.
It seems to happen more frequently when the configuration causes WinRM to change Logon As (from Network Service to Local System).
Is there a 'correct' way to avoid this behaviour ?
We use the following script to force restart WinRM service (with SCCM as we lost PS remoting ability on host):
$winRMService = Get-Service -Name 'WinRM'
if ($winRMService -and $winRMService.Status -eq 'StopPending') {
$processId = Get-CimInstance -ClassName 'Win32_Service' -Filter "Name LIKE 'WinRM'" | Select-Object -Expand 'ProcessId'
$serviceList = Get-CimInstance -ClassName 'Win32_Service' -Filter "ProcessId=$processId" | Select-Object -Expand 'Name'
$failure = @()
Write-Host "Forcing process $processId to stop ..." -NoNewline
try {
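# -ErrorAction Stop turns a failure into a terminating error so the catch block below runs
Stop-Process -Id $processId -Force -ErrorAction Stop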
Write-Host ' done'
Write-Host 'Waiting 5 seconds'
Start-Sleep -Seconds 5
foreach ($service in $serviceList) {
Write-Host "Starting service $service ..." -NoNewline
try {
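# -ErrorAction Stop turns a failed start into a terminating error so the catch block below runs
Start-Service -Name $service -ErrorAction Stop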
Write-Host ' done'
} catch {
Write-Host ' failed'
$failure += "Start service $service"
}
}
} catch {
Write-Host ' failed'
$failure += "Kill WinRM process"
}
if ($failure) {
Throw "Failed to execute following operation(s): $($failure -join ', ')"
}
}
Should we add detection and mitigation of the WinRM restart problem directly in the DSC resource? I can provide a PR for that (with less verbose code ;-))
@PaulHigin: are you aware of problems in Register-PSSessionConfiguration/WinRM that could explain this behavior (and issue #31 ) ?
/cc @manojampalam for the WinRM aspects
Thanks for reporting these issues, Julien. I've seen this behavior a few times, but nowhere near as frequently or consistently as you are describing. Have you seen this behavior on 2008 R2 / 2012 / 2016 as well, or just 2012 R2?
For now, I only deployed my configuration on 85 hosts, but only on Windows 2012 R2. I'll have some targets on 2008 R2 soon, but only a few. I plan to deploy on 2016 soon too... So, only tested on 2012 R2.
This bug reproduced consistently on 2016. I worked around it by putting the call to Register-PSSessionConfiguration within a PSJob, waiting for 10 seconds, and setting $global:DSCMachineStatus = 1 at the end of the Set block.
I will try and make my code a bit cleverer and then post it
@djwork: what about calling Register-PSSessionConfiguration within a PSJob and entering a loop until the job has completed or a timeout of, say, 30 seconds has expired? While in the loop, if the service has been in 'StopPending' for more than, say, 5 seconds, we run the 'force restart' script I mentioned above.
It would be quite safe as the WinRM isn't left hanging, other services running with the same process are restarted as well and the resource would be compliant at first run.
Of course, patching WinRM to avoid hanging would be the best solution ;-)
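The loop proposed above could be sketched roughly as follows. This is a minimal illustration, not the code that ended up in the resource: the function name Invoke-WithServiceWatchdog and all of its parameters are hypothetical, and the status check and the recovery step are passed in as script blocks so that the forced-restart script from earlier in the thread can be plugged in.

```powershell
# Hypothetical helper: runs $Action in a background job and, if the watched
# service reports 'StopPending' for $StuckSeconds, calls $Recover once.
function Invoke-WithServiceWatchdog {
    param(
        [Parameter(Mandatory)] [scriptblock] $Action,     # work to run in the job
        [object[]] $ArgumentList,                          # arguments for $Action
        [Parameter(Mandatory)] [scriptblock] $GetStatus,  # returns the service status
        [Parameter(Mandatory)] [scriptblock] $Recover,    # forced restart of the service
        [int] $TimeoutSeconds = 30,
        [int] $StuckSeconds = 5
    )

    $jobArgs = @{ ScriptBlock = $Action }
    if ($ArgumentList) { $jobArgs['ArgumentList'] = $ArgumentList }
    $job = Start-Job @jobArgs

    $finalStates = 'Completed', 'Failed', 'Stopped'
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    $stuckSince = $null
    $recovered = $false

    while (($finalStates -notcontains "$($job.State)") -and ((Get-Date) -lt $deadline)) {
        if ("$(& $GetStatus)" -eq 'StopPending') {
            if (-not $stuckSince) { $stuckSince = Get-Date }
            $stuckFor = ((Get-Date) - $stuckSince).TotalSeconds
            if ((-not $recovered) -and ($stuckFor -ge $StuckSeconds)) {
                $null = & $Recover
                $recovered = $true
            }
        }
        else {
            $stuckSince = $null
        }
        Start-Sleep -Milliseconds 500
    }

    $result = [pscustomobject]@{
        Completed = ("$($job.State)" -eq 'Completed')
        Recovered = $recovered
    }
    # Removing a job that is still running may itself block, so only finished jobs are cleaned up here.
    if ($finalStates -contains "$($job.State)") { Remove-Job -Job $job -Force }
    return $result
}

# Possible call for the WinRM case (not run here):
# Invoke-WithServiceWatchdog `
#     -Action { param($n, $p) Register-PSSessionConfiguration -Name $n -Path $p -Force -ErrorAction Stop } `
#     -ArgumentList $endpointName, $psscPath `
#     -GetStatus { (Get-Service -Name WinRM).Status } `
#     -Recover { <# forced-restart script #> }
```

The returned object only says whether the job completed and whether the recovery step ran; deciding if the resource is then compliant is left to the caller.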
@jnury That's what I did
[DscResource()]
class JeaEndpoint {
## The mandatory endpoint name. Use 'Microsoft.PowerShell' by default.
[DscProperty(Key)]
[string] $EndpointName = 'Microsoft.PowerShell'
## The mandatory role definition map to be used for the endpoint. This
## should be a string that represents the Hashtable used for the RoleDefinitions
## property in New-PSSessionConfigurationFile, such as:
## RoleDefinitions = '@{ Everyone = @{ RoleCapabilities = "BaseJeaCapabilities" } }'
[Dscproperty(Mandatory)]
[string] $RoleDefinitions
## The optional groups to be used when the endpoint is configured to
## run as a Virtual Account
[DscProperty()]
[string[]] $RunAsVirtualAccountGroups
## The optional Group Managed Service Account (GMSA) to use for this
## endpoint. If configured, will disable the default behaviour of
## running as a Virtual Account
[DscProperty()]
[string] $GroupManagedServiceAccount
## The optional directory for transcripts to be saved to
[DscProperty()]
[string] $TranscriptDirectory
## The optional startup script for the endpoint
[DscProperty()]
[string[]] $ScriptsToProcess
## The optional switch to enable mounting of a restricted user drive
[Dscproperty()]
[bool] $MountUserDrive
## The optional size of the user drive. The default is 50MB.
[Dscproperty()]
[long] $UserDriveMaximumSize
## The optional number of seconds to wait for registering the endpoint to complete.
## The default is 10 seconds.
[Dscproperty()]
[int] $HungRegistrationTimeout = 10
## The optional number of times to retry starting the WinRM service.
## The default is 10.
[Dscproperty()]
[int] $MaximumWinRMStartRetry = 10
## The optional expression declaring which domain groups (for example,
## two-factor authenticated users) connected users must be members of. This
## should be a string that represents the Hashtable used for the RequiredGroups
## property in New-PSSessionConfigurationFile, such as:
## RequiredGroups = '@{ And = "RequiredGroup1", @{ Or = "OptionalGroup1", "OptionalGroup2" } }'
[Dscproperty()]
[string] $RequiredGroups
## Applies the JEA configuration
[void] Set()
{
$psscPath = Join-Path ([IO.Path]::GetTempPath()) ([IO.Path]::GetRandomFileName() + ".pssc")
## Convert the RoleDefinitions string to the actual Hashtable
$roleDefinitionsHash = $this.ConvertStringToHashtable($this.RoleDefinitions)
$configurationFileArguments = @{
Path = $psscPath
RoleDefinitions = $roleDefinitionsHash
SessionType = 'RestrictedRemoteServer'
}
if($this.RunAsVirtualAccountGroups -and $this.GroupManagedServiceAccount)
{
throw "The RunAsVirtualAccountGroups setting can not be used when a configuration is set to run as a Group Managed Service Account"
}
## Set up the JEA identity
if($this.RunAsVirtualAccountGroups)
{
$configurationFileArguments["RunAsVirtualAccount"] = $true
$configurationFileArguments["RunAsVirtualAccountGroups"] = $this.RunAsVirtualAccountGroups
}
elseif($this.GroupManagedServiceAccount)
{
$configurationFileArguments["GroupManagedServiceAccount"] = $this.GroupManagedServiceAccount -replace '\$$', ''
}
else
{
$configurationFileArguments["RunAsVirtualAccount"] = $true
}
## Transcripts
if($this.TranscriptDirectory)
{
$configurationFileArguments["TranscriptDirectory"] = $this.TranscriptDirectory
}
## Startup scripts
if($this.ScriptsToProcess)
{
$configurationFileArguments["ScriptsToProcess"] = $this.ScriptsToProcess
}
## Mount user drive
if($this.MountUserDrive)
{
$configurationFileArguments["MountUserDrive"] = $this.MountUserDrive
}
## User drive maximum size
if($this.UserDriveMaximumSize)
{
$configurationFileArguments["UserDriveMaximumSize"] = $this.UserDriveMaximumSize
$configurationFileArguments["MountUserDrive"] = $true
}
## Required groups
if($this.RequiredGroups)
{
## Convert the RequiredGroups string to the actual Hashtable
$requiredGroupsHash = $this.ConvertStringToHashtable($this.RequiredGroups)
$configurationFileArguments["RequiredGroups"] = $requiredGroupsHash
}
## Register the endpoint
try
{
## If we are replacing Microsoft.PowerShell, create a 'break the glass' endpoint
if($this.EndpointName -eq "Microsoft.PowerShell")
{
$breakTheGlassName = "Microsoft.PowerShell.Restricted"
if(-not (Get-PSSessionConfiguration -Name ($breakTheGlassName + "*") |
Where-Object Name -eq $breakTheGlassName))
{
Register-PSSessionConfiguration -Name $breakTheGlassName
}
}
## Remove the previous one, if any.
$existingConfiguration = Get-PSSessionConfiguration -Name ($this.EndpointName + "*") |
Where-Object Name -eq $this.EndpointName
if($existingConfiguration)
{
Unregister-PSSessionConfiguration -Name $this.EndpointName
}
## Create the configuration file
New-PSSessionConfigurationFile @configurationFileArguments
#Register-PSSessionConfiguration has been hanging because the WinRM service is stuck in Stopping state
#therefore we need to run Register-PSSessionConfiguration within a job to allow us to handle a hanging WinRM service
Start-Job -ScriptBlock {
param($endpointName, $psscPath)
Register-PSSessionConfiguration -Name $endpointName -Path $psscPath -Force -ErrorAction Stop
} -ArgumentList ($this.EndpointName), $psscPath | Wait-Job -Timeout ($this.HungRegistrationTimeout) | Remove-Job -Force -ErrorAction SilentlyContinue
#Note: above I used the "ArgumentList" rather than "$using:" because I don't know if "$using:this.EndpointName" will work
#if WinRM is still stopping after the job has completed / exceeded $this.HungRegistrationTimeout, force kill the underlying WinRM process
#Get-Service reports a service that is stopping as 'StopPending'
if ((Get-Service -Name WinRM).Status -ieq 'StopPending') {
$id = Get-WmiObject -Class Win32_Service -Filter "Name LIKE 'WinRM'" | Select-Object -ExpandProperty ProcessId
Stop-Process -Id $id -Force
}
#if stopped try to start WinRM, with $this.MaximumWinRMStartRetry retries
[int]$tryCount = 0
while (((Get-Service -Name WinRM).Status -ieq 'Stopped') -and ($tryCount -le $this.MaximumWinRMStartRetry))
{
Write-Verbose -Message 'Starting WinRM service'
Start-Service -Name WinRM
Start-Sleep -Seconds 1
#count the attempt so the loop ends after the configured number of retries
$tryCount++
}
## Enable PowerShell logging on the system
$basePath = "HKLM:\Software\Policies\Microsoft\Windows\PowerShell\ScriptBlockLogging"
if(-not (Test-Path $basePath))
{
$null = New-Item $basePath -Force
}
Set-ItemProperty $basePath -Name EnableScriptBlockLogging -Value "1"
}
finally
{
Remove-Item $psscPath
}
}
# Tests if the resource is in the desired state.
[bool] Test()
{
$currentInstance = $this.Get()
## If this was configured with our mandatory property (RoleDefinitions), dig deeper
if($currentInstance.RoleDefinitions)
{
if($currentInstance.EndpointName -ne $this.EndpointName)
{
Write-Verbose "EndpointName not equal: $($currentInstance.EndpointName)"
return $false
}
## Convert the RoleDefinitions string to the actual Hashtable
$roleDefinitionsHash = $this.ConvertStringToHashtable($this.RoleDefinitions)
Write-Verbose ($currentInstance.RoleDefinitions.GetType())
if(-not $this.ComplexObjectsEqual($this.ConvertStringToHashtable($currentInstance.RoleDefinitions), $roleDefinitionsHash))
{
Write-Verbose "RoleDefinitions not equal: $($currentInstance.RoleDefinitions)"
return $false
}
if(-not $this.ComplexObjectsEqual($currentInstance.RunAsVirtualAccountGroups, $this.RunAsVirtualAccountGroups))
{
Write-Verbose "RunAsVirtualAccountGroups not equal: $(ConvertTo-Json $currentInstance.RunAsVirtualAccountGroups -Depth 100)"
return $false
}
if($currentInstance.GroupManagedServiceAccount -ne ($this.GroupManagedServiceAccount -replace '\$$', ''))
{
Write-Verbose "GroupManagedServiceAccount not equal: $($currentInstance.GroupManagedServiceAccount)"
return $false
}
if($currentInstance.TranscriptDirectory -ne $this.TranscriptDirectory)
{
Write-Verbose "TranscriptDirectory not equal: $($currentInstance.TranscriptDirectory)"
return $false
}
if(-not $this.ComplexObjectsEqual($currentInstance.ScriptsToProcess, $this.ScriptsToProcess))
{
Write-Verbose "ScriptsToProcess not equal: $(ConvertTo-Json $currentInstance.ScriptsToProcess -Depth 100)"
return $false
}
if($currentInstance.MountUserDrive -ne $this.MountUserDrive)
{
Write-Verbose "MountUserDrive not equal: $($currentInstance.MountUserDrive)"
return $false
}
if($currentInstance.UserDriveMaximumSize -ne $this.UserDriveMaximumSize)
{
Write-Verbose "UserDriveMaximumSize not equal: $($currentInstance.UserDriveMaximumSize)"
return $false
}
# Check for null required groups
$requiredGroupsHash = $this.ConvertStringToHashtable($this.RequiredGroups)
if(-not $this.ComplexObjectsEqual($this.ConvertStringToHashtable($currentInstance.RequiredGroups), $requiredGroupsHash))
{
Write-Verbose "RequiredGroups not equal: $(ConvertTo-Json $currentInstance.RequiredGroups -Depth 100)"
return $false
}
return $true
}
else
{
return $false
}
}
## A simple comparison for complex objects used in JEA configurations.
## We don't need anything extensive, as we should be the only ones changing
## them.
hidden [bool] ComplexObjectsEqual($object1, $object2)
{
$json1 = ConvertTo-Json -InputObject $object1 -Depth 100
Write-Verbose "Argument1: $json1"
$json2 = ConvertTo-Json -InputObject $object2 -Depth 100
Write-Verbose "Argument2: $json2"
return ($json1 -eq $json2)
}
## Convert a string representing a Hashtable into a Hashtable
hidden [Hashtable] ConvertStringToHashtable($hashtableAsString)
{
if ($hashtableAsString -eq $null)
{
$hashtableAsString = '@{}'
}
$ast = [System.Management.Automation.Language.Parser]::ParseInput($hashtableAsString, [ref] $null, [ref] $null)
$data = $ast.Find( { $args[0] -is [System.Management.Automation.Language.HashtableAst] }, $false )
return [Hashtable] $data.SafeGetValue()
}
# Gets the resource's current state.
[JeaEndpoint] Get()
{
$returnObject = New-Object JeaEndpoint
$sessionConfiguration = $null
[int]$tryCount = 0
while (((Get-Service -Name WinRM).Status -ine 'Running') -and ($tryCount -le 10))
{
Write-Verbose -Message 'Starting WinRM service'
Start-Service -Name WinRM
Start-Sleep -Seconds 1
#count the attempt so the loop cannot spin forever
$tryCount++
}
$winRMService = Get-Service -Name WinRM
if (($winRMService -ne $null) -and ($winRMService.Status -ieq 'running')) {
#This code will fail if winrm not running
$sessionConfiguration = Get-PSSessionConfiguration -Name ($this.EndpointName + "*") |
Where-Object Name -eq $this.EndpointName
}
if((-not $sessionConfiguration) -or (-not $sessionConfiguration.ConfigFilePath))
{
return $returnObject
}
else
{
$configFileArguments = Import-PowerShellDataFile $sessionConfiguration.ConfigFilePath
$rawConfigFileAst = [System.Management.Automation.Language.Parser]::ParseFile($sessionConfiguration.ConfigFilePath, [ref] $null, [ref] $null)
$rawConfigFileArguments = $rawConfigFileAst.Find( { $args[0] -is [System.Management.Automation.Language.HashtableAst] }, $false )
$returnObject.EndpointName = $sessionConfiguration.Name
## Convert the hashtable to a string, as that is the input format required by DSC
$returnObject.RoleDefinitions = $rawConfigFileArguments.KeyValuePairs | Where-Object { $_.Item1.Extent.Text -eq 'RoleDefinitions' } | ForEach-Object { $_.Item2.Extent.Text }
if($sessionConfiguration.RunAsVirtualAccountGroups)
{
$returnObject.RunAsVirtualAccountGroups = $sessionConfiguration.RunAsVirtualAccountGroups -split ';'
}
if($sessionConfiguration.GroupManagedServiceAccount)
{
$returnObject.GroupManagedServiceAccount = $sessionConfiguration.GroupManagedServiceAccount
}
if($configFileArguments.TranscriptDirectory)
{
$returnObject.TranscriptDirectory = $configFileArguments.TranscriptDirectory
}
if($configFileArguments.ScriptsToProcess)
{
$returnObject.ScriptsToProcess = $configFileArguments.ScriptsToProcess
}
if($configFileArguments.MountUserDrive)
{
$returnObject.MountUserDrive = $configFileArguments.MountUserDrive
}
if($configFileArguments.UserDriveMaximumSize)
{
$returnObject.UserDriveMaximumSize = $configFileArguments.UserDriveMaximumSize
}
if($configFileArguments.RequiredGroups)
{
$returnObject.RequiredGroups = $rawConfigFileArguments.KeyValuePairs | Where-Object { $_.Item1.Extent.Text -eq 'RequiredGroups' } | ForEach-Object { $_.Item2.Extent.Text }
}
return $returnObject
}
}
}
Hi all, This is my proposal of a workaround for this bug: https://github.com/jnury/JEA/blob/issue%2330/DSC%20Resource/JustEnoughAdministration/JustEnoughAdministration.psm1
I've done a 'lot' of refactoring and would appreciate a code review before filing a PR ;-)
This is what I've done:
- implementing proposal from @djwork (with some small corrections)
- adding restart of services that share the same process as WinRM
- adding WinRM status verification before each call to xxx-PSSessionConfiguration
- improving Verbose messages
@manojampalam It is a shame you have to do this. It would be better for the WinRM service to restart rather than hang on stopping. It might be due to the service host process not being able to restart for some reason. If this is the case then you should be able to ensure WinRM always runs in its own process. I am not familiar with WinRM, but Manoj may be able to help.
We are looking into this now and will follow up.
Hello @manojampalam, any news ?
The workaround shipped in PR #46 is really 'heavy' ... but it was triggered on half of my last deployments, so it's really useful.
Hope you can fix the WinRM restart problem directly in the WinRM service and we can remove the workaround some day.
If it helps: on some of my hosts, it seems that the LanmanWorkstation service (which is co-hosted in the same process as WinRM) ended on an error while WinRM restarted after Register-PSSessionConfiguration.
Hello guys. @rpsqrd: have you been able to have a look at PR #46? @manojampalam: have you found something in WinRM?
Hi jnury, I am Chenming YU; I work with Manoj on the WinRM area at Microsoft. Based on the symptoms of your situation, I suspect it is similar to a past case in which the pending WinRM action hung in the dsctimer wakeup activities (hosted inside the WinRM service). If so, the workaround is to call "Start-Dsc*" in either of these ways: 1) with -Force (to make sure that any deadlock between WinRM and WMI is broken by cancelling the existing operation); 2) using the DCOM protocol instead of the WinRM protocol, to avoid errors while WinRM is transitioning through the start-stop-start states.
To check whether the dsctimer wakeup is activated within WinRM:
- check whether the registry value below is 1: HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System:DSCAutomationHostEnabled
- check whether the following files exist under %windir%\system32\Configuration: "MetaConfig.mof", "Pending.mof"
If it still reproduces with those workarounds, please send me a memory dump of the hung WinRM service (svchost.exe; in Task Manager, select the process, then "Create dump file").
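For anyone wanting to run the two checks above in one go, here is a minimal sketch. It only reads state and assumes it is run on the affected Windows host; the variable names are illustrative, while the registry value and file names are the ones given in the comment above.

```powershell
# Read-only check of the dsctimer indicators described above.
$policyKey = 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System'
$configDir = Join-Path $env:windir 'System32\Configuration'

# Returns nothing (and no error is shown) when the value is not set.
$policy = Get-ItemProperty -Path $policyKey -Name 'DSCAutomationHostEnabled' -ErrorAction SilentlyContinue
$hostEnabled = if ($policy) { $policy.DSCAutomationHostEnabled } else { $null }

[pscustomobject]@{
    DSCAutomationHostEnabled = $hostEnabled
    MetaConfigMofExists      = Test-Path (Join-Path $configDir 'MetaConfig.mof')
    PendingMofExists         = Test-Path (Join-Path $configDir 'Pending.mof')
}
```

An empty DSCAutomationHostEnabled in the output means the value is not present under that key.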
Hi @cmyu-gh, it seems I missed your answer, sorry for that.
I'm not able to use Start-DSC* with -Force as I use the Pull mode, so the configuration is automatically triggered.
Will the second option apply to Pull mode as well or is it only for the Start-DSC* commands ?
Sorry about the late response. In your situation, can you check the registry value below on the machines that reproduce the problem: HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System:DSCAutomationHostEnabled
If it is '1', or '2' with %windir%\system32\Configuration\*.mof files present, that would support my suspicion about the dsctimer plugin of WinRM. Otherwise, can you forward me a dump of the hung WinRM service for further analysis?
Since you use pull mode, there is a workaround: stop the dsctimer plugin within WinRM directly (it takes effect after Restart-Service WinRM):
- set HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System:DSCAutomationHostEnabled to 0
- after restarting the WinRM service, retry your execution.
Please ignore the second option I posted before: it forces the CIM session protocol to DCOM instead of WinRM, which is a switch available in some management cmdlets.
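The registry workaround above might look like this in PowerShell. It is a sketch only, assuming an elevated session and that the value is a DWORD (the thread does not state its type). It changes a machine-wide policy value, and a later comment in this thread reports problems applying DSC settings after a reboot with the value set to 0, so it is worth trying on a test host first.

```powershell
# Disable the dsctimer wakeup hosted in WinRM, then restart WinRM so the change takes effect.
$policyKey = 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System'

# -Type DWord is an assumption about the value type.
Set-ItemProperty -Path $policyKey -Name 'DSCAutomationHostEnabled' -Value 0 -Type DWord

Restart-Service -Name WinRM -Force
```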
Note, this is still an issue on PowerShell 5.1 on Windows 2019 and the workaround of setting HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System:DSCAutomationHostEnabled to 0 seems to cause problems continuing to apply DSC settings after a reboot.