Accept events which fail to parse when 'RenderedText' used
There seem to be instances when if --content-format RenderedText is used in a subscription, windows doesn't send properly formatted XML messages. This in turn causes oenwec to return HTTP error 500 back to the source as it fails to parse the XML (as expected) :
I double checked and I very much doubt the malformed XML is an openwec problem (i'd be happy to hear otherwise), but if you turn on tracing you'll observe that the underlying hyper server reports receiving a malformed payload - which happens before openwec does much processing:
In the above screenshot note how the event is sent with a truncated RenderingInfo XML tag, though it is correct encapsulated in a SOAP event... again leading me to suspect this is a windows problem rather than openwec. This doesnt happen with all events. The same source on the same subscription will send properly formatted events - the issue seems related to those events which contain tabs/newlines/etc in their rendered info (such as most security events)
In this kind of situation, it might be beneficial to accept the event anyway but mark it as malformed, while saving the original content. Like that the source can keep on sending events (rather than stopping at an HTTP 500), and admins can troubleshoot why an event was not accepted.
I realise this is not ideal in all environments since it depends on your use case... in which case maybe the behavior can be controlled by a CLI flag?
I am using this branch to address the above (though the behavior is hardcoded rather than behind a flag as I suggested): https://github.com/CyberSift/openwec/tree/flexible_error
In this branch if the event contains a malformed event it is anyway stored, but we modified the fields stored to contain an error message and the original event content, for example, note the new Error field containing the ErrorMessage and OriginalContent fields:
{
"System":{
"Provider":{
},
"EventID":0,
"Computer":""
},
"OpenWEC":{
"IpAddress":"127.0.0.1",
"TimeReceived":"2023-09-29T07:38:13.575037814+00:00",
"Principal":"demo-client",
"Subscription":{
"Uuid":"91E05B32-F8F6-48CF-8AB4-4038233B83AC",
"Version":"D5FB6FFA-F3FB-4B0D-9655-398EA6114D0F",
"Name":"my-test-subscription"
},
"Errors":{
"OriginalContent":"<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Provider Name='Microsoft-Windows-Security-Auditing' Guid='{54849625-5478-4994-a5ba-3e3b0328c30d}'/><EventID>5061</EventID><Version>0</Version><Level>0</Level><Task>12290</Task><Opcode>0</Opcode><Keywords>0x8020000000000000</Keywords><TimeCreated SystemTime='2023-09-29T07:25:36.3365627Z'/><EventRecordID>978308</EventRecordID><Correlation ActivityID='{f59bb999-ec5b-0008-f6b9-9bf55becd901}'/><Execution ProcessID='1440' ThreadID='16716'/><Channel>Security</Channel><Computer>xxxxxxxx</Computer><Security/></System><EventData><Data Name='SubjectUserSid'>S-1-5-20</Data><Data Name='SubjectUserName'>xxxxxxxx$</Data><Data Name='SubjectDomainName'>WORKGROUP</Data><Data Name='SubjectLogonId'>0x3e4</Data><Data Name='ProviderName'>Microsoft Software Key Storage Provider</Data><Data Name='AlgorithmName'>ECDH_P384</Data><Data Name='KeyName'>{6DF73E17-F1C9-481A-9B2F-8B16352E5D6E}</Data><Data Name='KeyType'>%%2499</Data><Data Name='Operation'>%%2480</Data><Data Name='ReturnCode'>0x0</Data></EventData><RenderingInfo Culture='en-US'><Message>Cryptographic operation. Subject: 	",
"ErrorMessage":"Failed to parse event XML"
}
}
}
P.S. if anyone can point me in the right direction to actually get WEF to send me full events rather than truncated events I'd be very grateful π
Hi @dvas0004!
I haven't come across such a situation :disappointed: . I do agree that it seems to be more a problem with the Windows event logging system rather than openwec. To confirm this statement and perhaps try to diagnose the problem:
- Did you try to look at the problematic event(s) in Windows Event Viewer?
- Did you try to look at the problematic event(s) directly in .evtx files using a parser such as evtx_dump?
- What version of Windows are you running, with what language packs installed?
- Have you been able to reproduce the problem on other computers?
- Did you search for an event in other channels (such as Applications.../Windows/Eventlog) that might be related to this problem?
As you said, not being able to parse an event will "block" the events stream from that machine for this subscription, which is undesirable in most cases. In fact, except for debugging openwec, I can't find any real-world use cases where the current behavior is better than what you're suggesting. I don't think that the event forwarder is going to "correct" the event it's trying to send because it gets a 500... So I don't even think it's necessary to make it configurable.
If you are willing to create a PR, I would rename Errors to Error since it can only represent one error, and also rename ErrorMessage to Message as it is already in an Error object. Also, in case of an error, I think it would be better not to send the System object at all instead of sending it almost empty... What do you think?
By the way, I would recommend using the Raw content format which seems to be more reliable.
- Did you try to look at the problematic event(s) in Windows Event Viewer?
It appears rendered properly:
- Did you try to look at the problematic event(s) directly in .evtx files using a parser such as evtx_dump?
(PS thanks didnt know about this util! π) They are shown correct but without the RenderInfo field... same as when we use OpenWEC's Raw content type. So probably something is going wrong when the windows event forwarding plugin tries to render the event, then wrap it in a SOAP message for forwarding...
- What version of Windows are you running, with what language packs installed?
windows 11 with US lang. What may be pertinent is that it's windows "home" edition and not joined to a domain
- Have you been able to reproduce the problem on other computers?
i will have the opportunity of testing this later.. at the moment this happens on just one PC (my dev laptop)
- Did you search for an event in other channels (such as Applications.../Windows/Eventlog) that might be related to this problem?
no unexpected error messages
If you are willing to create a PR, I would rename
ErrorstoErrorsince it can only represent one error, and also renameErrorMessagetoMessageas it is already in anErrorobject. Also, in case of an error, I think it would be better not to send theSystemobject at all instead of sending it almost empty... What do you think?
I agree... I modified the System struct to be Option<System>, but I also tried to make a "best effort". Seeing as that you (and I for that matter) have never seen this behavior except for this particular case, I took a stab at "recovering" the event as you can see here:
https://github.com/CyberSift/openwec/blob/ca7e6b5faf0d622d2018fcc3a0c528f544724e07/server/src/event.rs#L130-L154
The resulting "bad" event looks like this after testing:
{
"System":{
"Provider":{
"Name":"Microsoft-Windows-Security-Auditing",
"Guid":"{54849625-5478-4994-a5ba-3e3b0328c30d}"
},
"EventID":4798,
"Version":0,
"Level":0,
"Task":13824,
"Opcode":0,
"Keywords":"0x8020000000000000",
"TimeCreated":"2023-09-29T13:39:08.7234692Z",
"EventRecordID":980236,
"Correlation":{
"ActivityID":"{f59bb999-ec5b-0008-f6b9-9bf55becd901}"
},
"Execution":{
"ProcessID":1440,
"ThreadID":16952
},
"Channel":"Security",
"Computer":"xxxx_xps"
},
"EventData":{
"SubjectLogonId":"0x3e7",
"TargetDomainName":"xxxxx_xps",
"CallerProcessId":"0x28d4",
"CallerProcessName":"C:\\\\Windows\\\\System32\\\\svchost.exe",
"TargetUserName":"davev",
"SubjectDomainName":"WORKGROUP",
"SubjectUserName":"xxxxx_XPS$",
"TargetSid":"S-1-5-21-1604529354-1295832394-4197355770-1001",
"SubjectUserSid":"S-1-5-18"
},
"OpenWEC":{
"IpAddress":"127.0.0.1",
"TimeReceived":"2023-09-29T14:33:12.574363325+00:00",
"Principal":"demo-client",
"Subscription":{
"Uuid":"91E05B32-F8F6-48CF-8AB4-4038233B83AC",
"Version":"523D1886-E73E-4A96-A95D-F0326CB282F0",
"Name":"my-test-subscription"
},
"Errors":{
"OriginalContent":"<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Provider Name='Microsoft-Windows-Security-Auditing' Guid='{54849625-5478-4994-a5ba-3e3b0328c30d}'/><EventID>4798</EventID><Version>0</Version><Level>0</Level><Task>13824</Task><Opcode>0</Opcode><Keywords>0x8020000000000000</Keywords><TimeCreated SystemTime='2023-09-29T13:39:08.7234692Z'/><EventRecordID>980236</EventRecordID><Correlation ActivityID='{f59bb999-ec5b-0008-f6b9-9bf55becd901}'/><Execution ProcessID='1440' ThreadID='16952'/><Channel>Security</Channel><Computer>dvas0004_xps</Computer><Security/></System><EventData><Data Name='TargetUserName'>davev</Data><Data Name='TargetDomainName'>xxxxx_xps</Data><Data Name='TargetSid'>S-1-5-21-1604529354-1295832394-4197355770-1001</Data><Data Name='SubjectUserSid'>S-1-5-18</Data><Data Name='SubjectUserName'>xxxxx_XPS$</Data><Data Name='SubjectDomainName'>WORKGROUP</Data><Data Name='SubjectLogonId'>0x3e7</Data><Data Name='CallerProcessId'>0x28d4</Data><Data Name='CallerProcessName'>C:\\\\Windows\\\\System32\\\\svchost.exe</Data></EventData><RenderingInfo Culture='en-US'><Message>A user's local group membership was enumerated. Subject: 	",
"ErrorMessage":"Failed to parse event XML"
}
}
}
if you like this approach i'll issue a PR next week πotherwise i'll revert the behavior to not sending an empty System struct
By the way, I would recommend using the
Rawcontent format which seems to be more reliable.
you are very right... though after a very quick survey of SOC analyst colleagues, they find the rendered field very useful, and they would like to keep it if possible....
It appears rendered properly:
The fact that the Event Viewer formats it correctly is unexpected. I would assume that the code that builds RenderingInfo would be the same in both cases (event viewer and event forwarding), but maybe it isn't, or the problem comes from somewhere else...
windows 11 with US lang. What may be pertinent is that it's windows "home" edition and not joined to a domain
I installed a Windows 11 Home VM, version 22H2 (build 22621.1413) with the US language pack, but I couldn't reproduce the problem...
- Can you share your Windows build number?
- I assume that you are using TLS encryption?
- Can you share your subscription settings (the output of
openwec subscriptions show <name>)?
I agree... I modified the System struct to be Option<System>, but I also tried to make a "best effort". Seeing as that you (and I for that matter) have never seen this behavior except for this particular case, I took a stab at "recovering" the event as you can see here
The "best effort" approach is interesting, especially if the problem occurs regularly (does it?). The Event schema specifies that RenderingInfo must be the last "specified" node of an Event (see event.xsd), so we shouldn't miss anything.
However, I think the implementation can be improved:
- I would prefer to use
rsplit_once(https://doc.rust-lang.org/std/string/struct.String.html#method.rsplit_once) to split on the last occurrence of<RenderingInfo, even if it should be the only one (if some other fields contain a"<RenderingInfo"string, it would be encoded as<RenderingInfo). With this in mind, I'm not sure about callingEvent::from_strrecursively since we should only try it once. - Comments should be added to explain why we need to do this sometimes
- Some debug logging messages would be a great addition
- We should add some unit tests to cover these cases
What do you think of the following implementation: https://github.com/vruello/openwec/commit/92f20d374c5225ddbae838d3d025ae5bfae95ff2? I've used the data you provided to create test cases, I hope it's ok for you.
Can you share your Windows build number?
Edition Windows 11 Home Version 22H2 OS build 22621.2283
I assume that you are using TLS encryption?
That's right! Though I doubt the issue is here, as there is no indication of HTTP(S) level issues... the event isnt exactly truncated, only the RenderingInfo field is truncated and then wrapped in valid SOAP
Can you share your subscription settings (the output of openwec subscriptions show
)?
./target/release/openwec -c /etc/openwec.toml subscriptions show my-test-subscription
Subscription my-test-subscription
UUID: 91E05B32-F8F6-48CF-8AB4-4038233B83AC
Version: 523D1886-E73E-4A96-A95D-F0326CB282F0
URI: Not configured
Heartbeat interval: 3600s
Connection retry count: 5
Connection retry interval: 60s
Max time without heartbeat/events: 30s
Max envelope size: 512000 bytes
ReadExistingEvents: false
Content format: RenderedText
Ignore channel error: true
Principal filter: Not configured
Outputs:
- 0: Enabled: true, Format: Json, Output: Redis(RedisConfiguration { addr: "127.0.0.1:6377", list: "wec" })
- 1: Enabled: true, Format: Raw, Output: Files(FileConfiguration { base: "/data/logs", split_on_addr_index: None, append_node_name: false, filename: "messages" })
Enabled: true
Event filter query:
<!-- query.xml -->
<QueryList>
<Query Id="0">
<Select Path="Application">*</Select>
<Select Path="Security">*</Select>
<Select Path="Setup">*</Select>
<Select Path="System">*</Select>
</Query>
</QueryList>
Worth noting I see the same issue on both outputs (files and redis)...
What do you think of the following implementation: https://github.com/vruello/openwec/commit/92f20d374c5225ddbae838d3d025ae5bfae95ff2? I've used the data you provided to create test cases, I hope it's ok for you.
This is phenomenal - nicely done!!!! πͺ It is absolutely fine by me and if you need any other data or any contribution from me let me know!
The "best effort" approach is interesting, especially if the problem occurs regularly (does it?)
On my test setup, almost all the "security" event have this issue, but i suspect it's because these issues tend to have a lot of newlines and tabs in their rendered content. So it happens very often - in fact I came up my initial implementation of the "recoverable error" to see if it affected all event or just some.... (it affects some, but not all)
Could you enable the Applications and Services logs/Microsoft/Windows/Windows Remote Management/Analytic channel and look at the contents of events 771 (which contain the raw payload sent by the forwarder)? In particular, we want to make sure that it is already malformed here :)
This is a bit more work on your end, but it would be interesting to see what happens with an official Windows "wec" server. I would happily do it myself but for now I can't reproduce the problem...
Could you enable the Applications and Services logs/Microsoft/Windows/Windows Remote Management/Analytic channel and look at the contents of events 771 (which contain the raw payload sent by the forwarder)? In particular, we want to make sure that it is already malformed here :)
Confirming it's already malformed there:
This is a bit more work on your end, but it would be interesting to see what happens with an official Windows "wec" server. I would happily do it myself but for now I can't reproduce the problem...
time permitting i'll try this and let you know π
Hi @dvas0004! Have you found some time to try and reproduce this behavior with a Windows Server WEC?
hey @vruello !!!
unfortunately not ππ but please keep this issue on me and I will try especially if I manage to get a test setup up