Root Cause(s) Analysis for Missing Edges on YJ's Windows VM

Two types of edges are missing on Yang Jun's (YJ's) Windows VM, yet I can't reproduce the problem on my Win10-Home & Win10-Pro test machines.
In summary, mismatches in 2 fields caused the CheckSpoof function to return early at line 5, so the edges related to spoofing detection were never created.

SpoofedParentProcess Edge

Hostname case mismatched

This pertains to the edge class SpoofedParentProcess. Even in the event that the True-Parent ProcessCreate is somehow missing, we should still be linking this edge, & it should be reflected in the Investigation Board. However, it was totally missing in YJ's case, which deserves further troubleshooting. From below, we can see that ProcessGuid matches but Hostname does NOT. All string matching is CASE-SENSITIVE, & so are the class-property names.
notion image
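To make the mismatch concrete, here is a minimal Python sketch of a case-sensitive match on Hostname and ProcessGuid. It is not the actual ODB function: the record layout, the hostname and the GUID values are invented for illustration, and the only assumption carried over from the note is that both fields are compared as exact strings.

```python
# Hypothetical records; all values are invented for illustration only.
spoof_record = {
    "Hostname": "DESKTOP-ABC123",
    "ProcessGuid": "{1F2E3D4C-0000-0000-0000-000000000001}",
}
process_create = {
    "Hostname": "desktop-abc123",  # same machine, different letter case
    "ProcessGuid": "{1F2E3D4C-0000-0000-0000-000000000001}",
}


def exact_match(a, b):
    """Case-sensitive comparison on both keys, like a plain string equality."""
    return a["Hostname"] == b["Hostname"] and a["ProcessGuid"] == b["ProcessGuid"]


guid_matches = spoof_record["ProcessGuid"] == process_create["ProcessGuid"]
record_matches = exact_match(spoof_record, process_create)
print("ProcessGuid matches:", guid_matches)
print("Whole record matches:", record_matches)
```

With the GUIDs equal but the hostnames differing only in case, the combined match fails, which is the situation a lookup keyed on both fields would run into.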
Most of the time, the SpoofParentProcessId record enters ODB first, followed by the ProcessCreate record, because of the latency of Nxlog rotation. This triggers the following ODB function right after the spoofed ProcessCreate is inserted:
notion image

ToBeProcessed field is already false by the time of Spoof match-retrieval

Among the 4 select conditions shown above, the match on ToBeProcessed = true can also fail, since ProcessCreate.ToBeProcessed may already have been set to false asynchronously by the ConnectParentProcess ODB function.
notion image
 
Since cmd.exe & powershell.exe have a spoofed parent of Explorer.exe, the Explorer.exe record had no issue with ParentOf edge linking, which set ToBeProcessed to false. This ends up as a non-deterministic timing problem. On my 2 endpoints, the timing was in my favour, thus no problem was observed. Not so for YJ's VM.
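The timing dependence can be sketched as a small Python simulation. This is a simplified model under stated assumptions, not the real ODB code: `check_spoof` and `connect_parent_process` are hypothetical stand-ins, the asynchronous execution is modelled as two possible orderings, and the only behaviour taken from the note is that ParentOf linking sets ToBeProcessed to false while the spoof lookup selects on ToBeProcessed = true.

```python
def check_spoof(process_create, require_to_be_processed):
    """Return True if the spoof edge would be linked for this record."""
    if require_to_be_processed and not process_create["ToBeProcessed"]:
        return False  # the select finds nothing, so the function returns early
    return True


def connect_parent_process(process_create):
    """Stand-in for the ParentOf linking, which clears the flag."""
    process_create["ToBeProcessed"] = False


def run(order, require_to_be_processed):
    """Apply the two steps in the given order and report whether the edge was linked."""
    record = {"Image": "cmd.exe", "ToBeProcessed": True}
    linked = False
    for step in order:
        if step == "connect":
            connect_parent_process(record)
        elif step == "spoof":
            linked = check_spoof(record, require_to_be_processed)
    return linked


spoof_first = run(["spoof", "connect"], require_to_be_processed=True)
connect_first = run(["connect", "spoof"], require_to_be_processed=True)
connect_first_fixed = run(["connect", "spoof"], require_to_be_processed=False)
print(spoof_first, connect_first, connect_first_fixed)
```

In this model the edge is lost only when the ParentOf linking wins the race, and dropping the flag from the match conditions makes the outcome independent of the ordering.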
 
notion image
 

TrueParentOf Edge

SVCHOST.exe is missing

When we run as admin, e.g. for cmd.exe or powershell.exe, the true parent is a SVCHOST.exe. There are many SVCHOST processes, but they differ in their CommandLine start-up parameters:
notion image
 
  • This was observed on my machines too: a number of SVCHOST.exe records are missing. We can infer that from occurrences of "recoverSeq|...." in the ODB console output. Most of the time, we are able to recover the sequence. In YJ's case, this specific SVCHOST.exe is ALWAYS missing. Even so, this only affects the TrueParentOf edge, & not the SpoofedParentProcess edge. So this is a compound problem that needs to be divided-&-conquered.
  • Upgrading to Sysmon v13.x reduced the occurrences, but it is still not perfect. Will remove ProcessCreate filtering conditions from smconfig.xml.
  • In very bad cases, services.exe & wininit.exe are always missing. A way to check is to count the number of smss.exe records with ParentImage = 'System', then count the number of services.exe records; they should match.
  • This is very much a Sysmon problem. Another hypothesis is that the MORE filtering conditions smconfig.xml has, the more load it adds to the process. Booting up is quite an intensive process, & time-critical recording of events may be affected by too much filtering. That is why I proposed a reduction of ProcessCreate filters, which are minimal after adapting from the SwiftOnSecurity community version of the Sysmon config: https://github.com/SwiftOnSecurity/sysmon-config/blob/master/sysmonconfig-export.xml
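The smss.exe versus services.exe count check from the list above can be sketched in Python as follows. This assumes the ProcessCreate records have been exported as dictionaries with Image and ParentImage fields; the helper name and the sample data are hypothetical, and the real check would be a pair of count queries against ODB.

```python
def early_process_counts(events):
    """Count smss.exe records whose parent is System, and services.exe records."""
    smss = sum(
        1
        for e in events
        if e["Image"].lower().endswith("\\smss.exe") and e["ParentImage"] == "System"
    )
    services = sum(1 for e in events if e["Image"].lower().endswith("\\services.exe"))
    return smss, services


# Invented sample: two boots were recorded, but services.exe was captured only once.
sample_events = [
    {"Image": "C:\\Windows\\System32\\smss.exe", "ParentImage": "System"},
    {"Image": "C:\\Windows\\System32\\services.exe", "ParentImage": "C:\\Windows\\System32\\wininit.exe"},
    {"Image": "C:\\Windows\\System32\\smss.exe", "ParentImage": "System"},
]

smss_count, services_count = early_process_counts(sample_events)
missing = smss_count - services_count
print(f"smss.exe (parent System): {smss_count}, services.exe: {services_count}, missing: {missing}")
```

A non-zero difference suggests that at least one boot lost its services.exe ProcessCreate event.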

Mitigations & Fixes

  1. To mitigate missing "early" processes (by early we mean those starting before Explorer.exe), we should turn on auditing of Event ID 4688 with recording of CommandLine parameters.
  2. Test removing ProcessCreate filtering from the Sysmon configuration.
  3. Upper-case Hostname before inserts; this is already done for all ProcessGuid fields & variants, e.g. SourceProcessGuid. There is no point selecting with the ToUpper() function since it negates the benefits of indexing.
  4. Remove the ToBeProcessed = true condition from CheckSpoof. Interesting that I did not add that condition in the SpoofParentProcessId function. I am not perfect, heh!
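The Hostname fix can be sketched in Python as normalising at insert time, so that lookups stay plain exact-match index hits. This is an illustration only: the in-memory dictionary stands in for the ODB index, `add_event` here is a hypothetical stand-in rather than the real AddEvent, and the list of GUID fields is an assumption based on common Sysmon field names.

```python
# Fields to upper-case before insert; the GUID variants listed are assumptions.
UPPERCASE_FIELDS = (
    "Hostname",
    "ProcessGuid",
    "ParentProcessGuid",
    "SourceProcessGuid",
    "TargetProcessGuid",
)

index = {}  # stand-in for an index on (Hostname, ProcessGuid)


def normalize_event(event):
    """Return a copy of the event with the key fields upper-cased."""
    normalized = dict(event)
    for field in UPPERCASE_FIELDS:
        value = normalized.get(field)
        if isinstance(value, str):
            normalized[field] = value.upper()
    return normalized


def add_event(event):
    """Normalise once on the write path, then store under the exact key."""
    normalized = normalize_event(event)
    index[(normalized["Hostname"], normalized["ProcessGuid"])] = normalized
    return normalized


def find_process_create(record):
    """Exact-match lookup using another already-normalised record's keys."""
    return index.get((record["Hostname"], record["ProcessGuid"]))


stored = add_event({"Hostname": "desktop-abc123", "ProcessGuid": "{ab12cd34-0000-0000-0000-000000000001}"})
spoof = normalize_event({"Hostname": "DESKTOP-ABC123", "ProcessGuid": "{AB12CD34-0000-0000-0000-000000000001}"})
found = find_process_create(spoof)
print(found is stored)
```

Because every record is normalised before it is stored, the lookup never needs to transform the stored values, which is what keeps the index usable.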
 

Follow-up

  • Removed ToBeProcessed = true condition
  • Added Hostname upper-casing within AddEvent
  • Updated CSG1 cloud OpenEDR instance
 

Sysmon Filtering

Remove ProcessCreate filtering

  • removed PC (ProcessCreate) filtering on Win10 dev machine; still missing early processes
  • removed networkConnect filtering since it is quite a large set of rules; still missing, but sequence recovered
  • removed MORE explicit include rules; noticed A LOT more event files rotated! After reboot, easily hit a backlog of 650+ event files
  • no more missing early processes except 1 related to zerotier
  • networkConnect & registryEvents volumes are huge!
https://gist.github.com/jymcheong/0ec2ae2a729d4474331d6a64feb68bc3 has a very "tight" set of include conditions. If an attacker chooses a tool that is not in that list, then networkConnect events for that tool will not be recorded.
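The blind spot of a tight include list can be illustrated with a simplified Python model. It assumes include rules that match on the end of the image path; the rule list and the tool path are invented and are not taken from the gist, and real Sysmon rule evaluation is richer than this.

```python
def is_recorded(image, include_suffixes):
    """Simplified include-only filter: record the event only if a rule matches."""
    image = image.lower()
    return any(image.endswith(suffix.lower()) for suffix in include_suffixes)


# Hypothetical include list, for illustration only.
include_rules = ["\\powershell.exe", "\\cmd.exe", "\\certutil.exe"]

listed_tool = is_recorded("C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe", include_rules)
unlisted_tool = is_recorded("C:\\Users\\Public\\custom_tool.exe", include_rules)
print("listed tool recorded:", listed_tool)
print("unlisted tool recorded:", unlisted_tool)
```

Any binary outside the list produces no networkConnect record in this model, which is the trade-off against the much larger event volume seen when the include rules are removed.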