AddEvent Throughput

Status: Completed

Background

  • insertEvent.js currently calls AddEvent, an ODB function that inserts a JSON event line.
  • ProcessCreate is the primary event of concern because all other types of Sysmon events are linked to it.

Goals

  1. Using the test function provided below, measure the insertion performance:
    // shortcut to get ticks in milliseconds since epoch
    var s = (new Date())*1
    print('start: ' + s)
    print(Date())
    for(var i=0;i < 10000;i++) test_ProcessCreate()
    var e = (new Date())*1
    print('end: ' + e)
    print(Date())
    var d = (e - s)/1000
    print('====')
    print('delta in secs ' + d)
    print('====')
test_ProcessCreate() → function that generates a self-created ProcessCreate event
Using the code above, we created a new function to measure how long it takes to insert 10000 ProcessCreate events.
Results:
First attempt: It took 100.776 seconds to insert 10000 ProcessCreate events.
Second attempt: It took 92 seconds to insert 10000 ProcessCreate events.
Third attempt: It took 96.509 seconds to insert 10000 ProcessCreate events.
On average, it takes around 96 seconds to insert 10000 ProcessCreate events.
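The three attempts can be averaged and converted into a throughput figure as follows. This is a minimal sketch that uses only the timings reported above; the variable names are illustrative and are not part of insertEvent.js, and console.log stands in for print() so the snippet also runs outside the ODB console.

```javascript
// Insertion times in seconds for 10000 ProcessCreate events (attempts 1 to 3)
var runSecs = [100.776, 92, 96.509];
var eventsPerRun = 10000;

// Sum the individual runs
var totalSecs = 0;
for (var i = 0; i < runSecs.length; i++) {
  totalSecs += runSecs[i];
}

var meanSecs = totalSecs / runSecs.length;   // about 96.4 seconds per run
var eventsPerSec = eventsPerRun / meanSecs;  // about 104 events per second

console.log('mean secs: ' + meanSecs.toFixed(1));
console.log('events/sec: ' + eventsPerSec.toFixed(1));
```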
*Note that the test_ProcessCreate() function always generates an Orphan ProcessCreate.
 
  2. Propose how to modify the test or other server functions to measure the total time taken for complete processing, i.e. until there are no more ProcessCreate events with ToBeProcessed = true, for both cases:
(a) Orphan ProcessCreate
(b) ProcessCreate that can be linked
    var db = orient.getDatabase();
    // shortcut to get ticks in milliseconds since epoch
    var s = (new Date())*1
    print('start: ' + s)
    print(Date())
    for(var i=0;i < 10000;i++){
      test_ProcessCreate()
    }
    var count = db.query('SELECT count() from ProcessCreate where ToBeProcessed = true')
    count = count[0]
    cleaned = count.toString().replace(/[{count():}]/g, "");
    // Only continue if all events have been processed
    while(cleaned > 0){
      var count = db.query('SELECT count() from ProcessCreate where ToBeProcessed = true')
      count = count[0]
      cleaned = count.toString().replace(/[{count():}]/g, "");
    }
    print("All events have been processed!")
    var e = (new Date())*1
    print('end: ' + e)
    print(Date())
    var d = (e - s)/1000
    print('====')
    print('delta in secs ' + d)
    print('====')
Next, I tested whether the modified script affects the time taken for insertion + processing of 150 linkable ProcessCreate events.

Performance of Original Script

Time taken for insertion: 2.914 seconds
Time taken for processing: 4 minutes 35 seconds
Total time: 4 minutes 38 seconds

Performance of Modified Script

Total time: 8 minutes 2 seconds
Even though the modified script can measure the combined insertion + processing time, the constant querying for the count makes processing take much longer: 8 minutes 2 seconds in total versus 4 minutes 38 seconds with the original script. One solution to this issue would be to query at fixed intervals instead.
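The fixed-interval idea could look roughly like the sketch below. It is an illustration only: waitUntilProcessed, countPending and sleep are hypothetical names, and both the pending-count query and the pause are passed in as functions. How to pause inside an ODB server-side function (for example java.lang.Thread.sleep, if the JS engine exposes Java classes) is an assumption that has not been verified here.

```javascript
// Hypothetical helper: poll the pending count at a fixed interval instead of
// in a tight loop, so the count query runs far less often.
//   countPending: function returning the number of events still to be processed
//   sleep:        function that pauses for the given number of milliseconds
//   maxPolls:     safety cap so the loop cannot run forever
function waitUntilProcessed(countPending, sleep, intervalMs, maxPolls) {
  var polls = 0;
  var pending = countPending();
  while (pending > 0 && polls < maxPolls) {
    sleep(intervalMs);          // wait before querying again
    pending = countPending();
    polls++;
  }
  return { pending: pending, polls: polls };
}

// Stand-ins for demonstration: the pending count drops by one per poll,
// and "sleeping" only records the requested time.
var remaining = 3;
var sleptMs = 0;
var result = waitUntilProcessed(
  function () { var c = remaining; if (remaining > 0) { remaining--; } return c; },
  function (ms) { sleptMs += ms; },
  5000,
  100
);
console.log('pending: ' + result.pending + ', polls: ' + result.polls);
```

Inside the ODB function, countPending would wrap the same SELECT count() query used in the modified script. A longer interval means fewer queries, at the cost of the measured end time overshooting by up to one interval.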

Orphan ProcessCreate Processing

*Note that I used the original script to measure the time taken for Orphan/Linkable ProcessCreate processing, i.e. I checked manually whether all events had been processed.
Insertion complete time: [screenshot]
Queried for count immediately after all events have been inserted: [screenshots]
Once a ProcessCreate event is assigned ProcessType Orphan, it is not processed any further, i.e. ToBeProcessed is already set to false.

Linkable ProcessCreate Processing

*I changed the ParentProcessGuid in the test_ProcessCreate function so that each event is linked to a previous ProcessCreate.
Insertion start time: [screenshot]
Insertion complete time: [screenshot]
It took 251.212 seconds to finish inserting 10000 linkable ProcessCreate events (excluding linking).
 
Fully processed time i.e. all events have been processed: Wed Jun 16 2021 13:41:08 GMT+0000 (UTC)
Time taken to process all events = 2 hours 12 minutes 36 seconds
As expected, inserting a large number of linked ProcessCreate events in one go makes processing slow. Watching the ODB console in real time, roughly one event gets processed per second.
 
Total time taken (Insertion time + Process time) = 4 minutes 11 seconds + 2 hours 12 minutes 36 seconds
= 2 hours 16 minutes 47 seconds
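The arithmetic above can be double-checked with a small helper. This is a sketch: toSeconds and formatDuration are hypothetical helper names, and the insertion time is taken as 4 minutes 11 seconds (251.212 seconds rounded down), as in the text.

```javascript
// Convert an hours/minutes/seconds triple into whole seconds
function toSeconds(hours, minutes, seconds) {
  return hours * 3600 + minutes * 60 + seconds;
}

// Render whole seconds in the same wording used in these notes
function formatDuration(totalSecs) {
  var hours = Math.floor(totalSecs / 3600);
  var minutes = Math.floor((totalSecs % 3600) / 60);
  var seconds = totalSecs % 60;
  return hours + ' hours ' + minutes + ' minutes ' + seconds + ' seconds';
}

var insertSecs = toSeconds(0, 4, 11);    // 251 seconds of insertion
var processSecs = toSeconds(2, 12, 36);  // 7956 seconds of processing
var totalText = formatDuration(insertSecs + processSecs);

// Average processing time per event for the 10000 linkable events
var secsPerEvent = processSecs / 10000;

console.log(totalText);
console.log(secsPerEvent.toFixed(2) + ' secs per event');
```

The per-event figure of roughly 0.8 seconds agrees with the rate observed on the ODB console.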
 

Jym's Comments

  • Ingestion or insertion into the queue does not necessarily lead to complete processing of a given set of events
  • Why 10,000? It's just an arbitrary number I picked... but we can be more precise:
From the first smss.exe to the user getting into the Explorer shell, there are 134 processes... for a typical Windows 10 endpoint
That's just ProcessCreate; there are a number of other types of Sysmon events that are triggered PER ProcessCreate.
So taking 10000/211 ≈ 47... about 47 endpoints booting up simultaneously is one way to create such a surge. But bear in mind that the timing results recorded are based on a backend running in a puny VM on YJ's Win10 Home laptop!
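The estimate can be parameterised so it is easy to redo for other surge sizes. A minimal sketch: endpointsForSurge is a hypothetical name, and 211 is taken as-is from the division above as the number of events one booting endpoint contributes.

```javascript
// Number of whole endpoints whose simultaneous boot adds up to the given surge
function endpointsForSurge(totalEvents, eventsPerEndpoint) {
  return Math.floor(totalEvents / eventsPerEndpoint);
}

var endpoints = endpointsForSurge(10000, 211);
console.log(endpoints + ' endpoints');
```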