Frontend Load Test


Status: Completed

Background

Meteor can be resource hungry. This happened during your pen-test track, when you were the only operator. That said, it is unlikely that a single Case can accumulate that many events unless you repeatedly "attack" the target.

Question

  • At what point do the Wekan boards stall?
The screenshot below is the resource usage statistics for Wekan when the card score was 300.
[Screenshot: Wekan resource usage statistics at a card score of 300]
When there is a low number of unclosed cases, the resource usage in Wekan is low.
However, loading issues start to occur when there is a large accumulation of unclosed cases. When the total score of the cards reached 22000 on my OpenEDR backend VM, the Wekan webpage got stuck in a loading loop when I attempted to refresh the page. It took around 10 minutes for the page to finish loading.
The screenshot below is the resource usage statistics for Wekan when it started to have loading issues.
[Screenshot: Wekan resource usage statistics when the loading issues started]
Comparing the resource usage statistics before and after the loading issues appeared shows a big difference in CPU and memory usage. In this test, loading issues started to occur when CPU usage reached 30% and memory usage reached 19%. The higher the number of unclosed cases, the more resources Wekan needs, which eventually causes the webpage to stop loading. The test exposes a flaw in Wekan: it suffers from performance issues even though CPU and memory usage are not very high.

Jym's Comments

Basis of Scoring

The higher the score, the more anomalies were sighted on a particular endpoint. In some SOCs, Sev1 (Severity 1) is the most serious or highest severity level. I personally find this confusing since, in natural language (at least in English), if we prioritise by increasing severity, why is 4 less severe than 1?
In my model, Stage 1 is the least worrying, e.g. Internet kiosks that reboot to a clean state, automatically triggered by DFPM.exe, hence the Wekan label 'rebooted' to indicate that the kiosk had rebooted. If we are watching Windows Servers and there is custom logic for, let's say, scanning, that kind of event will also be Stage 1.
The scoring mirrors the ALC Tactical Model, e.g. if we see a new driver file, and driver files are typically installed through privileged processes, then we say it is Severity-3 (defenders' POV) and Stage-3 (attackers' POV). A backend script assigns the per-anomaly score: https://github.com/jymcheong/OpenEDR/blob/78d52663064fea6a5aa0bd2600db733be5ede034/backend/startDetection.js#L4
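To make the scoring idea concrete, here is a minimal sketch of how a case score could be built up from per-anomaly scores. It is not the code in startDetection.js: the helper name case_score and the table STAGE_SCORES are hypothetical, and the only value taken from this write-up is the Stage-2 score of 20 per anomaly.

```python
from collections import Counter

# Only the Stage-2 value (20 per anomaly) is stated in this write-up.
# Scores for the other stages are defined in startDetection.js and are
# deliberately not guessed here.
STAGE_SCORES = {2: 20}


def case_score(anomaly_stages, stage_scores=STAGE_SCORES):
    """Sum the per-anomaly scores for one case (hypothetical helper)."""
    counts = Counter(anomaly_stages)
    unknown = set(counts) - set(stage_scores)
    if unknown:
        raise ValueError(f"no score configured for stages: {sorted(unknown)}")
    return sum(stage_scores[stage] * n for stage, n in counts.items())


# Fifteen Stage-2 anomalies on one endpoint.
print(case_score([2] * 15))  # 300
```

Under the assumption that every anomaly is Stage-2, a card score of 300 would correspond to 15 anomalies on that endpoint.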

"Threat" Score of 22000!!! is it possible?

Suppose we had a 100-endpoint network: a score of 22000 would mean an average of 220 points per endpoint. Stage-2 is set at 20 per anomaly, which means each endpoint has 11 Stage-2 anomalies if we distribute them evenly for the sake of discussion. Perhaps a ransomware worm burning through a network? Maybe... but the output on the Wekan board would be 100 cases, each with far fewer cards than, let's say, a single case (1 host) with many case-events linked to it.
But would such a scenario, with a high case count but fewer cards per case, still stall the UI?
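The averaging argument above can be checked with a few lines of arithmetic. This is only a sketch under the same simplifying assumption used in the discussion, namely that every anomaly is Stage-2 and the score is spread evenly. The function name average_anomalies is hypothetical.

```python
TOTAL_SCORE = 22000   # score at which the board stalled in the test
ENDPOINTS = 100       # hypothetical network size used in the discussion
STAGE2_SCORE = 20     # per-anomaly score for Stage-2


def average_anomalies(total_score, endpoints, per_anomaly_score):
    """Return (average score per endpoint, average anomalies per endpoint)."""
    per_endpoint = total_score / endpoints
    return per_endpoint, per_endpoint / per_anomaly_score


per_endpoint, anomalies = average_anomalies(TOTAL_SCORE, ENDPOINTS, STAGE2_SCORE)
print(per_endpoint, anomalies)  # 220.0 11.0

# The same total concentrated on a single host (one case).
single_host_anomalies = TOTAL_SCORE // STAGE2_SCORE
print(single_host_anomalies)  # 1100
```

The contrast is 100 cases with about 11 case-events each versus one case with 1100 case-events, which is why the two scenarios may load very differently.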

Performance of Dashboards vs Ease-of-Interpretation

These 2 factors are usually related.
🧠
The most important lesson that I want you to take home when it comes to frontend design is the Information Seeking Mantra.
The Wekan loading-performance issue is secondary to a certain extent: had I done more processing to explain the cases further, there would be much less data to load. Unfortunately, with talent constraints, I took the easier way out: link raw events to the case and let the humans decide after they have acquired a framework of mental models.
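The mantra is usually summarised as "overview first, zoom and filter, then details on demand". The sketch below illustrates that idea on a toy event list. It is not how Wekan or OpenEDR work: the event tuples, case IDs and function names are all made up for illustration. The point is that the first view carries one summary row per case, and raw events are only fetched for the case the operator opens.

```python
from collections import defaultdict

# Hypothetical raw events: (case_id, stage, description).
EVENTS = [
    ("case-1", 1, "event a"),
    ("case-1", 1, "event b"),
    ("case-2", 3, "event c"),
    ("case-2", 2, "event d"),
    ("case-3", 2, "event e"),
]


def overview(events):
    """Overview first: one summary row per case, no raw events."""
    summary = defaultdict(lambda: {"events": 0, "max_stage": 0})
    for case_id, stage, _ in events:
        row = summary[case_id]
        row["events"] += 1
        row["max_stage"] = max(row["max_stage"], stage)
    return dict(summary)


def zoom_and_filter(summary, min_stage):
    """Zoom and filter: keep only cases at or above a chosen stage."""
    return [cid for cid, row in summary.items() if row["max_stage"] >= min_stage]


def details_on_demand(events, case_id):
    """Details on demand: load raw events for one selected case only."""
    return [event for event in events if event[0] == case_id]


summary = overview(EVENTS)
print(zoom_and_filter(summary, min_stage=3))      # ['case-2']
print(len(details_on_demand(EVENTS, "case-2")))   # 2
```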
🤔
There's another interesting problem related to this topic: to what extent do we want the machine or "black-box" to decide for us? The major pain-point with many detection products is that they fail to explain why they emitted "BAD" alerts. Many of the events, decision algorithms and so on are hidden from operations' view (in the name of Intellectual Property rights), which in turn leads to "trust issues".
Recorded Future wrote a good entry on this topic: https://www.recordedfuture.com/information-seeking-mantra/
Carrot2 search shows some of the ISM concepts in action: https://search.carrot2.org/#/search/web