Why is Cybersecurity Risk Different?
Should business executives treat cybersecurity differently than other risk centers? It must be different; otherwise, why is it so hard to answer even simple questions about cybersecurity spending, such as what should we spend and what should we spend it on? But why is this so? This is not rocket science, is it? No, it's not, but not in the way you are thinking. With all due respect to my Dad (he literally is a Rocket Scientist), by treating cybersecurity as a "special risk," we're making these simple questions harder to answer than making rockets fly.
To Infinity and Beyond
I started this journey trying to answer two simple questions: what should we spend on cybersecurity, and what should we spend it on? These answers seemed so elusive that I figured we must need some new perspective or approach specific to cybersecurity spending.
As I noted in my first post, most people use ROI to justify cybersecurity spending; the Booz Allen model is a good example. In the second post I showed that ROI (or Return on Security Investment (ROSI)) is not a good metric for justifying cybersecurity spending; in fact, it is not a good metric for justifying any type of spending. We need to take our economics discussion up a notch and use NPV (Net Present Value) and/or IRR (Internal Rate of Return) rather than ROI/ROSI. In my third post I outlined a standardized way to qualify and quantify risk: Factor Analysis of Information Risk (FAIR). Yes, a standardized approach that does not treat cybersecurity any differently than other areas of risk! Because of this, organizations using FAIR are developing a standard lexicon to discuss cybersecurity risk in terms that their risk management peers understand. With FAIR, business executives can assess cybersecurity risk with the same scrutiny, maturity, and transparency with which they assess other forms of organizational and institutional risk.
In this post we’re diving a bit deeper into FAIR and focusing on how we can start using FAIR to help make cybersecurity investment decisions.
As a quick refresher, in Open FAIR, risk is defined as the probable frequency and probable magnitude of future loss. That’s it! A few things to note about this definition:
- Risk is a probability rather than an ordinal (high, medium, low) function. This helps us deal with our “high” risk situation discussed above.
- Frequency implies measurable events within a given timeframe. This takes risk from the unquantifiable (our risk of breach is 99%) to the actionable (our risk of breach is 20% in the next year).
- Probable magnitude takes into account the level of loss. It is one thing to say our risk of breach is 20% in the next year. It's another thing to say our risk of breach is 20% in the next year resulting in a probable loss of $100M.
- Open FAIR is future-focused. As discussed below, this is one of its most powerful aspects. With Open FAIR we can project future losses, opening the door to quantifying the impact of investments to offset these future losses
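To make the definition concrete, here is a minimal Monte Carlo sketch of risk as probable frequency times probable magnitude. This is my own illustration, not part of the Open FAIR standard, and the distributions and dollar figures are hypothetical:

```python
import random

random.seed(42)

def simulate_annual_loss(lef, lm_min, lm_mode, lm_max, trials=100_000):
    """Sketch of FAIR-style risk: probable frequency x probable magnitude.

    lef: probability of a loss event in a year (Bernoulli, so valid for lef < 1).
    Loss magnitude per event is drawn from a triangular(min, mode, max) distribution.
    Returns the average annual loss across all simulated years.
    """
    total = 0.0
    for _ in range(trials):
        # Did a loss event occur in this simulated year?
        if random.random() < lef:
            total += random.triangular(lm_min, lm_max, lm_mode)
    return total / trials

# e.g., a 20% chance of breach next year, with a loss somewhere between
# $0.5M and $5M and a most likely value of $1.5M
print(f"${simulate_annual_loss(0.20, 0.5e6, 1.5e6, 5e6):,.0f}")
```

Note that the output is an expected annual loss in dollars rather than an ordinal "high/medium/low" rating, which is exactly the shift the definition above is driving at.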
As shown in Figure 1, the Open FAIR ontology is pretty extensive and this post isn’t the place to get into all the inner workings. I urge everyone to learn more about FAIR.
Figure 1 – Open FAIR High-Level View
As discussed in my last post, risk is determined by combining Loss Event Frequency (LEF) (the probable frequency within a given timeframe that loss will materialize from a threat agent’s actions) and Loss Magnitude (LM) (the probable magnitude of primary and secondary loss resulting from a loss event).
At a Loss
To date, I’ve mostly focused on the loss event frequency (LEF) side of the risk equation, specifically to tease out the intricacies of threat and vulnerability. In this post, I’m shifting the focus to the loss magnitude (LM) side of the risk equation because I believe the ability to project a realistic loss magnitude is the foundation of a quantitative risk analysis. Based on my discussions with cybersecurity executives, it’s often the toughest thing to quantify, because quantifying loss magnitude requires extensive communication with other parts of the business; parts that quite often have never interacted with IT and cybersecurity before. This is one of the main reasons I say this is not rocket science. It’s harder!
How do we define loss? The Booz Allen model defines cost to fix, opportunity cost, and equity loss. These are pretty broad categories, and the broader the measure, the more difficult it is to quantify the potential loss. We need more granularity, but not too much; if we get too granular, the whole process may collapse under its own weight.
In terms of granularity, Open FAIR calculates six forms of loss, covering primary and secondary loss.
Primary Loss is the “direct result of a threat agent’s action upon an asset.” Secondary Loss is the result of “secondary stakeholders (e.g., customers, stockholders, regulators, etc.) reacting negatively to the Primary Loss event.” In other words, it’s the “fallout” when the s*&t hits the fan.
Secondary Loss Magnitude captures the losses materializing from dealing with secondary stakeholder reactions. To me, this is a critical distinction between Open FAIR and other models: we can’t assume that secondary losses will always occur.
Ponemon Cost of Breach Study Example
The best work I’ve seen on cost of breach is the annual studies performed by Dr. Larry Ponemon and his team (www.ponemon.org). Since 2005, they have been tracking costs associated with data breaches. To date, they have studied 1,629 organizations globally. Some of the key findings from the 2015 study are in Table 1:
In Open FAIR terms, Ponemon is saying the average Loss Event Frequency (LEF) of a 10,000-record breach is 0.22 over two years, with a Loss Magnitude (LM) of $1.54M (10,000 records x $154/record). Similarly, Ponemon states the average LEF of a 25,000-record breach is approximately 0.10 over two years, with an LM of $3.85M.
From this we can determine the Aggregate Loss Exposure (ALE). The ALE is typically an annualized number, so if we assume Ponemon saw an even distribution over the two years, we get an ALE for a 10,000-record breach of approximately $169K (0.22 / 2 years x $1.54M). This is a lot smaller than the oft-quoted $3.79M average cost of breach. Shifting the discussion from Loss Magnitude (LM) to Aggregate Loss Exposure (ALE) changes the whole tenor of the conversation.
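The arithmetic behind that $169K figure is simple enough to sketch. The helper name here is mine, not FAIR terminology:

```python
def annualized_loss_exposure(lef, lm, years=1.0):
    """ALE = (loss events per year) x (loss magnitude per event)."""
    return (lef / years) * lm

# 10,000-record breach: LEF of 0.22 over two years, LM = 10,000 records x $154/record
lm_10k = 10_000 * 154                                   # $1.54M per event
ale_10k = annualized_loss_exposure(0.22, lm_10k, years=2)
print(f"${ale_10k:,.0f}")                               # ~ $169,400 per year
```

The same two inputs that define risk in Open FAIR (frequency and magnitude) are all that is needed; the annualization is what makes the number comparable across scenarios.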
Not There Yet
This is very helpful information, but it’s not precise enough to make clear quantitative risk decisions. I suspect Ponemon has much more information than what is published in the report, and hopefully it includes the following key points:
- The distribution of the primary Loss Event Frequency (LEF) and Loss Magnitude (LM). We know the average, but to make decisions we really need the Min, Max, and Mode. For example, an average of 0.22 is only meaningful if you know the shape of the distribution curve. Is it peaked or flat? The sharpness of the curve defines the level of confidence we have in the data. To assess this, we can compare the average to the mode: the closer the two, the higher the level of confidence.
- The relative frequency of primary to secondary events. Though Ponemon does tease out the two types of losses (e.g. he differentiates between direct and indirect costs), it isn’t as well differentiated as an Open FAIR analysis. Lumping the two together can skew the results dramatically.
- Separating Primary and Secondary Loss Magnitude (LM). This is related to above.
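To illustrate the first bullet, here is a quick sketch (hypothetical numbers) of why the average alone tells us little: two LEF distributions can share the same mean of 0.22 while one is sharply peaked and the other is nearly flat.

```python
import random

random.seed(1)

def mean_of_triangular(low, mode, high, n=200_000):
    """Sample a triangular(min, mode, max) distribution and return its mean."""
    return sum(random.triangular(low, high, mode) for _ in range(n)) / n

# Two LEF distributions with the same mean (0.22) but very different shapes:
peaked = mean_of_triangular(0.20, 0.22, 0.24)  # mode ~ mean: high confidence
flat   = mean_of_triangular(0.01, 0.22, 0.43)  # wide spread: low confidence
print(f"peaked mean ~ {peaked:.3f}, flat mean ~ {flat:.3f}")
```

Any decision based only on "LEF = 0.22" treats these two situations identically, even though the second admits a realistic chance of frequencies near zero or near 0.4.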
As an example, check out Figure 4, a sample risk analysis done by RiskLens.
In this example, we’re looking at a pretty steep distribution curve where the average and peak (mode) are fairly close. The Aggregate Loss Exposure (ALE) is made up of multiple loss (primary and secondary) scenarios; this analysis is developed from 118 individual risk scenarios covering 32 asset classes and 5 threat communities. For example, one individual risk scenario might be the loss of 10,000 records due to a data breach caused by weak authentication controls, contributing $169K to the ALE. As mentioned above, the ALE is a function of both the Loss Event Frequency (LEF) and Loss Magnitude (LM).
Figure 4 contains a ton of information. First, the chart shows a Risk Appetite (RA) of $130M for the organization. Just looking at the curve shows the RA is less than both the peak (mode) and average ALE. The chart also shows the 10% and 90% distribution points. Many CFOs look at the 90% line as the worst-case ALE scenario (essentially equivalent to Ponemon’s $3.79M cost of data breach). In other words, on average we expect an ALE of $223M, but to prepare for the worst, we should plan for an ALE of $391M.
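Here is a sketch of how those 10%/90% points fall out of a simulation. The scenario numbers below are invented for illustration and do not reproduce the RiskLens figures:

```python
import random

random.seed(7)

def ale_percentiles(simulate_year, trials=50_000, probs=(0.10, 0.50, 0.90)):
    """Simulate aggregate loss exposure many times and report the
    percentile points (the 10%/50%/90% lines on a chart like Figure 4)."""
    losses = sorted(simulate_year() for _ in range(trials))
    return {p: losses[int(p * trials)] for p in probs}

def one_year():
    # Toy aggregate model: a few independent loss scenarios summed together,
    # each with hypothetical (min, mode, max) dollar amounts
    scenarios = [(20e6, 80e6, 300e6), (10e6, 60e6, 250e6), (5e6, 40e6, 200e6)]
    return sum(random.triangular(lo, hi, mode) for lo, mode, hi in scenarios)

pct = ale_percentiles(one_year)
risk_appetite = 130e6
for p, v in pct.items():
    flag = "over RA" if v > risk_appetite else "within RA"
    print(f"{p:.0%} point: ${v/1e6:,.0f}M ({flag})")
```

The CFO-style reading is the same as in the text: the median is the expected ALE, while the 90% point is the "prepare for the worst" number.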
We can further break the average ALE down into primary and secondary LM components (see Figure 5).
Now what? In this organization’s case, the secondary loss elements are far larger than the primary loss elements and the bulk of the materialized loss relates to loss of confidentiality (Figure 6).
Now, what do we do with this information? How do we turn charts into actionable guidance? Right off the bat, we have a fundamental problem because our Risk Appetite (RA) is significantly lower than the peak and average ALE. We have three main choices: raise the RA (rarely the best option), outsource a significant chunk of the risk by buying cyber insurance, or implement controls to lower the ALE below the RA.
Control Your Destiny
Open FAIR defines only four classes of controls: vulnerability, avoidance, deterrent, and responsive. In comparison, NIST defines 17 categories of controls, many of which could be considered a cross of avoidance, deterrent, and vulnerability controls.
Having only four broad classes of controls makes performing “what-if” analyses practical. It also provides a framework for selecting controls based on which have the most significant ALE impact.
Figure 7 – Mapping Open FAIR Controls to Ontology
To determine the most effective controls, we need to identify the threat communities with the greatest impact on the ALE. For example, from the RiskLens example above, we can break down the average $223M ALE by specific threat communities (these call for avoidance and deterrent controls).
The analysis indicates the greatest loss exposure is from the privileged insider community (approx. 43% of the total average ALE), with cyber criminals (approx. 36%) second, non-privileged insiders (approx. 13%) third, and it goes down from there. The value of this knowledge is HUGE! Without even talking control specifics, I know that more than half of my expected loss will be from insiders (privileged and non-privileged). This tells me to turn to NIST and focus on access controls (AC), audit and accountability (AU), and security awareness and training (AT) controls!
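The insider arithmetic is worth making explicit; a tiny sketch using the approximate shares quoted above:

```python
# Hypothetical breakdown of the $223M average ALE by threat community,
# using the approximate shares from the RiskLens analysis discussed above
total_ale = 223e6
shares = {
    "privileged insiders": 0.43,
    "cyber criminals": 0.36,
    "non-privileged insiders": 0.13,
    "other": 0.08,        # assumed remainder so the shares sum to 1.0
}
by_community = {name: share * total_ale for name, share in shares.items()}

# More than half of the expected loss comes from insiders of one kind or another
insider_share = shares["privileged insiders"] + shares["non-privileged insiders"]
print(f"Insider share of ALE: {insider_share:.0%}")   # 56%: more than half
```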
Assessing the Impact of Controls
Cleveland, we still have a problem. Our risk appetite is well below our average and peak ALE. I don’t want to raise our RA, so we must reduce the ALE. But how can we determine which of the above controls (AC, AU, or AT) are most effective? The beauty of using Open FAIR with an analytic and modeling engine (e.g., RiskLens) is that we can simulate the potential impact of security controls on quantitative risk. This is something most organizations do not do; instead, they simulate the potential impact of security controls on qualitative risk. I’ll get to this in my next post, when we dive into the SANS 20 controls as a model to assess the qualitative impact of security controls.
The beauty of using quantitative analytics is that it opens the door to effective economic discussion. For example, the yellow curve in Figure 9 depicts our initial ALE (this is a different analysis from Figure 4, though the curves are very similar). The blue curve shows the projected ALE after the implementation of both avoidance and deterrent controls: controls whose cost we know!
This is a pretty extreme example to illustrate how this stuff works. In reality, most organizations will already have a significant investment in controls reflected in the baseline analysis. The exercise will be a series of incremental control tweaks to bring the ALE in line with the RA. After all, once the ALE is below the RA, further control investment cannot be economically justified.
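That economic stopping rule can be sketched as a crude decision check. This is my own simplification with hypothetical numbers, not a RiskLens feature:

```python
def control_worth_it(ale_before, ale_after, annual_control_cost, risk_appetite):
    """Crude screen for an incremental control tweak: only worth it while the
    ALE still sits above the risk appetite AND the projected loss reduction
    exceeds the control's annual cost."""
    reduction = ale_before - ale_after
    return ale_before > risk_appetite and reduction > annual_control_cost

# A control that cuts ALE from $223M to $150M for $20M/yr passes the screen...
print(control_worth_it(223e6, 150e6, 20e6, 130e6))   # True
# ...but once the ALE is already below the $130M RA, further spend does not
print(control_worth_it(120e6, 90e6, 20e6, 130e6))    # False
```

A real analysis would run this over simulated ALE distributions rather than point estimates, but the logic of the stopping rule is the same.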
A Final ALE Perspective
To me, this is very exciting! If we run simulations against different control classes we can figure out our best control investment strategy. We can plot the control costs against the ALE impact to pick the winning approach. We can then evaluate the NPV and IRR of the control investments as a function of the ALE to build a business case for cybersecurity control investment. We can also directly compare the cost of implementing controls against the cost of buying cyber insurance. Essentially, with this information – plus the insights of the Gordon-Loeb cybersecurity spending model – we can make intelligent decisions about cybersecurity spending. And, most importantly, we can discuss these spending analyses on equal terms with any other form of business risk analysis.
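As a sketch of that NPV discussion (all numbers hypothetical): treat the projected annual ALE reduction as the cash inflow generated by a control investment, and discount it like any other business case.

```python
def npv(cash_flows, rate):
    """Net present value of yearly cash flows, year 0 first."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical control program: $30M upfront spend, $5M/yr to operate,
# offset by a projected $45M/yr reduction in ALE over a 3-year horizon
flows = [-30e6] + [45e6 - 5e6] * 3
print(f"NPV at a 10% discount rate: ${npv(flows, 0.10):,.0f}")
```

A positive NPV here means the control investment clears the same hurdle any other capital project would, which is precisely the "equal terms with any other form of business risk analysis" point above.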
Disclaimer – I have no financial or formal business relationship with RiskLens. I do have the utmost respect for Jack Jones, RiskLens Founder, and I very much appreciate his support and willingness to share output from his analysis tools.