Cybersecurity is increasingly important for many organizations. It manifests itself as business risk. Security operations are a key security capability that organizations must implement to be effective in deterring and resolving the effects of cyber-attacks and in minimizing cybersecurity risk to their business. However, the role and mechanics of security operations are often misunderstood. That is why you are reading this book.
This book is written from a viewpoint on cybersecurity that, for some, turns matters on their head. I take the view that cybersecurity operations, when done well, drive security leadership, auditing, reporting, and risk reduction. This is not the common view on how organizations implement cybersecurity operations. The usual approach, sketched very briefly, is that organizations need executive commitment, funding, a cybersecurity program, often driven by audit results, and a raft of security policies and risk heat maps to be effective. Their job is then to drive this down into the business. The measurement of this is then done with maturity models and metrics.
This book will overturn that view. The viewpoint that I will develop and work out in this book is the following:
- Passing audits is the result of security operations done well. Audits do not drive improvement – making improvements in security operations drives improvement overall.
- Security operations vitally develop and enrich cybersecurity conversations at executive level mainly through the enhanced visibility they provide. Having a conversation about what happens on your network as opposed to what one reads about in the newspaper is inherently more powerful and convincing, especially if it can be backed up with evidence.
- The visibility and context provided by well-executed cybersecurity operations inherently changes the strategy and risk discussion, leading to better grounded risk and compliance programs.
- Building in the visibility and response components into applications and networks from the outset leads to better security architecture and changes the conversation from security being a blocker to security being an enabler of the business.
- If security operations are the core of an organization's cyber risk management, then the activities undertaken to resolve security incidents are at the heart of security operations. The viewpoint that I will take in this book, and that in my view defines agile security operations, is that effective incident response is the key measure when it comes to risk reduction from threats. In turn, the need to perform incident response then drives the rest of the security operations.
The operations piece of cybersecurity also needs funding, commitment, policies, and risk management. Doing cybersecurity operations well is not an excuse to get rid of these things. The difference is a radically changed conversation about their impact and use. Cybersecurity operations, done well, provide a vital context and enrichment to the executive and business conversation that will lead to a tight integration between cybersecurity and the business, reduce risk more effectively, and, in short, lead to an organization that is defensible from a tooling (technical), cultural (people), and management (process) perspective. The terms in parentheses are sometimes referred to as the people, process, and technology (PPT) framework.
The focus of this chapter is on the following:
- Understanding the role of security operations in risk management
- Defining security operations
- Understanding why security operations need to be agile
The chapter is structured as follows:
- Why security is hard
- Security incidents
- Security solutions in search of a problem
- The scope of security operations
- Where security operations turn agile
Why security is hard
In many organizations, implementing security is hard work. At a technical level, security is often seen as a blocker; at a tactical level, security considerations may change how the business operates; and at a strategic and political level, security often raises problems that many organizations prefer to ignore. This section will place security operations at the core of a security program and introduce the five types of cyber defense.
This book takes the view that security operations are the heart of a security program. When organizations do their security operations well, they generate the necessary context to develop strategy, policies, and reporting, and gain the most benefit from audits.
The centrality of security operations is a somewhat unpopular view: much of what we see in security writing focuses heavily on technology – which is the implementation side of security – or strategy, which focuses on the management and maturity of the program. By not considering security operations, the focus of too many organizations is still on prevention and controls. While prevention and controls are important, in this book I argue – based on experience – that they are the result of good security operations rather than the cause.
In a nutshell, security operations are an organization's capability to detect and respond to adversarial events on their systems and networks.
That is a mouthful, but we can unpack this a bit. Detect speaks to the capability of an organization to notice that something is wrong on their networks, preferably in an early stage of an attack. Respond speaks to their capability to deal with such an event. Adversarial indicates that the event is caused by humans and has a specific component of intent.
In this book, I'll focus specifically on security operations and the ethos needed to create and sustain a security team that excels in security operations.
Therefore, I'll stay away from talking too much about either technology or strategy and instead focus heavily on tactics. Tactics – the specialty of security operations – is the nitty-gritty of how organizations respond to actual attacks, threats, vulnerabilities, and adversarial activity on their systems and networks.
If you think of strategy as the why of security, and the technology as the what, then tactics is the how – how do we realistically implement a risk program, how do we use that technology that has just been bought, and how do we secure an enterprise? These are the questions I will aim to answer in this book. Tactics form a critical connecting layer between technology and strategy that has not received the attention it deserves.
Cybersecurity, threats, and risk
Cybersecurity is traditionally approached from the viewpoint of business risk management. This creates a disconnect with security operations, and that fundamental disconnect makes security in many organizations harder than it needs to be.
To understand this better, we can look at how risk management usually approaches areas of risk. While the view of risk management I develop here is very simplified, it captures all the essentials. Risk management is typically based on a risk register, where risks are enumerated and given a priority of high, medium, or low (or a color-coded scale) based on both the exposure to the risk (the likelihood) and the impact (the consequence). In most cases, these assessments are subjective and dependent on the sector and context.
Risk management then relies on a matrix of controls to manage risk. Broadly speaking, risk treatment has four options: prevention, reduction, acceptance, or transfer. Prevention means that the organization puts in a device or measure that prevents the risk from materializing. Reduction means that some compensating control is developed that controls the risk, or at least makes it visible in time.
Acceptance of risk means just that – the risk is accepted by the organization and no further action is undertaken to address it; consequences will have to be dealt with as they occur. This can happen, for instance, when a risk is too costly or cumbersome to address, or when the costs and effort associated with addressing it make no sense from the viewpoint of the risk accepted.
A transfer of risk occurs when the risks are borne by a third party, for instance when an organization buys cyber insurance. We will have more to say on cyber insurance in Chapter 7, How Secure Are You? – Measuring Security Posture.
Once this table is complete, risks are then prioritized, mitigations costed and budgeted, and the budgets for the highest risks are approved. Then it's rinse and repeat.
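The register-and-prioritize cycle described above can be sketched in a few lines of Python. This is a minimal illustration only: the three-level scale and the four treatment options come from the text, but the class names, the likelihood-times-impact scoring rule, the thresholds, and the example risks are all assumptions made for the sake of the example, not a standard.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class Treatment(Enum):
    PREVENTION = "prevention"
    REDUCTION = "reduction"
    ACCEPTANCE = "acceptance"
    TRANSFER = "transfer"


@dataclass
class Risk:
    name: str
    likelihood: Level
    impact: Level
    treatment: Treatment

    def score(self) -> int:
        # Assumed scoring rule: likelihood multiplied by impact (range 1 to 9).
        return self.likelihood.value * self.impact.value

    def priority(self) -> Level:
        # Assumed thresholds for mapping the score back to high/medium/low.
        s = self.score()
        if s >= 6:
            return Level.HIGH
        if s >= 3:
            return Level.MEDIUM
        return Level.LOW


def prioritize(register: list[Risk]) -> list[Risk]:
    # Highest score first; sorted() is stable, so ties keep register order.
    return sorted(register, key=lambda r: r.score(), reverse=True)


# Hypothetical register entries, for illustration only.
register = [
    Risk("Defaced intranet page", Level.LOW, Level.LOW, Treatment.ACCEPTANCE),
    Risk("Phishing of finance staff", Level.HIGH, Level.MEDIUM, Treatment.REDUCTION),
    Risk("Data center flood", Level.LOW, Level.HIGH, Treatment.TRANSFER),
]

for risk in prioritize(register):
    print(f"{risk.priority().name:<6} {risk.name} -> {risk.treatment.value}")
```

Note how much of the outcome depends on the subjective likelihood and impact levels entered at the start; the arithmetic itself adds no information.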
Measuring cybersecurity risk
While you might think that risk management is a typical business way of dealing with the risks posed by cybersecurity and is therefore easily understood by senior leaders in an organization, you would be wrong. In How to Measure Anything in Cybersecurity Risk, Wiley, 2016, Douglas Hubbard and Richard Seiersen argue passionately and in depth that this method of dealing with risk is a failure and does not work. While cybersecurity is indeed a business risk, we need to come up with a better method to communicate and treat risk. In Chapter 7, How Secure Are You? – Measuring Security Posture, we will return to the topic of how to make security relevant in a business context based on the model of security operations.
Security operations do not work this way. Security operations focus primarily on dealing with issues as they occur – that is, they focus on the here and now. Beyond the here and now, they focus on threats in the context of the business, and devise methods of detecting those threats.
To better understand the depth of the chasm that opens in this way, it helps to have a clear understanding of how organizations deal with cyber risk. Dealing with cyber risk from the perspective of a risk management framework leads an organization to put in passive defenses: things such as firewalls, antivirus, network controls, and access lists to form a defense in depth architecture. At worst, a strong focus on traditional risk can cause misspending on silver bullets: expensive security solutions that generally do much less than they promise, sometimes because the environment is not mature enough to make the most of the investment. Except for the silver bullet, passive defenses are all necessary in credible cyber defense, but they overlook large areas that organizations should also address when considering cyber defense.
Figure 1.1 shows a risk treatment approach to threats that is often used in cybersecurity. Where a threat is identified, it is usually translated into risk, and then the risk treatment process defines whether a vulnerability exists and what the extent of it is (sometimes called the attack surface). Several controls look at how to reduce exposure, how to mitigate it (for example, by timely patching), and arrive at a residual risk that can be put on the heat map, or further reduced:
This approach to threats focuses on passive defense. Thereby it misses out on important additional components of cybersecurity defense. Specifically, it misses out on what organizations may do (and, in my view, should do) in the areas of architecture, passive defense, active defense, intelligence, and perhaps even offense. These together make up the five types of cyber defense, which we discuss next.
Five types of cyber defense
As Rob Lee points out in The Sliding Scale of Cyber Security (2015) (https://www.sans.org/reading-room/whitepapers/ActiveDefense/sliding-scale-cyber-security-36240), passive defense is only one of the five available modes of defense that organizations should consider when designing a cyber risk program. The five options sit on a spectrum, ranging from architecture, through passive defense to active defense, intelligence, and offense.
This spectrum can be read as follows:
- Architecture focuses on the design of systems so that they are as secure as possible. As part of architecture, we consider possible threats to the system and how the system can be made resilient against those threats. One of the most important aspects of architecture is threat modeling. We will discuss architecture in Chapter 5, Defensible Architecture.
- Passive defense focuses on the defense in depth and control framework that implements several systems (such as firewalls). These systems are added as preventive capabilities to the architecture to ensure that the system is robust against common attacks without constant human intervention. Packet filters, for instance, allow traffic only on specified ports and/or protocols and will drop any packet that does not conform to their rules without human intervention.
- Active defense focuses specifically on threats and their contexts as they manifest themselves to us as defenders and is one of the key activities of agile security operations. Active defenders pick up what passive defenses miss. Active defense builds and maintains context and focuses on active threats, based on a superior understanding of the environment. We will return to active defense in Chapter 6, Active Defense.
- Intelligence is the knowledge that an organization has about the tactics, techniques, and procedures of its adversaries. We will return to intelligence in Chapter 10, Implementing Agile Threat Intelligence.
- Offense focuses on the legal actions that a defender can take to disrupt or degrade an attacker's infrastructure. This may, for instance, include takedown actions where an attacker's infrastructure is removed from the internet by an authorized body.
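To make the passive defense example from the list above concrete, the following is a minimal sketch of a default-deny packet filter in Python. It is a toy model rather than how any real firewall is implemented: the Rule and Packet types, the filter_packet function, and the allow list itself are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str  # for example "tcp" or "udp"
    port: int      # destination port the rule allows


@dataclass(frozen=True)
class Packet:
    protocol: str
    dst_port: int


# Hypothetical allow list: HTTPS, SSH, and DNS only.
ALLOW_RULES = frozenset({Rule("tcp", 443), Rule("tcp", 22), Rule("udp", 53)})


def filter_packet(packet: Packet, rules: frozenset = ALLOW_RULES) -> str:
    # Default deny: a packet passes only if a rule matches both its
    # protocol and its destination port; everything else is dropped.
    if Rule(packet.protocol, packet.dst_port) in rules:
        return "allow"
    return "drop"


print(filter_packet(Packet("tcp", 443)))  # matches a rule
print(filter_packet(Packet("tcp", 23)))   # no rule for this port
```

The filter makes its decision without human intervention, which is exactly what makes it passive: it cannot notice an attacker who stays within the allowed ports and protocols.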
Figure 1.2 gives a representation of the five defense modes and the respective focus of risk-driven and operations-driven security programs. Well-managed operationally driven programs will tend to expand to encompass the five modes of defense, whereas risk-driven programs will tend to focus on architecture and passive defense:
This book is written from the conviction that starting with security operations, security risk management can be done much better than is usually the case. An operationally driven program changes the conversation from driving down an externally defined program to a fact-based discussion on what happens in this business.
It is, from that perspective, surprising that many organizations that do have extensive security programs and policy frameworks are weak when it comes to security operations.
The security 1%
In an interesting blog post (https://taosecurity.blogspot.com/2020/10/security-and-one-percent-thought.html), Richard Bejtlich points out that the people having somewhat credible detection and response capability form part of the security 1%. He focuses on membership in first.org and then performs a quick estimate of the percentage of organizations that would be able to mount a credible defense when they are attacked. The conclusion is that only around 1% of organizations would have detection and response capabilities and are not running just planning and resistance/prevention functions. While this is a back-of-the-envelope calculation, it does underscore the need to improve security operations across the board. The problem of the security 1% also leads to several other problems, especially the question of whether advanced penetration testing tools and indicators of compromise (IOCs) should be made as widely available as they are: they are nearly useless to the security 99% but may lead to improvements in the capability of attackers, making the overall security situation worse.
The focus on security operations does not mean that governance, risk, and compliance are unimportant. The main takeaway from this section is that the focus on security operations as a central activity alters the point where organizations should start first: governance, risk, and compliance is not a strong starting point for a security program in the initial stage – it is better to focus on developing operations that inform the governance program, and develop the governance, risk, and compliance program from what the security operations discover.
All the preceding points hinge on the assumption that an operations-driven security program is managed well. In Chapter 7, How Secure Are You? – Measuring Security Posture, we will return to the topic of governance, risk, and compliance in detail, and outline how a well-managed program can base itself on its operations.
Security incidents
A security incident is what most organizations hope will never happen. In agile security operations, incidents are the lifeblood of defense. During incidents, attackers reveal important information about their capabilities, intentions, methods, and tools, thereby turning a threat into reality. Good defenders will take advantage of the opportunity they are offered in this way to learn more about threats and improve their operations once the incident is over.
But to do this effectively, we need to be crystal clear about the intent and mode of incident response that organizations need to deploy. Learning from an attack is not useful if an organization doesn't survive the attack. Incident response therefore has four aims:
- Minimize attacker dwell time to the point where attackers are incapable of achieving their objectives
- Limit lateral movement of attackers on the network (for example, through defensible architecture)
- Prevent re-entry into the network after closure of an incident (evict successfully)
- Understand attackers' motivation and capability
The first aim of cyber defense is to ensure that an attacker – any attacker – will not achieve their objective and will be forced to leave before they achieve what they came for. This is quite an important point to understand: contrary to common opinion, the aim of cyber defense is not to prevent any attack at all costs; it is to prevent the adverse consequences resulting from an attack. Smart or experienced (or both) defenders know that not all attacks can be prevented; some can only be dealt with once they occur.
Dwell time – the time attackers get to spend on our networks before they are discovered – is usually measured in months for the most advanced attacks. This really means that defense teams must improve their visibility and opportunities to detect the presence of attackers.
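Dwell time is simple to compute once an investigation has established when the attacker first gained access. The Python sketch below assumes that both timestamps are known; the function name and the example dates are hypothetical and serve only to show the calculation.

```python
from datetime import datetime, timedelta


def dwell_time(first_compromise: datetime, detected: datetime) -> timedelta:
    # Dwell time runs from the initial compromise to the moment of discovery.
    if detected < first_compromise:
        raise ValueError("detection cannot precede the initial compromise")
    return detected - first_compromise


# Hypothetical timestamps reconstructed during an investigation.
first_compromise = datetime(2021, 1, 4, 9, 30)
detected = datetime(2021, 3, 15, 14, 0)

print(f"Dwell time: {dwell_time(first_compromise, detected).days} days")
```

In practice the hard part is not the subtraction but establishing the time of first compromise, which depends on the visibility and log retention the defense team had in place before the incident.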
The second aim is to limit lateral movement of attackers or slow them down. The first point of compromise is rarely the end goal of an attacker, and attackers will need to pivot – or move laterally – to the point where they want to be. A hardened architecture with identity, data, and network segmentation will make it harder for attackers to do so and provide more opportunities to discover an attacker before they do their damage.
The third aim is to evict successfully and prevent re-entry. This speaks to how the activities should be sequenced: if an attacker entered the network through a particular vulnerability or backdoor, make sure that this issue is fixed before the attacker is removed. Also, many attackers set up a series of re-entry points and backdoors, so sometimes it is better to observe an attacker for a while to determine what these are and then evict them once all backdoors are discovered and can be closed.
The last aim is to discover as much as possible about an attacker while all this is going on. Also, store this information alongside any artifacts, somewhere secure. With many attacks going on, it is easy to forget important details and it is sometimes handy to have them at hand once the same attacker comes knocking again.
The Q model
Thomas Rid and Ben Buchanan developed a model for the attribution of cyber incidents that also indicates some of the key problems with incident response (Journal of Strategic Studies, Vol. 38, 2015, pp. 4-37, https://www.tandfonline.com/doi/abs/10.1080/01402390.2014.977382; a copy is also available on the author's personal website https://ridt.co/d/rid-buchanan-attributing-cyber-attacks.pdf).
The idea is that attribution, like incident response, takes place on a strategic, operational, and tactical/technical layer, and focuses on the concept, the practice, and the communication/reporting.
A detailed diagram of the Q model can be found in the supplemental material on the publisher's website: https://ndownloader.figstatic.com/files/1860725.
Security solutions in search of a problem
Before we really go into the nitty-gritty of security operations, I need to make one more point. A trivial one. Technological silver bullets don't exist. The security field is rife with solutions that pretend to be able to solve most of an organization's security problems (that is, address its risk) in a single stroke of technology (it should come as no surprise that this never works).
Organizations that fall for the seductive sales pitches of the silver bullets are getting less protection from their security investments than they think they are, misunderstand their real risks, and are likely to underinvest in security capability. A large reason for the failure of advanced tooling in immature businesses is that advanced tooling is seen as a silver bullet, is not understood in context, and lacks much of the data it needs to be effective. Even if the solutions themselves work as advertised, the implementation may fail, primarily because organizations make three mistakes:
- They fail to understand and appreciate the context in which these security solutions work and fail to consider whether the right conditions for these solutions are in place.
- They fail to consider whether they can feed these solutions with the right data at the right time.
- They do not consider the impact on operations. Sometimes security technology does not work out of the box and needs a lot of fine-tuning by people who understand the context.
Robust security operations play a significant role in avoiding such a misspend, since it is only through security operations that organizations can understand the context in which advanced tooling functions best, the value it can provide, and the data and visibility it needs to be effective.
The scope of security operations
It is a mistake to think that the scope of security operations is limited to information technology, or wherever there is a computer or network. This is a leftover of a time when security operations were centered around network intrusion detection and malware operations.
These days, attacks such as business email compromise are very common and successful. Business email compromise does not involve a technical intrusion on the network but instead exploits a business process. It involves sending an email to a person in an organization, pretending to be someone else, and then asking for money to be transferred for some reason.
The focus of this book will be how to do security operations well. Security operations done well focus as heavily on the context of security as they do on the technology. This means understanding the business and its operations as well as security technology.
Where security operations turn agile
We can understand where security operations turn agile by considering the agile manifesto (https://agilemanifesto.org/) and recasting it in the context of security operations. The agile manifesto has four tenets:
- Individuals and interactions over processes and tools
- Working software over comprehensive documentation
- Customer collaboration over contract negotiation
- Responding to change over following a plan
We have stressed multiple times that security operations focus on context and on the adversarial nature of events, and that they use processes as tools. In this sense, security operations as we will develop them in this book are agile – they put individuals and interactions first.
Agile incident response
During incidents, it may be necessary to develop methods that work in the detection and deterrence of a specific attacker. During incidents, there is often no time to develop extensive documentation – although things do need to be written down and communicated. How security teams communicate during incidents is often like how agile software development teams communicate – regular standups, rapid interaction, and teamwork in dealing with a volatile situation.
Collaboration with the business and external parties may be crucial during the response phase of an incident. It is often advisable to have the right agreements in place with key partners before an incident, but an incident is not the time to consider extensive agreements and negotiations.
It is important that security teams share aggressively. There are several protocols that are widely accepted and used for sharing incident and attack data with outside parties.
The most common one, mostly used in enterprise security operations outside the intelligence community, is the traffic light protocol (https://www.first.org/tlp/). The traffic light protocol has, as you would expect, the colors red, amber, and green as well as white. The definitions of the colors are given on the web page. In practice, the meanings of the colors are as follows (although they sometimes change):
- TLP: Red: May not be shared with anyone outside the current conversation without the explicit permission of the person who shared the data. This permission must be asked for and given.
- TLP: Amber: May be shared on a need-to-know basis with members of the participants' organizations.
- TLP: Green: May be shared on a need-to-know basis within the community.
- TLP: White: Public disclosure subject to copyright.
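The sharing rules above lend themselves to a simple check before information leaves the team. The Python sketch below encodes only the four colors as summarized in this list; the Audience levels, the may_share function, and the treatment of explicit permission are simplifying assumptions made here, and the need-to-know condition is deliberately not modeled. The definitions on the FIRST web page remain authoritative.

```python
from enum import Enum


class TLP(Enum):
    RED = "red"
    AMBER = "amber"
    GREEN = "green"
    WHITE = "white"


class Audience(Enum):
    # Assumed ordering from narrowest to widest audience.
    CONVERSATION = 0  # people in the current conversation
    ORGANIZATION = 1  # members of the participants' organizations
    COMMUNITY = 2     # the wider sharing community
    PUBLIC = 3        # anyone


# Widest audience each color permits, following the list above.
MAX_AUDIENCE = {
    TLP.RED: Audience.CONVERSATION,
    TLP.AMBER: Audience.ORGANIZATION,
    TLP.GREEN: Audience.COMMUNITY,
    TLP.WHITE: Audience.PUBLIC,
}


def may_share(label: TLP, audience: Audience, explicit_permission: bool = False) -> bool:
    # Simplification: explicit permission from the source lifts the RED
    # restriction entirely; in reality permission is scoped by the source.
    if label is TLP.RED and explicit_permission:
        return True
    return audience.value <= MAX_AUDIENCE[label].value


print(may_share(TLP.AMBER, Audience.ORGANIZATION))
print(may_share(TLP.AMBER, Audience.COMMUNITY))
```

A check like this is most useful as a guard in tooling that forwards incident data automatically; for human sharing decisions, the label on the message is what matters.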
Incident handling is primarily a process of anticipating change, responding to change, and having the ability to counter attackers' moves. In this sense, incident response and security operations differ in an important respect from software development: in software development, there is no party on the other side that actively tries to counter our moves and stay invisible, and who means us harm.
Security incident response often takes place under conditions of volatility, uncertainty, complexity, and ambiguity, summarized by the acronym VUCA. Agile stresses transparency and open communications, as well as accountability in its processes. This makes it suitable for dealing with incident response situations as well. The details of how that happens will be discussed in Chapter 2, Incident Response – A Key Capability in Security Operations.
Agile security operations
There are also good reasons for security operations to adopt an agile framework outside of the immediacy of incident response. Many of them will become clear later in this book. In this section, I will contrast an agile approach, or a capability approach as I prefer to call it, with the more common approach of using maturity models to guide security development.
Maturity models tend to focus on processes and tools and classify operations in categories such as ad hoc, basic, managed, and optimized. The problem is that the focus is on processes and tools, rather than people and capability. This is the wrong focus for security. Teams should focus on capability rather than maturity for the following reasons.
Security, both attack and defense, moves faster than other technology areas. This is particularly the case during incident response, but even with this pressure removed, developments in security are very rapid. A maturity model will have trouble dealing with the rapid evolution, since it is the model itself that must change to account for change, not the team or the practices.
Traditional risk management frameworks, which maturity models evolved from, are a poor fit for cybersecurity. This not only means that the traditional risk management methods have a propensity to lead us to the wrong solution, but also to the wrong sort of security operations. We have already seen that traditional risk models tend to leave out critical components of cyber defense; this is also true at a team and operations level. Maturity models focus on processes and repeatable playbooks and treat a security operations team as they treat a helpdesk. Many security operations teams have a Tier 1, Tier 2, and Tier 3 approach to security that is a poor measure of true capability and may require hand-off documentation in a time of crisis as an incident moves through its levels.
Incidents provide a rapid feedback, learning, and adaptation environment in which capability trumps maturity all the time.
Many small to mid-size organizations have a need to develop and deploy a credible cyber defense capability. A strong focus on maturity distracts from what these organizations actually need, and instead leads them to think that developing a credible cyber defense capability is an impossible task. It is not.
Lastly, there is the human factor. Maturity models lead us to a focus on processes, documentation, and repeatability. Security already has a workforce problem. The prospect of becoming a cog in a wheel will not be very attractive to the people who can have the most impact in this field.
Summary
In this chapter, we have introduced several concepts and reasons why security operations are central to a security program and inform and enhance both the strategy and risk management of a cybersecurity program by providing real facts on threats, context, effectiveness of controls, and effective policies.
Security operations, moreover, have a natural tendency toward agility, and many aspects of the agile methodology also apply in security teams. Put more strongly, it could be argued that the best security teams were already working in an agile way before the term became popular.
This book will not advocate dropping all regimentation, reporting, and discipline from security operations. It will argue that there is a better way. Security operations do need strong discipline and regimentation, and there is even room for processes and repeatability, but the stakes are too high to settle for suboptimal.
In the next chapter, we will focus more extensively on incident response as a core practice of agile security operations.