On Cybersecurity and Machine Learning
With the dawn of the Information Age, cybersecurity has become a pressing issue in today’s society and a skill that is much sought after in industry. Businesses, governments, and individual users are all at risk of security attacks and breaches. The fundamental goal of cybersecurity is to keep users and their data safe. Cybersecurity is a multi-faceted problem, ranging from highly technical domains (cryptography and network attacks) to user-facing domains (detecting hate speech or fraudulent credit card transactions). It helps to prevent sensitive information from being corrupted, avoid financial fraud and losses, and safeguard users and their devices from harmful actors.
A large part of cybersecurity analytics, investigations, and detections is now driven by machine learning (ML) and “smart” systems. Applying data science and ML to the security space presents a unique set of challenges: the lack of sufficiently labeled...
The basics of cybersecurity
Traditional principles of cybersecurity
Let us now examine each of these in depth.
Confidentiality can be achieved by encrypting data. Encryption is a process where plain-text data is coded into a ciphertext using an encryption key. The ciphertext is not human-readable; a corresponding decryption key is needed to decode the data. Encryption of information being sent over networks prevents attackers from reading the...
An overview of machine learning
In this section, we will present a brief overview of ML principles and techniques. The traditional computing paradigm defines an algorithm as having three elements: an input, an output, and a process that specifies how to derive the output from the input. For example, in a credit card fraud detection system, a module to flag suspicious transactions may have transaction metadata (location, amount, type) as input and the flag (suspicious or not) as output. The process defines the rule that sets the flag based on the input, as shown in Figure 1.2:
Figure 1.2 – Traditional input-process-output model for fraud detection
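The input-process-output model can be sketched in a few lines of Python. This is a minimal illustration rather than a real fraud system: the `Transaction` fields, the `HOME_COUNTRY` value, the `AMOUNT_LIMIT` threshold, and the rule itself are all hypothetical, chosen only to show a hand-written rule playing the role of the process.

```python
from dataclasses import dataclass

# Hypothetical values, chosen for illustration only.
HOME_COUNTRY = "US"
AMOUNT_LIMIT = 1000.0


@dataclass
class Transaction:
    """Input: transaction metadata (location, amount, type)."""
    location: str
    amount: float
    kind: str  # for example "online" or "in_store"


def is_suspicious(tx: Transaction) -> bool:
    """Process: a hand-written rule that maps the input to the output flag."""
    foreign = tx.location != HOME_COUNTRY
    large = tx.amount > AMOUNT_LIMIT
    # Flag large transactions that are either foreign or made online.
    return large and (foreign or tx.kind == "online")


# Output: the flag computed for one example transaction.
example = Transaction(location="FR", amount=2500.0, kind="in_store")
print(is_suspicious(example))
```

Here the programmer supplies the rule in advance; the program only applies it to each incoming transaction.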
ML is a drastic change to the input-process-output philosophy. The traditional approach defined computing as deriving the output by applying the process to the input. In ML, we are given the input and output, and the task is to derive the process that connects the two.
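To make the contrast concrete, the following sketch derives a rule from labeled examples instead of having it written by hand. It is a deliberately tiny stand-in for a learning algorithm (a single-threshold search over one feature), not a method prescribed by this chapter. The function names and the training data are made up for illustration.

```python
def learn_threshold(amounts, labels):
    """Return the amount threshold that misclassifies the fewest examples.

    The learned rule is: flag a transaction when amount > threshold.
    """
    best_threshold, best_errors = None, len(labels) + 1
    # Candidates: one value below the smallest amount, plus every observed amount.
    candidates = [min(amounts) - 1.0] + sorted(set(amounts))
    for t in candidates:
        # Count the examples where the rule's prediction disagrees with the label.
        errors = sum((a > t) != y for a, y in zip(amounts, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


# Made-up training data: inputs (amounts) and known outputs (True = fraudulent).
amounts = [20.0, 45.0, 80.0, 120.0, 900.0, 1500.0, 3000.0]
labels = [False, False, False, False, True, True, True]

# The "process" is now derived from the data rather than written by hand.
threshold = learn_threshold(amounts, labels)


def predict(amount):
    """Apply the learned rule to a new, unseen transaction amount."""
    return amount > threshold
```

With this toy data the search settles on a threshold of 120.0, so `predict(2000.0)` returns `True` and `predict(50.0)` returns `False`. Real models learn far richer rules from many features, but the direction of the problem is the same: input and output go in, the process comes out.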
Continuing our analogy of the credit...
Machine learning – cybersecurity versus other domains
- In sales and marketing, to identify the segment of customers likely to buy a particular product
- In online advertising, for click prediction and to display ads accordingly
- In climate and weather forecasting, to predict trends based on centuries of data
- In recommendation systems, to find the best items (movies, songs, posts, and people) relevant to a user
While every sector imaginable applies ML today, the nuances of applying it to cybersecurity differ from those in other fields. In the following subsections, we will see some of the reasons why it is much more challenging to apply ML to cybersecurity than to domains such as sales or advertising.
Security problems often involve making crucial decisions that can impact money, resources, and even life. A fraud detection...
This introductory chapter provided a brief overview of cybersecurity and ML. We studied the fundamental goals of traditional cybersecurity and how those goals have now evolved to cover other problems such as fake news, deep fakes, click spam, and fraud. User privacy, a topic of growing importance in the world, was also introduced. On the ML side, we covered the basics from the ground up, beginning with how ML differs from traditional computing and moving on to the methods, approaches, and common terms used in ML. Finally, we highlighted the key differences that make ML for cybersecurity so much more challenging than ML in other fields. The coming chapters will focus on applying these concepts to designing and implementing ML models for security issues. In the next chapter, we will discuss how to detect anomalies and network attacks using ML.