Cloud Security Automation

By Prashant Priyam

About this book

Security issues are still a major concern for all IT organizations. For many enterprises, the move to cloud computing has raised concerns for security, but when applications are architected with focus on security, cloud platforms can be made just as secure as on-premises platforms. Cloud instances can be kept secure by employing security automation that helps make your data meet your organization's security policy.

This book starts with the basics of why cloud security is important and how automation can be the most effective way of controlling cloud security. You will then delve deeper into the AWS cloud environment and its security services by dealing with security functions such as Identity and Access Management and will also learn how these services can be automated. Moving forward, you will come across aspects such as cloud storage and data security, automating cloud deployments, and so on. Then, you'll work with OpenStack security modules and learn how private cloud security functions can be automated for better time- and cost-effectiveness. Toward the end of the book, you will gain an understanding of the security compliance requirements for your Cloud.

By the end of this book, you will have hands-on experience of automating your cloud security and governance.

Publication date:
March 2018
Publisher
Packt
Pages
334
ISBN
9781788627863

 

Chapter 1. Introduction to Cloud Security

In this chapter, we will learn the basics of the cloud and cloud security.

To understand cloud security, we first need to understand the types of clouds and their architecture. Several global standards organizations, such as NIST and CSA, define the different aspects of clouds.

As per the NIST definition of cloud computing (https://www.nist.gov/programs-projects/nist-cloud-computing-program-nccp), in simple terms we can say that the cloud offers on-demand accessibility of computing, networking, storage, databases, and applications. Here you do not have to worry about the purchase of physical boxes (CapEx) to run your application. You pay the cost of the resource based on the usage (OpEx).

Note

CapEx stands for capital expenditures and denotes one-time cost investment to purchase any resource, for example, the purchase of a physical server.

OpEx stands for operating expenditures and denotes expenses to make the resource operational, for example, the purchase of manpower, power, OS, and so on to make the server operational.

We can say that the cloud abstracts the underlying infrastructure using hypervisors, and uses orchestration to allocate resources from the aggregated resource pool to multiple tenants on demand.

The cloud offers multitenancy, agility, scalability, availability, and security. Let's understand what all these terms mean.

We will cover the following topics in this chapter:

  • Types of cloud
  • Cloud security
  • Shared responsibility model
  • Key concern areas of cloud security
 

Types of cloud


There are different models of the cloud. We broadly categorize them on the basis of deployment and service.

If we look at the cloud from a deployment perspective, there are three models.

Public cloud

This model of cloud is open to the public. This means that anyone can sign up and subscribe to set up their infrastructure to host their solution. Examples include AWS, Microsoft Azure, Google Cloud Platform, IBM Cloud (SoftLayer), Alibaba Cloud, and so on.

Private cloud

This model of cloud is specific to an organization that wants to run their workload in a self-provisioned, secure way, internal to the organization. Organizations deploy private clouds using OpenStack, Apache CloudStack, Eucalyptus, OpenNebula, and so on as orchestration, and for hypervisors they are using VMware ESXi, XenServer, Hyper-V, KVM, and so on. 

Hybrid cloud

This model of cloud combines the features of both private and public cloud; in other words, it integrates the public cloud with the on-premise hosted cloud. For example, suppose we have an internally deployed OpenStack cloud platform and now we want to integrate it with one of the public clouds. For this, there are multiple tools available that enable you to integrate both clouds and also let you lift and shift workloads in either direction. Recently, Cisco came up with a product called Cisco CloudCenter (formerly known as CliQr) that provides this facility.

On the basis of service, we categorize clouds into three parts, which we call the SPI model.

In the SPI model, S represents Software as a Service, P represents Platform as a Service, and I represents Infrastructure as a Service.

Software as a Service

In this model, an application running on the cloud is offered directly to the end consumer as a service. As the end consumer, you subscribe to the service and start using it. You do not have access to control and manage the infrastructure layer and platform. Here, you do not need to worry about the IT infrastructure, the application, or its security. In this model, the Software as a Service (SaaS) provider is responsible for managing the underlying infrastructure.

Platform as a Service

In this model, the cloud provider sets up a platform on which you develop or run your application. For example, AWS provides the Relational Database Service (RDS), a managed DBMS service wherein you just need to subscribe to RDS, load your database, and start using it. You need not worry about infrastructure, OS, and other operational tasks. Platform as a Service (PaaS) services can be accessed using the API too.

Infrastructure as a Service

IaaS stands for Infrastructure as a Service. In this model of cloud, you can subscribe to the complete infrastructure (networking, computing, and storage) that is required to run your application. Here, you will get the building blocks that you need to assemble to run your application as per your requirement. Suppose you want to run one web application that is developed in PHP and MySQL. To run this on the IaaS platform you need to subscribe to computing, networking, and storage. Now, you will configure each of them to run your application.

As we have now got a fair understanding of the cloud and cloud models, let's see the architecture so that we can correlate it when we start learning about the security aspects:

In the aforementioned architecture, we can see that the base layer of every cloud is a physical server, storage, and network. On top of it, we have installed the Virtualization Layer (hypervisor), which abstracts all the resources.

Above the hypervisor, we have the Orchestration Layer, which communicates with the Virtualization Layer and makes resource chunks (computing, storage, and network) available to be shared among multiple tenants on demand.

The user logs in to the cloud dashboard to subscribe to resources and starts running their service or application on them.

One thing we can see here is that the Security layer starts from base and goes up until the top. This means that we need to focus on the security aspect at each layer (from the physical layer to the user layer).

 

Cloud security


Organizations started shifting their workload from on-premise physical infrastructures to the public cloud, or started to deploy their own private clouds to get all the benefits of the cloud. But security is a major concern, mostly in the case of the public cloud where we do not have control of the physical infrastructure. Also, there are compliance requirements that organizations need to meet, for example, ISO/IEC, NIST, FedRAMP, PCI DSS, HIPAA, and so on.

In this section, we will learn the basics of security and the security options available in the public cloud.

In any environment for security, we always start with the following models:

  • CIA 
  • AAA

The CIA model focuses on Confidentiality, Integrity, and Availability. We also call it the CIA triad:

Confidentiality

Confidentiality denotes protecting the data from unauthorized access. We cannot disclose our data to everyone if it is critical, such as the bank statement, which includes all our financial transactions. We must ensure that a bank statement is disclosed only to our banker or the accountant who works on it. Similarly, this applies for an organization as well. In an organization, we have multiple departments and different roles. Here we define the roles for each user to access the information. Alternatively, you can say we define rules to decide who will access what.

Integrity

Integrity means protecting the data from unauthorized modification. All the data that we have in our organization is valuable. If it's modified this could be a huge loss for the organization. Let's take a financial statement as an example, which includes all the financial information of the organization. If it is tampered with or modified it can cause huge business losses. We must ensure that the data we are storing and that is being accessed by users is secure enough to maintain its integrity.
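
One common way to detect unauthorized modification is to store a keyed digest alongside the data and recompute it on every read. The following is a minimal sketch using Python's standard hmac and hashlib modules; the secret key and the sample statement are hypothetical values chosen for illustration, and real key management is out of scope here.

```python
import hashlib
import hmac


def sign(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the data under a shared secret key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def is_unmodified(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Recompute the tag and compare it to the stored one in constant time."""
    return hmac.compare_digest(sign(data, key), expected_tag)


# Hypothetical values, for illustration only.
key = b"hypothetical-shared-secret"
statement = b"Q1 revenue: 1,250,000"
tag = sign(statement, key)

print(is_unmodified(statement, key, tag))                  # original data
print(is_unmodified(b"Q1 revenue: 9,250,000", key, tag))   # tampered data
```

The first check should succeed and the second should fail, because any change to the data changes the tag that is recomputed from it.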

Availability

Availability denotes that the information is available to authorized users. If it isn't available to authorized users, this again results in a loss. Information has value only if the right people can access it at the right time. Almost every week you can find news about high-profile websites being taken down by distributed denial of service (DDoS) attacks. The main aim of a DDoS attack is to deny legitimate users access to the resources of the website. Such downtime can be very costly.

Now coming onto the next model, the AAA model focuses on Authentication, Authorization, and Auditing.

Authentication

Only authenticated users should be able to access the data. The focus is on user validation, or authenticity of users. User validation is done with the user login credentials, or access keys, which are matched with the user database, LDAP, Active Directory, or key stores.
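
To illustrate how login credentials can be matched against a user database, here is a minimal sketch that stores a salted PBKDF2 hash instead of the password itself. The in-memory dictionary stands in for a real user store such as LDAP or Active Directory, and the iteration count, usernames, and passwords are assumptions made for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your environment


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def make_record(password: str) -> dict:
    """Create the record we would store for a user: a random salt and the hash."""
    salt = os.urandom(16)
    return {"salt": salt, "hash": hash_password(password, salt)}


def authenticate(user_db: dict, username: str, password: str) -> bool:
    """Return True only if the user exists and the password matches."""
    record = user_db.get(username)
    if record is None:
        return False
    candidate = hash_password(password, record["salt"])
    return hmac.compare_digest(candidate, record["hash"])


# Hypothetical in-memory user database.
user_db = {"alice": make_record("correct horse battery staple")}
print(authenticate(user_db, "alice", "correct horse battery staple"))
print(authenticate(user_db, "alice", "wrong password"))
```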

Authorization

Once the user is authenticated, there must be some authority defined for the user to access the data so that they can perform the required action. Basically, here we define the access policies and roles for users. Users with read permission should not be able to modify the data.

Auditing

Auditing does the accounting of user activity on data. Here, we log all the activity of users including login time, session duration, and all the activities that they perform.
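
Authorization and auditing often work together: every access decision is checked against a role and then recorded. The sketch below assumes a simple role-to-permission table; the role names, users, and the in-memory audit log are hypothetical and stand in for a real policy engine and log store.

```python
import json
from datetime import datetime, timezone

# Hypothetical roles and users, for illustration only.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "editor": {"read", "write"},
}
USER_ROLES = {"alice": "editor", "bob": "reader"}

AUDIT_LOG = []  # in a real system this would be durable, append-only storage


def authorize(user: str, action: str, resource: str) -> bool:
    """Check the user's role for the action and record the decision."""
    role = USER_ROLES.get(user)
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    AUDIT_LOG.append(json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "allowed": allowed,
    }))
    return allowed


print(authorize("alice", "write", "report.csv"))  # editor may write
print(authorize("bob", "write", "report.csv"))    # reader may not
print(len(AUDIT_LOG))                             # both attempts are logged
```

Note that denied attempts are logged as well as permitted ones, since failed access is often the more interesting signal during an audit.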

The cloud offers multiple options to implement the security, but it solely depends upon the security requirement and our expertise to implement it.

One thing we need to understand is that implementing security in the cloud is a bit different from what we have been practicing with on-premise or traditional infrastructures. 

We learned previously that the cloud runs on top of physical boxes, such as storage, servers, and networking equipment. All these resources are controlled by a layer of abstraction that we call a hypervisor or virtualization layer. Further, all the aggregated resources are being controlled and managed by an orchestration. Internally, all these layers communicate with each other through an application program interface (API). 

In the cloud we can also start implementing security with CIA and AAA models. Here, we have the following services in the cloud to implement CIA and AAA models:

  • Confidentiality: Here, we use Identity and Access Management (IAM) to define the resource accessibility permissions. In IAM, we can define users, groups, roles, and policies. IAM also helps to manage API access using access keys and secret keys. Similarly, in OpenStack we have Keystone to define users, groups, roles, and policies. You can define policies to access the other OpenStack services and APIs.
  • Integrity: For integrity, we have multiple types of encryption for the storage where data is stored, and for data in transit we can use SSL/TLS. If we want to implement an additional layer of security, we can use the AWS CloudHSM module as well. In OpenStack, integrity is managed through a process that starts at the bootstrap level and extends to the file level. (This process is covered in detail in the Private cloud section.)
  • Availability: For availability, AWS offers many services that ensure service availability at different layers. We have Route 53 as DNS, Elastic Load Balancing (ELB), and autoscaling. All three services help keep the service available, even during DDoS attacks. Availability of the application or solution also depends on how it is designed and deployed. Take a web application as an example; to ensure maximum availability, we should always design the solution in such a way that there is no single point of failure. For this, we must consider high availability (HA) and autoscaling, and also try to decouple the resources (using message queuing) so that the solution is fault tolerant as well.

Now, let's map the AAA model in the cloud:

  • Authentication: In AWS, we have IAM for user identity. Here, we can define users and groups. IAM enables you to define password-based authentication and key-based authentication for users. Apart from this, for the application, we can use the AWS Directory Service and Cognito (token-based authentication) for user authentication. For console users, we can also enable multi-factor authentication (MFA). In OpenStack, we have Keystone, which provides user and service authentication. Here, we also create users and groups and define password-based and key-based authentication for users. Apart from this, we can use SAML-based authentication, OAuth, and OpenID-based authentication.

Note

It's always advisable to access the public cloud service using access keys. Do not access the console using root account credentials, and also make sure that you have enabled MFA for user accounts.

  • Authorization: For authorization, AWS uses IAM roles and policies. There are many predefined user roles for different services. Apart from this, we can define custom roles. For example, suppose we have an EC2 instance that needs to access a Simple Storage Service (S3) resource and CloudWatch Logs. Here we can define a custom role for the EC2 instance that grants access to the S3 bucket and CloudWatch Logs, and attach that role to the EC2 instance. With a role attached, there is no need to store access keys in plain text on EC2 instances, although in practice many people still do. In the same way, if you need to manage multiple or linked accounts, you should define a role for cross-account access and use it instead of logging in using the root account. Similarly, in OpenStack, we have Keystone, where we can define different roles and policies for users and groups. For each service, there is an associated access policy file defined in JSON format.

Note

In OpenStack, it is always advisable to use TLS-based authentication and also define formal access control policies.

  • Auditing: AWS provides CloudTrail to log all actions and activity for AWS services. Apart from this, we have Virtual Private Cloud (VPC) Flow Logs and ELB logs, which can be stored in an S3 bucket or transferred to CloudWatch Logs. For analysis, we can use the Elasticsearch service or query the logs using Athena. For solution auditing, you can use AWS Config and Trusted Advisor. AWS has also introduced a service called Macie, which uses machine learning to identify sensitive data and how it is being accessed. In OpenStack, we have a telemetry service called Ceilometer to store and manage logs. Apart from this, you can use third-party monitoring solutions such as New Relic.
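
The custom EC2 role described under Authorization above can be expressed as two JSON policy documents: a trust policy that lets EC2 assume the role, and a permissions policy that grants access to one bucket and to CloudWatch Logs. The sketch below only builds and prints the documents; the bucket name is hypothetical, and the exact set of actions is an assumption that you would narrow to what your application needs.

```python
import json

BUCKET = "example-static-assets"  # hypothetical bucket name

# Trust policy: allows the EC2 service to assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Permissions policy: object access on one bucket plus writing log events.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "arn:aws:logs:*:*:*",
        },
    ],
}

print(json.dumps(trust_policy, indent=2))
print(json.dumps(permissions_policy, indent=2))
```

Because the instance obtains temporary credentials through the attached role, no access key or secret key has to be written to disk on the instance.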
 

Shared responsibility model


In the cloud, we can define stakeholders in two categories:

  • Cloud provider: This is the one who provides cloud services
  • Cloud consumer: This is the one who consumes cloud services

In cloud security, compliance is defined on the shared responsibility model. Here, the cloud provider is responsible for managing security and compliance at the physical infrastructure level, hypervisor level, physical network level, storage level, and orchestration layer. 

The cloud consumer assumes responsibility for managing security and compliance from the virtual machine level to the application level.

AWS defines this more precisely; its security responsibility model is divided into three categories:

  • A shared responsibility model for infrastructure 
  • A shared responsibility model for container service 
  • A shared responsibility model for abstract service

Let's see the broad shared responsibility models of AWS:

In the shared responsibility model, the cloud provider (AWS) is responsible for managing security and compliance at the data center level or the physical infrastructure level, such as server, storage, and physical network.

The cloud consumers (end users) will be responsible for managing security and compliance from the guest OS level (security patches and updates); VPC security, such as configuration of security groups, network access control lists (ACLs); and other software configuration, as well as the integration of other services (for example, RDS, S3, Simple Queue Service (SQS), Simple Email Service (SES), and more).

In the shared security model, AWS is responsible for the security of infrastructure where all the AWS cloud service offering is running. Here, the infrastructure consists of all the hardware, software, and the physical perimeters.

The customer's responsibility is determined on the basis of the services they subscribe to. For example, if the customer has subscribed to an EC2 instance, they need to ensure the security of the guest OS and its configuration. For S3, they need to define the ACL and roles. Similarly, for RDS, they need to define passwords, security group policy, encryption, and backup policy.

To help customers ensure security at each level, many services are already available, such as AWS Config, Trusted Advisor, IAM, X-Ray, and Macie, which help to make your security work easier.

Now, let's look at the previously mentioned categories of the shared responsibility model.

Shared responsibility model for infrastructure 

In this model, AWS is responsible for securing its virtualization, server, storage, physical network, and data center. The customer who has subscribed to the infrastructure service is responsible for defining security from the guest OS, application level, virtual network (VPC) level, data level, and finally, the user access level.

Shared responsibility model for container service

AWS also offers container-based services: ECS for compute, RDS for databases, and so on. Here, AWS's security responsibility extends up to the guest OS and platform level. For RDS, for example, AWS is responsible for managing security from the physical level to the database application level. Customers are responsible only for defining security at the subnet and security group level, the encryption and password policy, and IAM roles.

Shared responsibility model for abstract services

AWS also provides abstract services such as SQS, SES, Simple Notification Service (SNS), and S3. For all these services, AWS is responsible for the complete security of the physical layer, virtualization layer, network level, storage, OS, software, and so on. Users or consumers need to define only the user-level permission and encryption if it is applicable for the service.
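
The three categories above can be summarized as a small lookup table that answers, for a given service, what the customer still has to secure. This is only a sketch that restates the text; the dictionary layout and function name are assumptions, and the service lists contain just the examples mentioned in this section.

```python
# What the customer must secure under each model, as described above.
CUSTOMER_RESPONSIBILITIES = {
    "infrastructure": [
        "guest OS", "application", "virtual network (VPC)", "data", "user access",
    ],
    "container": [
        "subnet", "security group", "encryption and password policy", "IAM roles",
    ],
    "abstract": [
        "user-level permissions", "encryption (where applicable)",
    ],
}

# Example services from this section mapped to their model.
SERVICE_MODEL = {
    "EC2": "infrastructure",
    "ECS": "container",
    "RDS": "container",
    "SQS": "abstract",
    "SES": "abstract",
    "SNS": "abstract",
    "S3": "abstract",
}


def customer_duties(service: str) -> list:
    """Return the customer's security duties for a service named in this section."""
    model = SERVICE_MODEL[service]
    return CUSTOMER_RESPONSIBILITIES[model]


for name in ("EC2", "RDS", "S3"):
    print(name, "->", ", ".join(customer_duties(name)))
```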

Now, let's understand the shared responsibility model in the cloud from the service perspective.

In IaaS, the cloud provider is responsible for only managing the physical infrastructure and security at the physical level. Being a user, we are responsible for the following:

  • VM level security 
  • Application and data security
  • User management
  • Virtual network level security

In the case of IaaS, the API plays a significant role, as all the internal components talk to each other through the API using HTTP methods such as GET, POST, PUT, and DELETE. The API also enables cloud consumers to access the service through a REST API (available in all the clouds). We will look at the use of APIs in the automation section and also learn how automation uses APIs to speed up deployment and enhance security.

In the cloud, we have multiple options to apply security on all the aforementioned levels but it completely depends on us as to how we are utilizing it.

In the PaaS model, the cloud provider responsibility increases; it is responsible for managing the platform too. Here, the platform denotes the environment on which our application will run. For example, most of the cloud providers have Database as a Service. Here, the cloud provider is responsible for managing the physical infrastructure, compute, and OS level security. Being a user, we will focus on user management, virtual networks, and data security. 

In the SaaS model, the cloud provider is responsible for providing end-to-end security up to the application level. We are only responsible for user management and data security. AWS does not offer a SaaS service as part of its own cloud offering, but there are many partners who provide SaaS services on it. AWS equips you to ensure maximum security for your SaaS offering as well.

 

Key concern areas of cloud security


There are many agencies that continuously work to define global standards and publish their recommendations for cloud computing.

Cloud Security Alliance (CSA) is one of them. It continuously works to address the security options of the cloud. CSA has identified key areas where we need to focus on cloud security:

  • Infrastructure level
  • User access level
  • Storage and data level
  • Application access level
  • Network level
  • Logging and monitoring level

Infrastructure level

Infrastructure level security is of the utmost importance. In a public cloud, the physical infrastructure is the cloud provider's responsibility. But in a private cloud, we must ensure the security at the infrastructure level as well. In OpenStack, all the components are separate services and they communicate with each other via APIs. It's very complex to ensure security at each level.

In OpenStack, we have services such as keystone, nova, and neutron, which have dependencies on their underlying databases. Here, it is always advisable that each database has its unique access credentials. This will help when any particular component gets compromised as it will not affect the other components.

The hypervisor hosts in OpenStack must have SELinux or AppArmor enabled. People often disable it during configuration, but this is not recommended, as it provides a virtual boundary that protects your VMs. Apart from this, all security patches must be deployed on the hypervisor.

There should always be an isolation between networks responsible for management, guest, and storage traffic. It's always preferred to have a separate VLAN for internal users so that users with infected or compromised machines cannot affect the cloud infrastructure.

There must be use of internal and external firewalls with OpenStack to control external and internal traffic.

In OpenStack, the services communicate with each other on specific ports; so, on the firewall, only these ports should be open.

Note

Do not open all the ports for all the services.

You must watch the activity performed by users, such as successful versus failed logins, and unique transactional behavior, such as users trying to download all the images at once.

In AWS, to secure the infrastructure, you can use IAM, Trusted Advisor, and AWS Config. All these services help you identify the loopholes in the configuration. Enabling logs, monitoring, and alerts using CloudWatch helps you to strengthen security.

At the instance level, we must keep the guest OS up to date with security patches. VPC Flow Logs must be enabled to monitor network-level activity. Using custom alerts on AWS services, you can manage the security aspects proactively. For example, we can create alerts on the network interface (NIC) of EC2 instances. If an instance starts sending an unusually large volume of traffic, we can easily identify the issue by going through the logs.

User access level 

In the cloud, it is critical to define users and user access. In this section, we define the users, groups, roles, and policies. Users are entities who will access the cloud infrastructure using the console or APIs. A group defines the collection of users who will perform a similar set of actions. Roles define the nature of the job the user will perform, while a policy defines the rules for resource access. It also describes how the users will access the services or applications, and how one service will securely communicate with another service. In the public cloud, communication or integration of different services is usually the user's responsibility, where the consumer defines the secure way for communication. But most of the time, we make a mistake in this process and leave this part vulnerable to security breaches. 

For example, we have a solution where EC2 instances need to store static files on S3 storage. In this case, ideally we should create an EC2 role that has permissions to access specific S3 buckets, but most people put the access key and secret key into a text file on the EC2 instances, which is not recommended. This is because if the VM gets compromised, the whole account is at risk if the stored keys belong to the root account.
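
A quick way to catch keys that have been left in files is to scan for strings shaped like AWS access key IDs. The sketch below assumes the commonly documented format of a four-letter prefix (AKIA for long-term keys, ASIA for temporary ones) followed by 16 uppercase letters or digits; it is a heuristic rather than a complete secret scanner, and the sample text uses the placeholder key from AWS's own documentation.

```python
import re

# Assumed format: AKIA/ASIA prefix followed by 16 uppercase alphanumerics.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_access_key_ids(text: str) -> list:
    """Return every substring that looks like an AWS access key ID."""
    return ACCESS_KEY_ID.findall(text)


sample_config = """
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
region = us-east-1
"""

print(find_access_key_ids(sample_config))
```

Running a check like this over instance images or repositories before deployment helps confirm that credentials come from the attached role and not from files on disk.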

Similarly, we must use MFA for console access and should not use the root account to access the console. However, in real life, most users do the opposite: they access the console using the root credentials and do not use MFA.

For audit purposes, IAM events should be logged in CloudTrail.

In OpenStack we also have identity management to define user access. As in the case of the AWS service, here also we define users, groups, and roles. Identity management in OpenStack provides you with the Role-Based Access Control (RBAC) and ACLs.

Note

OpenStack identity management does not provide a method to control an unsuccessful login attempt. If a brute force attack happens, it won't be able to control it. So here, for prevention, you can use external authentication services, which can control the number of failed attempts to log in.
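
The kind of control an external authentication service adds can be sketched as a sliding-window counter of failed logins. The class below is a hypothetical illustration, not part of OpenStack or any particular product; the threshold and window length are assumptions.

```python
import time
from collections import defaultdict, deque


class LoginThrottle:
    """Lock a user out after too many failed logins within a time window."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # username -> failure timestamps

    def _prune(self, attempts: deque, now: float) -> None:
        # Drop failures that are older than the window.
        while attempts and now - attempts[0] > self.window_seconds:
            attempts.popleft()

    def record_failure(self, username: str, now: float = None) -> None:
        now = time.time() if now is None else now
        attempts = self._failures[username]
        attempts.append(now)
        self._prune(attempts, now)

    def is_locked(self, username: str, now: float = None) -> bool:
        now = time.time() if now is None else now
        attempts = self._failures[username]
        self._prune(attempts, now)
        return len(attempts) >= self.max_failures


throttle = LoginThrottle(max_failures=3, window_seconds=60)
for second in (0, 1, 2):
    throttle.record_failure("eve", now=second)
print(throttle.is_locked("eve", now=3))    # three failures inside the window
print(throttle.is_locked("eve", now=120))  # the window has expired
```

The explicit now parameter exists only to make the behavior easy to demonstrate; in normal use the current time is taken from the system clock.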

Storage and data level 

Storage and data level security is very important. Recently, there have been many reported security breaches, such as the Verizon data leak, in which data on S3 was left publicly accessible. The same happened to Accenture, where a server exposed data to the public. These incidents resulted from not implementing a security policy at the storage and data level. In the cloud, we have the following types of storage:

  • Volume storage: This type of storage is used as block storage, which can be attached to a VM as a disk or partition. To secure the data, we can use OS-based encryption or an HSM. For data protection, we can configure RAID as well. For example, in AWS we have Elastic Block Store (EBS), which provides an encryption facility and whose volumes can be combined into a RAID array at the OS level.
  • Object storage: This type of storage is used to store static content, such as images and documents. Here, we can define encryption and ACLs to ensure the security of data. Many cloud providers already keep multiple copies of object storage data to ensure safety. For example, in AWS we have S3, which stores data redundantly across multiple devices in multiple Availability Zones.
  • Database storage: This is the type of storage that we use to store our database. In AWS, we have RDS. To ensure data security, we must ensure that encryption is enabled and also that only authorized users have access.

In general terms, we define data security in storage in two parts:

  • Data at rest: For data at rest security, we enable encryption using Key Management Service (KMS) or HSM. Here, we can enable encryption at the storage level. All the aforementioned examples of security for storage are for data at rest encryption.
  • Data in transit: For data in transit, we must define the secure channel to maintain the integrity of data. For this, we use SSL/TLS while communicating with the external service or users. From a management perspective, we always prefer to use a secure VPN tunnel.
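
For data in transit, a client can refuse to talk over outdated protocol versions and can insist on certificate validation. Here is a minimal sketch using Python's standard ssl module; choosing TLS 1.2 as the minimum version is an assumption, and your compliance requirements may call for a stricter setting.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate and hostname checks."""
    # create_default_context() enables certificate verification and
    # hostname checking for client connections.
    context = ssl.create_default_context()
    # Assumed policy: reject anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

A context built this way can be passed to standard library clients, for example through the context argument of urllib.request.urlopen.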

Application access level

Application access is one of the most important areas of concern in terms of security. Here, we have our data and information in transit. We must secure this data in transit using a secure channel, such as SSL/TLS. Apart from this, if our application is a web application, we must ensure availability. We have heard about cases of DDoS attacks, SQL injections, and so on. There are always bad guys who work in the dark to steal your important data. To prevent this, we must put preventive measures in place, such as a web application firewall (WAF), and deploy our infrastructure in such a way that it can handle a DDoS attack. Security groups should allow traffic on specific ports and from specific sources only. For example, if we have a web application that runs with SSL on port 443, make sure that only port 443 is open for public access. Network ACLs should also be configured to allow only legitimate traffic.
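
The rule that only port 443 should be open to the public can be checked automatically. The sketch below audits a list of inbound rules and reports any rule that exposes other ports to the whole internet. The rule dictionaries use a simplified, hypothetical shape (cidr, from_port, to_port), not the format returned by any cloud API.

```python
import ipaddress

ANYWHERE = ipaddress.ip_network("0.0.0.0/0")
ALLOWED_PUBLIC_PORTS = {443}  # assumed policy: only HTTPS is public


def risky_rules(rules: list) -> list:
    """Return the rules that open ports other than the allowed ones to everyone."""
    findings = []
    for rule in rules:
        source = ipaddress.ip_network(rule["cidr"])
        ports = range(rule["from_port"], rule["to_port"] + 1)
        exposes_other_ports = any(p not in ALLOWED_PUBLIC_PORTS for p in ports)
        if source == ANYWHERE and exposes_other_ports:
            findings.append(rule)
    return findings


# Hypothetical inbound rules for a web server.
inbound = [
    {"cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443},   # public HTTPS
    {"cidr": "0.0.0.0/0", "from_port": 22, "to_port": 22},     # SSH from anywhere
    {"cidr": "10.0.2.0/24", "from_port": 22, "to_port": 22},   # SSH from one subnet
]

for rule in risky_rules(inbound):
    print("Too permissive:", rule)
```

With the sample rules, only the SSH rule that is open to 0.0.0.0/0 is reported; the HTTPS rule and the subnet-restricted SSH rule pass.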

We can also use WAF to stop malicious traffic and prevent DDoS attacks. WAF also helps to apply rules on your websites for accessibility. You can also manage the traffic on the basis of geographical locations.

If your application uses a Content Delivery Network (CDN) to make your site perform faster, you must define security at the CDN level. The CDN caches copies of static content at its edge locations after fetching them from an origin. So you must define security for file access at both the origin level and the CDN level.

For APIs, we must ensure that the API is accessible only to authorized users with key-based authentication and that it is accessible over SSL/TLS only.

Internet-based applications are more prone to DDoS and brute force attacks, in which a large amount of illegitimate traffic hits your application and makes it unavailable. For online businesses, a DDoS attack can be critical, as the application's unavailability will essentially halt the revenue stream.

To tackle these situations, we can use a global DNS service such as Route 53, which can handle a traffic burst. The application must be deployed in HA with autoscaling running under the load balancer so that, if the peak comes, it should autoscale the resource to handle the traffic. 

There is also a chance that your VM gets compromised and starts broadcasting the packet. To eliminate this situation, we must do the security hardening of the virtual machine and enable monitoring so that, if any such adverse situation comes about, you will get an alert to take appropriate action.

Most of the time, we secure our environment externally, but what about the internal users? This case is very common in a private cloud or hybrid cloud environment. So, we must watch the user activity, the number of sessions, and the kind of transactions taking place. For this you can check the load balancer logs, application server logs, and user access, or you can use any monitoring tool that can display real-time logs in a meaningful way. Here we can utilize the Elasticsearch, Logstash, and Kibana (ELK) stack, which gives very interactive dashboards and graphs.

Network level

When we are moving to the cloud or opting for the cloud, network security is of the utmost importance. On the cloud, we can define policies at the firewall level to allow and deny traffic. In AWS, we use VPC to define the network. In the VPC, we must create separate public, private, and management subnets. For SSH or RDP access, we must have either a jump server or a bastion host, which adds one more layer of security. The route table should be properly defined. We must define and configure network ACLs to control the incoming and outgoing packets. In security groups, open only the required ports and clearly specify the source. Do not open all the ports to the public.
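The advice on ports and sources can be turned into a simple audit. The following Python sketch flags ingress rules that are open to the whole internet and either expose every port or expose an administrative port such as SSH (22) or RDP (3389). The `IngressRule` structure and the function name are assumptions made for this example and are not an AWS API; the rules would have to be exported from your environment first.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class IngressRule:
    protocol: str     # "tcp", "udp", or "-1" for all protocols
    from_port: int
    to_port: int
    source_cidr: str  # for example "10.0.1.0/24" or "0.0.0.0/0"


def find_risky_rules(rules, admin_ports=(22, 3389)):
    """Return rules that are world-open and expose all ports or an admin port."""
    risky = []
    for rule in rules:
        network = ipaddress.ip_network(rule.source_cidr, strict=False)
        # A prefix length of 0 means "any address" (0.0.0.0/0 or ::/0).
        open_to_world = network.prefixlen == 0
        all_ports = rule.protocol == "-1" or (
            rule.from_port == 0 and rule.to_port == 65535
        )
        exposes_admin = any(
            rule.from_port <= port <= rule.to_port for port in admin_ports
        )
        if open_to_world and (all_ports or exposes_admin):
            risky.append(rule)
    return risky
```

With this check, a rule allowing port 443 from anywhere passes, while port 22 from `0.0.0.0/0` is reported; SSH limited to the bastion subnet is not.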

For private subnet VMs, we can use the NAT service to enable internet access.

If you need to meet a specific compliance, you can use IPS and IDS to make the environment more secure.

To access resources from a management perspective, we should use a VPN connection. There are different types of VPN connections offered by AWS.

For a private and secure connection, we can use a Direct Connect connection between the customer site and AWS.

In OpenStack, we must understand how the workflow process for the tenant instance creation needs to be mapped to security domains. There are a few services that directly communicate with neutron and these services must be mapped to security domains, as follows:

  • OpenStack dashboard: Public and management
  • OpenStack identity: Management
  • OpenStack compute node: Management and guest
  • OpenStack network node: Management, guest, and possibly public, depending upon the neutron plugin in use
  • SDN services node: Management, guest, and possibly public, depending upon the product used

To isolate sensitive data communication between neutron and other OpenStack core services, we configure communication channels to only allow communication over an isolated management network.

We must restrict the neutron API connection to a specific interface by specifying the details in the neutron configuration file.

Likewise, we must define the incoming and outgoing traffic using security groups.

When using flat networking, we cannot assume that projects that share the same layer 2 network (or broadcast domain) are fully isolated from each other. These projects may be vulnerable to ARP spoofing, risking the possibility of man-in-the-middle attacks.

To prevent this, we must enable prevent_arp_spoofing in the Open vSwitch configuration file.

Logging and monitoring level

Logging and monitoring is a very important aspect of any IT infrastructure. Here we get granular details about all the events performed in the infrastructure at each level. Logging and monitoring is a bit complex in the cloud. In logs, we cannot always filter on the basis of IP because IPs are allocated dynamically. A situation can arise where an IP that earlier represented the x virtual machine now represents the y virtual machine.
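One way to cope with dynamic IP allocation is to keep an allocation history and resolve an IP to a VM as of the log entry's timestamp. The sketch below shows the idea in Python; the `ALLOCATIONS` table, its dates, and the VM names are invented for illustration, and in practice the history would come from your cloud provider's inventory or audit data.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical allocation history: for each IP, a time-sorted list of
# (allocation start, VM identifier) pairs.
ALLOCATIONS = {
    "172.31.2.240": [
        (datetime(2017, 10, 20, 0, 0), "vm-x"),
        (datetime(2017, 10, 23, 6, 0), "vm-y"),
    ],
}


def vm_for_ip(ip: str, when: datetime):
    """Return the VM that held `ip` at time `when`, or None if unknown."""
    history = ALLOCATIONS.get(ip, [])
    starts = [start for start, _ in history]
    # Find the last allocation that began at or before `when`.
    index = bisect_right(starts, when) - 1
    return history[index][1] if index >= 0 else None
```

The same IP therefore maps to different VMs depending on when the event was logged, which is why filtering on IP alone can mislead.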

Apart from this, the cloud comprises different services. We must ensure the activity logging at each service. 

In AWS, we can use CloudTrail to log all the activity for each service, and we can either store these logs in an S3 bucket or forward them to CloudWatch Logs.

Recently, access logs enabled at the load balancer helped us to identify illegitimate traffic. Consider that we are running a financial application in an HA and autoscaling environment. Over the last few days, we had seen a peak in resource utilization. As the application is configured with autoscaling, its performance was not affected. But when we investigated the issue, we found that an attacker was probing our application:

2017-10-23T00:12:54.164535Z ASP-SaaS-Prod-ELB 90.63.223.128:46838 172.31.2.240:80 0.000038 0.001246 0.000057 404 404 0 0 "HEAD http://X.X.X.X:80/mysql/admin/ HTTP/1.1" "Mozilla/5.0 Jorgee" - -
2017-10-23T00:12:54.294395Z ASP-SaaS-Prod-ELB 90.63.223.128:46838 172.31.1.37:80 0.000069 0.000936 0.000051 404 404 0 0 "HEAD http://X.X.X.X:80/mysql/dbadmin/ HTTP/1.1" "Mozilla/5.0 Jorgee" - -
2017-10-23T00:12:54.423798Z ASP-SaaS-Prod-ELB 90.63.223.128:46838 172.31.2.240:80 0.000051 0.001275 0.000052 404 404 0 0 "HEAD http://X.X.X.X:80/mysql/sqlmanager/ HTTP/1.1" "Mozilla/5.0 Jorgee" - -
2017-10-23T00:12:54.553557Z ASP-SaaS-Prod-ELB 90.63.223.128:46838 172.31.1.37:80 0.000047 0.000982 0.000062 404 404 0 0 "HEAD http://X.X.X.X:80/mysql/mysqlmanager/ HTTP/1.1" "Mozilla/5.0 Jorgee" - -
2017-10-23T00:12:54.682829Z ASP-SaaS-Prod-ELB 90.63.223.128:46838 172.31.2.240:80 0.000076 0.00103 0.000065 404 404 0 0 "HEAD http://X.X.X.X:80/phpmyadmin/ HTTP/1.1" "Mozilla/5.0 Jorgee" - -

In the aforementioned logs, you can see that the attacker is coming from IP 90.63.223.128.

The attacker is trying to break into the application by requesting different URLs or passing different headers.
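This kind of probing can be spotted automatically by counting 404 responses per client IP. The Python sketch below assumes the field order seen in the sample log lines (client address in the third field, load balancer status code in the eighth); the function names and the threshold value are choices made for this example.

```python
import shlex
from collections import Counter

# Two of the log lines shown earlier, reused as sample input.
SAMPLE_LINES = [
    '2017-10-23T00:12:54.164535Z ASP-SaaS-Prod-ELB 90.63.223.128:46838 '
    '172.31.2.240:80 0.000038 0.001246 0.000057 404 404 0 0 '
    '"HEAD http://X.X.X.X:80/mysql/admin/ HTTP/1.1" "Mozilla/5.0 Jorgee" - -',
    '2017-10-23T00:12:54.294395Z ASP-SaaS-Prod-ELB 90.63.223.128:46838 '
    '172.31.1.37:80 0.000069 0.000936 0.000051 404 404 0 0 '
    '"HEAD http://X.X.X.X:80/mysql/dbadmin/ HTTP/1.1" "Mozilla/5.0 Jorgee" - -',
]


def count_not_found_by_client(log_lines):
    """Count 404 responses per client IP."""
    counts = Counter()
    for line in log_lines:
        # shlex keeps the quoted request and user-agent fields intact.
        fields = shlex.split(line)
        if len(fields) < 13:
            continue  # skip blank or truncated lines
        client_ip = fields[2].rsplit(":", 1)[0]
        if fields[7] == "404":
            counts[client_ip] += 1
    return counts


def suspicious_clients(log_lines, threshold=5):
    """Return client IPs with at least `threshold` 404 responses."""
    counts = count_not_found_by_client(log_lines)
    return [ip for ip, hits in counts.items() if hits >= threshold]
```

Calling `suspicious_clients(SAMPLE_LINES, threshold=2)` flags 90.63.223.128. A sensible threshold depends on your normal traffic, since legitimate users also hit missing pages occasionally.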

To prevent this, we enabled WAF and blocked all such traffic from the outside world. You can also make WAF learn about this malicious traffic so that, whenever such a request arrives, WAF rejects the packet instead of letting it through.

In monitoring, you must define metrics and alarms, which help you to take preventive action. If anything goes against your expectations, you get an alarm and can take appropriate action to mitigate the risk:

Alarm Details:
- Name:                      awsrds-dspdb-CPU-Utilization
- Description:                
- State Change:              OK -> ALARM
- Reason for State Change:   Threshold Crossed: 1 datapoint [51.605 (24/10/17 07:02:00)] was greater than or equal to the threshold (50.0).
- Timestamp:                 Tuesday 24 October, 2017 07:07:55 UTC
- AWS Account:               XXXXXXX
Threshold:
- The alarm is in the ALARM state when the metric is GreaterThanOrEqualToThreshold 50.0 for 300 seconds.
Monitored Metric:
- MetricNamespace:           AWS/RDS
- MetricName:                CPUUtilization
- Dimensions:                [DBInstanceIdentifier = aspdb]
- Period:                    300 seconds
- Statistic:                 Average
- Unit:                      not specified

In the preceding example, we have defined an alarm on CPU utilization at the RDS level. We got this alert when CPU utilization crossed the 50% threshold while still below 70%. As soon as we got the alert, we started investigating what caused the CPU utilization.
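The threshold rule in the alarm details (average CPU utilization greater than or equal to 50.0 for one 300-second period) can be expressed in a few lines. This Python sketch is a simplified model of that rule, not the monitoring service's actual evaluation logic; the function name and the `periods_required` parameter are assumptions for this example.

```python
def alarm_state(datapoints, threshold=50.0, periods_required=1):
    """Evaluate a GreaterThanOrEqualToThreshold rule.

    `datapoints` holds one averaged value per period, oldest first.
    The alarm fires when the most recent `periods_required` values
    are all at or above `threshold`.
    """
    if periods_required < 1 or len(datapoints) < periods_required:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods_required:]
    return "ALARM" if all(value >= threshold for value in recent) else "OK"
```

Feeding in the datapoint from the alert (51.605) moves the state from OK to ALARM, matching the state change reported in the notification.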

Now, let's see the summarized security risk and preventive action at different levels in the cloud:

  • Hypervisor level: In the cloud, our VMs run on shared resources. A single host may run both the x and y VMs. If the x VM is compromised, there is a risk that the y VM gets compromised as well. Resource isolation normally prevents this, but what if the attacker gets access to the host? So, we must apply the required security patches on the hypervisor and ensure that all the security parameters are configured at the VM level. Such compromises mostly happen when the underlying security parameters have been disabled, which is most common in the private cloud. At the hypervisor level, we also segregate the traffic at the vSwitch level, where we must have at least management, guest, and storage traffic running on different VLANs.
  • Network level: The network is the backbone of the cloud. If the network is compromised, it can bring down the entire cloud. The most common attacks on the network are DDoS, network eavesdropping, illegal intrusion, and so on. To secure the network, we must define the following:
    • Isolation of traffic (management, storage, and guest)
    • ACL for network traffic
    • Ingress and egress rules must be clearly defined
    • IDS and IPS must be enabled to control the intrusion
    • Antivirus and antispam engines should be enabled to scan the packets
    • Network monitoring must be configured to track the traffic
  • Storage level: Storage is also a critical component of the cloud, as it is where we store our critical data. Here, we face the risk of data loss, data tampering, and data theft. At the storage level, we must ensure the following to maintain the security and integrity of data:
    • All the data at rest must be encrypted
    • Backup must be provisioned 
    • If possible, enable data replication to mitigate the risk of hardware failure
    • User roles and data access policy must be defined 
    • A DLP mechanism should be enabled
    • All data transactions should happen over encrypted channels
    • Access logs should be enabled 
  • VM level: At the VM level, we can have the risk of password compromise, virus infection, and exploited vulnerabilities. To mitigate this, we must ensure the following:
    • OS-level security patches must be deployed from time to time
    • Compromised VMs must be stopped instantly 
    • Backup should be provisioned using continuous data protection (CDP) or using a snapshot
    • Antivirus and antispam agents should be installed
    • User access should be clearly defined
    • If possible, define key-based authentication instead of passwords 
    • The OS must be hardened, and the OS-level firewall and security rules should also be enabled
    • Log management and monitoring must be enabled
  • User level: User identity and access is critical for every cloud. We must clearly define the users, groups, roles, and access policy. This is the basis of cloud security, because it is where we authorize users to work with the infrastructure and services. If identity and access are not clearly defined, it can lead to a disaster at any time. To ensure security, we must define the following:
    • Users, groups, roles, and access policies
    • Enable MFA for user authentication
    • The password policy and access key must be defined
    • Make sure that the users are not accessing the cloud using the root account
    • Logs must be enabled for audit purposes
  • Application level: Once your application is hosted and open to public access, the real risks to the availability and accessibility of the service arise. Here, you will face DDoS, SQL injection, man-in-the-middle attacks, cross-site scripting, and so on. To prevent such attacks, we must use the following:
    • Scalable DNS 
    • Load balancer
    • Provision autoscaling 
    • SSL
    • WAF
    • IAM policies and roles for users
  • Compliance: If you have to meet compliance standards such as ISO 27001, PCI, and HIPAA, then you must follow their guidelines and design the solutions accordingly. We will read about compliance in the last chapter and learn how to meet these requirements.

Note

While designing the solution, always think that you are designing for failure. Identify all the single points of failure and find appropriate solutions for them. Also, while designing the solution for the cloud, always consider security, reliability, performance, and cost efficiency, as these factors have a huge impact on your solution as well as organization.

 

Summary


Now that we have a fair idea about cloud computing concepts and the basics of cloud security, in the next chapter, we will study automation and discuss how it helps to implement security in the cloud.

 

About the Author

  • Prashant Priyam

    Prashant Priyam is an astute professional with a great deal of experience in cloud technologies, specifically requirement analysis, solution architecture, design, and delivery. He also has experience in cloud services and solutions, cloud consultancy and deployment, and data center services.
