The 8 Rules of Cloud Security
When moving infrastructure and applications to the cloud, it’s tempting to abdicate responsibility for security to the cloud vendor. This is the wrong attitude to take, since we are ultimately responsible for the security of our systems and data. That said, when using public cloud computing we do need to understand where our responsibility actually lies. Regardless of which cloud provider you use, there are some common things to consider. I call them the ‘8 Rules of Cloud Security’.
#1 – The Rule of Least Privilege
Only give users and applications enough access to do their job. This is a keystone of any good information security program, and it has never been more important than when working in the cloud. Consider the role-based access control (RBAC) system the cloud provider offers, and understand the permissions granted by the built-in roles for their service.
When considering access control, don’t just think about the users. Also consider groups and service accounts (i.e., service principals), which can also have roles and/or permissions applied. This is extremely useful if you are granting access to different services across your cloud systems. Think about the cascading effects of applying RBAC at the subscription level, and how that trickles down into the resource groups and ultimately to the individual resources. If your cloud provider offers a way to test exactly what permissions a user or service principal will have on an end resource, review it to ensure higher-level RBAC controls are not altering the way you THINK access is being granted.
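To make the cascading effect concrete, here is a minimal sketch of scope inheritance. The scope paths, role names, principals and the `effective_roles` helper are all hypothetical and do not mirror any particular provider's API; the only assumption is that an assignment made at a parent scope also applies to everything beneath it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleAssignment:
    principal: str  # user, group, or service principal (hypothetical names)
    role: str       # e.g. "Reader" or "Contributor"
    scope: str      # path-like scope, e.g. "/sub-1/rg-web"


def applies_to(assignment_scope: str, resource: str) -> bool:
    """An assignment applies to its own scope and to everything beneath it."""
    scope = assignment_scope.rstrip("/")
    return resource == scope or resource.startswith(scope + "/")


def effective_roles(assignments, principal, resource):
    """Collect every role the principal holds on the resource, inherited ones included."""
    return {
        a.role
        for a in assignments
        if a.principal == principal and applies_to(a.scope, resource)
    }


assignments = [
    RoleAssignment("deploy-bot", "Contributor", "/sub-1"),             # broad, subscription-wide
    RoleAssignment("deploy-bot", "Reader", "/sub-1/rg-web/storage1"),  # what was intended
    RoleAssignment("alice", "Reader", "/sub-1/rg-web"),
]

print(sorted(effective_roles(assignments, "deploy-bot", "/sub-1/rg-web/storage1")))
print(sorted(effective_roles(assignments, "alice", "/sub-1/rg-db/sql1")))
```

The first call reports both Contributor and Reader for `deploy-bot`, even though only Reader was assigned on the storage account itself. That is exactly the kind of surprise a provider's effective-permissions check is meant to catch.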
#2 – The Rule of Change Management
One of the biggest concerns with cloud computing is how easy it is to change the configuration of a resource. This can be done through the vendor’s cloud portal, through the command line or through programmatic SDKs and APIs. As more businesses move towards Infrastructure as Code (IaC), and with the desire to give DevOps teams as much power as IT admins, it is difficult to maintain control of the configuration. The resulting divergence is sometimes called ‘configuration drift’, and it’s important you get a handle on it and document the changes as they occur.
If the cloud vendor offers the ability to track changes, turn it on. It usually isn’t on by default, and it really should be. If they do NOT offer this option in your subscription plan, look to see if they offer a change feed or support event triggers for configuration changes. If they do, use those to trigger change notifications, and use serverless computing like Azure Functions or AWS Lambda to push change log information to a non-repudiable long-term logging system. Make sure you record the time, the activity and the identity of the person or process that made the change. If possible, collect the delta of the metadata so you can reconstruct a resource if you need to revert a change.
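As a rough illustration of what such a change record might hold, the sketch below keeps an append-only list in which every entry carries the time, actor, activity and delta, plus a hash that chains it to the previous entry. The field names and helper functions are assumptions made for this example. Hash chaining only makes tampering evident; real non-repudiation would also need signed entries and write-once storage.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(entry: dict) -> str:
    """Hash an entry's canonical JSON form."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode("utf-8")).hexdigest()


def append_change(log: list, actor: str, activity: str, resource: str, delta: dict) -> dict:
    """Append a change record that is chained to the previous record's hash."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "activity": activity,
        "resource": resource,
        "delta": delta,  # shape assumed here: {"setting": {"old": ..., "new": ...}}
        "prev_hash": log[-1]["hash"] if log else "",
    }
    entry["hash"] = _digest(entry)  # computed before the "hash" key exists
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Return True only if every record matches its hash and links to its predecessor."""
    prev = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


def revert_values(entry: dict) -> dict:
    """Recover the settings as they were before the recorded change."""
    return {name: change["old"] for name, change in entry["delta"].items()}


log = []
append_change(log, "alice@example.com", "update", "/sub-1/rg-web/storage1",
              {"https_only": {"old": True, "new": False}})
append_change(log, "deploy-bot", "update", "/sub-1/rg-web/storage1",
              {"min_tls": {"old": "1.2", "new": "1.0"}})
print(verify_chain(log))
print(revert_values(log[0]))
```

Editing any stored field afterwards, such as the actor of the first entry, makes `verify_chain` return False, and `revert_values` gives back the old settings needed to undo a change.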
#3 – The Rule of Trust
You must understand the implications of extending trust to anyone or anything within an organization. The rule of least privilege should prevail. Although you may trust your system administrator today, what happens when he or she holds a grudge towards you tomorrow? Can they bring your organization down to its knees?
Consider reading the exploits of the BOFH for funny examples of just how many admins feel/think.
Although the BOFH is fictional, I can speak from experience when I say many of these tactics HAVE been tried in production environments by bored admins. "Trust, but verify" is no longer enough.
Although Alice may be your best employee, will she be next year? You never know. Almost 80% of all breaches happen WITHIN the organization, usually because someone is trusted in situations where that trust isn’t needed. The internal threat is a significant one, and one you must address with the rule of trust.
Always require your cloud administrators to use strong authentication to prove their identity when logging in. With so much access provided to an administrator, it’s vital that they use two-factor authentication, so that a leaked or stolen credential cannot be used on its own to gain far-reaching access to the cloud infrastructure.
#4 – The Rule of Information Protection
Limit access to information to only those people and resources that absolutely need it. When possible, limit access to the information resource to trusted sources only. Use the Rule of Least Privilege along with the Rule of Trust to ensure that this rule can be respected when accessing information. Some examples of technical safeguards that can assist in meeting this rule's objectives include using the operating system's access control system (ACLs, permissions, etc.) inside VMs, applying file/folder/disk encryption and using network access controls (firewall rules, authentication, etc.).
All data at rest should be encrypted. If you are using a managed database, ensure you have enabled transparent data encryption (TDE). Because it may not have been enabled by default when you signed up for the service, your data may not be encrypted. Make sure it is.
All storage accounts should be encrypted, and data in transit should be encrypted by default as well. When possible, force the use of TLS/SSL when accessing the data.
Finally, try to move all cryptographic keys, passwords and secrets stored in configuration data into safer key management systems. Services like Azure Key Vault, AWS Key Management Service and AWS Secrets Manager offer the ability to separate keys & secrets from configuration to better safeguard such information. They also help with key escrow and rotation so you don’t have to fret about it as much. Use them.
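On the client side, forcing encrypted transport can be as small as the sketch below, which uses Python's standard `ssl` module. The TLS 1.2 floor is my assumption of a reasonable minimum rather than anything a provider requires, and `strict_client_context` is just an illustrative name.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Build a client TLS context that verifies certificates and refuses old protocol versions."""
    context = ssl.create_default_context()            # certificate and hostname checks are on by default
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed floor: refuse SSLv3, TLS 1.0 and TLS 1.1
    return context


context = strict_client_context()
print(context.minimum_version.name)
```

The resulting context can be handed to standard library clients, for example `urllib.request.urlopen(url, context=context)` or `http.client.HTTPSConnection(host, context=context)`.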
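One way to find the secrets that still live in configuration is to scan for settings whose names look sensitive but whose values are literals. The sketch below does that for a nested dictionary. The name pattern and the `vault://` reference prefix are conventions invented for this example, not something any of the services above define.

```python
import re

# Setting names that suggest a secret (an assumed, deliberately incomplete list).
SECRET_KEY_PATTERN = re.compile(
    r"(password|secret|token|api[_-]?key|connection[_-]?string)", re.IGNORECASE
)
REFERENCE_PREFIX = "vault://"  # hypothetical marker meaning "the value lives in the key management system"


def inline_secrets(config: dict, path: str = "") -> list:
    """Return dotted paths of settings that look like secrets but hold a literal value."""
    findings = []
    for key, value in config.items():
        full = f"{path}.{key}" if path else key
        if isinstance(value, dict):
            findings.extend(inline_secrets(value, full))
        elif (
            isinstance(value, str)
            and SECRET_KEY_PATTERN.search(key)
            and not value.startswith(REFERENCE_PREFIX)
        ):
            findings.append(full)
    return findings


config = {
    "db": {"host": "db.internal", "password": "hunter2"},
    "storage": {"api_key": "vault://storage-api-key"},
}
print(inline_secrets(config))  # ['db.password']
```

Each path it reports is a candidate to move into the key management service and replace with a reference.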
#5 – The Rule of Separation & Isolation
Think in terms of zones. Zoning is the process in which you define and isolate different subjects and objects based on their unique security requirements. For those uninitiated to the terms, a "subject" is a person, place or thing gaining access. An "object" is the person, place or thing the subject is gaining access to. I use the terms generically since when zoning you really could be applying it to anything: a file/blob, a VM, or a separate service or API. You have probably seen the concept of zoning in Internet Explorer, where Microsoft breaks zones down into the Internet, Local Intranet, Trusted Sites and Restricted Sites. This is just one example of how you can break something into zones. Of course the concept of zoning can be applied anywhere, as long as each zone treats security in a different manner.
Although I have seen most people think of zones in a network-centric manner, they don't have to be. Zoning could apply to applications, cloud access and even employee interactions with others as a defense against social engineering tactics.
A zone is a grouping of resources that have a similar security profile. In other words, they have similar risks, trust levels, exposures and/or security needs. For example, an Internet-facing web service will have a different trust and exposure level than an intranet web site. As such, the two should be in different zones. Though you can have umpteen different zones, the most common scenarios involve three:
The trusted (internal) zone
The semi-trusted (DMZ) zone
The untrusted (external) zone
These three zones can apply to almost anything, from network-based services to application programming and even datacenter access layouts.
The trick is separating zones in such a way that we can maintain higher levels of security by protecting resources from zones with lesser security controls. The separation mechanism between zones could be as simple as a network access control list (ACL), a piece of managed code or a security guardrail against the data itself. The goal is to have some degree of control over what happens between the zones, and to have logical communication channels that allow zones to communicate safely where appropriate.
Theoretically it would be nice to live in isolation and never care about other zones. In reality, some zones will at times need to talk to others. If we didn't allow that, you couldn't reach the untrusted zone of the Internet from the trusted zone of your internal network hosted in the cloud; the connection would have to be severed. The trick is to understand the risks of exposure when communicating between zones, and to ensure that a filtering safeguard sits in between to determine what is, and more importantly what is NOT, allowed to pass through. As an example, there is a much higher level of risk in allowing a direct inbound connection from an untrusted zone to a trusted zone. This is why we have firewalls and network security groups on our perimeters. The risks are significantly reduced if we instead land an untrusted inbound connection in a semi-trusted DMZ.
See how this all fits together? Zones give us the ability to reduce risk by applying technical safeguards in a logical manner through grouped resources. How we communicate between a trusted and a semi-trusted zone would differ from how we communicate between an untrusted and a trusted zone. And we can make better security decisions by understanding that.
I have been using a six-step process to apply the zoning concept to the decision-making process for infosec in the cloud. The following procedures can help in that process:
Identify any instance where an untrusted or less trusted object comes in contact with a trusted, valuable or more sensitive object.
Determine the direction of communication that is needed. Ask yourself, "Is it possible to use an outbound communication model (trusted going to less trusted), or do I need to have the untrusted object initiate the communication?" Where possible, ALWAYS try to have the more trusted zone control the communication.
Determine where it would be possible to separate the trusted object into two components; one that handles sensitive information and the other that acts as a relay or middle entity in the transaction. This is why proxies can work so well in security.
Determine what forms of communication need to take place between zones and block everything else. Understand the different levels of risk exposure and determine if it’s necessary to perform the tasks. As an example, why use clear text telnet if you can use secure shell (SSH)?
Place as many security controls between each of the components as is reasonably possible, remembering what assets you are trying to protect. A $50,000 advanced network threat intelligence system subscription monitoring your cloud resources doesn't make a lot of sense to protect your $500 collection of Michael Bolton MP3s stored in blob storage.
Document the reasoning, supporting data and conclusions in this decision-making process. Keep this document for reference and to simplify the decision-making process for similar situations in the future.
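The steps on direction of communication and on blocking everything else lend themselves to a small default-deny filter. The sketch below is one possible shape for it. The three zones come from the list earlier in this rule, while the allowed flows, the protocol names and the helper functions are illustrative assumptions rather than a recommended rule set.

```python
from enum import IntEnum


class Zone(IntEnum):
    UNTRUSTED = 0     # external
    SEMI_TRUSTED = 1  # DMZ
    TRUSTED = 2       # internal


# Explicit allowlist of (source zone, destination zone, protocol); everything else is blocked.
ALLOWED_FLOWS = {
    (Zone.UNTRUSTED, Zone.SEMI_TRUSTED, "https"),  # public traffic terminates in the DMZ
    (Zone.SEMI_TRUSTED, Zone.TRUSTED, "https"),    # the relay talks to the internal service
    (Zone.TRUSTED, Zone.UNTRUSTED, "https"),       # outbound, initiated by the trusted side
    (Zone.TRUSTED, Zone.SEMI_TRUSTED, "ssh"),      # administration over SSH, never telnet
}


def is_allowed(source: Zone, destination: Zone, protocol: str) -> bool:
    """Default deny: a flow passes only if it is explicitly listed."""
    return (source, destination, protocol.lower()) in ALLOWED_FLOWS


def skips_a_zone(source: Zone, destination: Zone) -> bool:
    """Flag inbound flows that jump more than one trust level, such as external straight to internal."""
    return destination - source > 1


print(is_allowed(Zone.UNTRUSTED, Zone.SEMI_TRUSTED, "https"))  # True
print(is_allowed(Zone.UNTRUSTED, Zone.TRUSTED, "https"))       # False
print(skips_a_zone(Zone.UNTRUSTED, Zone.TRUSTED))              # True
```

Keeping the allowlist as data also helps with the documentation step: the comment beside each flow is the recorded reason it exists.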
#6 – The Rule of the Three-Fold-Process
Security is NOT just about technology implementation. Administrators love to install fancy new whiz-bang things (which gets even easier in the cloud), but typically don't follow through on the entire security management lifecycle. You must include implementation, monitoring AND maintenance to effectively safeguard your resources. You must understand what is being monitored and logged, know when something is wrong and know how to respond to it. You need to keep up to date with what is going on and what your overall security posture is at all times. If you don't, what good was implementing the resource's safeguards in the first place?
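A simple way to keep yourself honest about this is to track, for every safeguard, whether all three parts are actually in place. The sketch below is one hypothetical way to record that. The `Safeguard` fields and the 90-day review window are assumptions for the example, not a standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Safeguard:
    name: str
    implemented: bool
    alert_destination: str  # where monitoring alerts go; empty if nobody is watching
    last_reviewed: date     # last time the control was patched, tuned, or re-tested


def gaps(safeguard: Safeguard, today: date, max_age_days: int = 90) -> list:
    """List which of implementation, monitoring and maintenance are missing for one safeguard."""
    missing = []
    if not safeguard.implemented:
        missing.append("implementation")
    if not safeguard.alert_destination:
        missing.append("monitoring")
    if today - safeguard.last_reviewed > timedelta(days=max_age_days):
        missing.append("maintenance")
    return missing


waf = Safeguard("web application firewall", True, "", date(2024, 1, 1))
print(gaps(waf, today=date(2024, 6, 1)))  # ['monitoring', 'maintenance']
```

A safeguard that was installed once and never watched or revisited shows up here with two of its three parts missing.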
#7 – The Rule of Preventative Action
To effectively defend your environment, you need to proactively assess its security. You need to stay aware of new security risks in the field, and follow the feeds and notifications your cloud vendors provide for you. Regularly test your defenses using vulnerability assessment tools before an attacker does. Compare the results with prescriptive guidance like the Center for Internet Security (CIS) Foundations Benchmarks. Maintain a strong three-fold process and keep your systems up to date with the latest security patches.
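Comparing an environment against prescriptive guidance boils down to checking actual settings against expected ones. The sketch below shows the idea with a tiny baseline. The setting names and expected values are invented for illustration and are not taken from the CIS benchmarks or any other published guidance.

```python
# Illustrative checks only; these are not taken from any published benchmark.
BASELINE = {
    "storage_encryption_enabled": True,
    "https_only": True,
    "public_network_access": False,
    "admin_mfa_required": True,
}


def failed_checks(resource_config: dict, baseline: dict = BASELINE) -> dict:
    """Map each failing setting to (expected, actual); a missing setting counts as a failure."""
    return {
        setting: (expected, resource_config.get(setting))
        for setting, expected in baseline.items()
        if resource_config.get(setting) != expected
    }


storage_account = {
    "storage_encryption_enabled": True,
    "https_only": False,
    "public_network_access": False,
}
print(failed_checks(storage_account))
```

For this sample configuration the result flags `https_only` (expected True, found False) and `admin_mfa_required` (expected True, not set at all). Run on a schedule, a check like this turns the baseline from a document into a regular test.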
#8 – The Rule of Immediate & Proper Response
Long before you are ever breached, you should have an Incident Response plan in place. It has been seen in the past that when an organization responds poorly to an intrusion, it typically does more harm than the attacker did. A rational, well-thought-out response plan can make all the difference. You need to react quickly, document everything and above all STAY CALM. Ensure you have a very clear and widely known chain of command so that the issue can be reported quickly to the right people and get a rapid response. Be discreet (yelling "the sky is falling" is never productive) and follow your plan. Then quickly restore, and move on.
If you follow these eight rules you will be significantly more secure. Technology will fail. Your team will very likely misconfigure your cloud infrastructure in some way. Accept it. With proper policies and procedures in place, though, you significantly reduce the impact that it may have on your organization. You will find that a common theme runs through each of the above rules: if you only followed one rule, let it be The Rule of Least Privilege. Using least privilege significantly reduces the damage that may be caused when exposed to risk. It contains suspect behavior to the smallest set of actions and activities, and maintains the confidentiality, integrity and availability of the rest of the environment. And in the end... that's what we want to accomplish.
Whether it's on-premises or in the cloud, YOU are responsible for the security of your systems and data. Don't let anyone tell you differently.