You Kids! Get Off My Lawn!

At the risk of sounding all “back in my day,” I’ve been working with AWS services since about 2009, at first in testing and development, and later in many production environments. Back then, AWS recommended that companies use multiple accounts for their environments, but the only real tool it provided to facilitate that was consolidated billing. Many companies built their own tools to manage multiple accounts, with varying degrees of success.

Over the years, AWS has introduced a number of services, like AWS Organizations and AWS SSO, that made incremental gains. Then, during its 2018 re:Invent show, AWS announced a new service, Control Tower, intended to provide comprehensive scaffolding for running multiple-account environments. As with many AWS services, the first release was somewhat incomplete, but showed signs of being extremely useful in the future.

Building a New AWS Landing Zone

In March 2020, I joined Cribl after having run an Infrastructure team that was transforming to build and run workloads in AWS, hot on the heels of passing my AWS Certified Solutions Architect – Professional test. In startups, we all wear many hats, so I immediately became the “AWS Expert,” and set about revamping our AWS environment. I revisited Control Tower, and realized that it was ready for prime time, at least given our needs. 

If you’re not familiar with Control Tower, it starts out by creating a “Landing Zone” based on AWS best practices – it makes the account where you run it the “master,” sets that account at the “top” of an AWS Organization, and then creates a restricted organizational unit called “Core” and two new accounts within it:

  • Audit – this account is intended to be where Auditors live. They can use resources in this account to audit the other accounts, the organization, etc., without having to make significant changes in other accounts.
  • Log Archive – this account is intended to be the central repository of all logging in the environment.

In addition to creating accounts (and VPCs, via the Account Factory capability), Control Tower’s main selling points include the “guardrails” it creates in the accounts, in the form of AWS Config rules and Service Control Policies (SCPs). These guardrails help keep the accounts in line with best practices, but some guardrails can prove challenging.

The Challenges

At Cribl, one of the bigger challenges we’ve seen in the Control Tower environment is that the guardrails for the Log Archive account make it impossible to use that account as a single repository of all logging information: they disable the ability to put bucket policies on S3 buckets within the account, leaving S3 ACLs (Access Control Lists) as the only access mechanism. This is also a “permanent” guardrail, meaning that once it’s applied, AWS doesn’t allow you to un-apply it.

While S3 ACLs can do a decent job securing a bucket, they are not very flexible, making it hard to have “standard” policies that can cover your whole organization. For example, we wanted to have a single bucket that any LogStream instance in any one of our accounts could write to. This can be done easily with a bucket policy like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket name>",
        "arn:aws:s3:::<bucket name>/*"
      ],
      "Condition": {
        "StringEquals": {
          "aws:PrincipalOrgID": "<organization id>"
        }
      }
    }
  ]
}

This policy grants four S3 permissions (GetObject, PutObject, ListBucket, and GetBucketLocation) on the bucket and its objects to any Principal that is part of the organization specified in the Condition. It is simple and “future proof,” in that as we add new accounts to the organization, they automatically pick up the same permissions. I could not find (nor could AWS support) a way to accomplish the same thing with ACLs.
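If you manage more than one bucket like this, it can help to generate the policy rather than hand-edit the placeholders. The following is only a minimal sketch of that idea in Python: the function name build_org_bucket_policy and the example bucket name and organization ID are made up for illustration, and nothing here is how we actually deploy the policy.

```python
import json


def build_org_bucket_policy(bucket_name: str, org_id: str) -> dict:
    """Build the organization-wide bucket policy shown above as a dict.

    bucket_name and org_id fill the <bucket name> and <organization id>
    placeholders. The function name is hypothetical.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Any principal, narrowed by the organization condition below.
                "Principal": "*",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                ],
                # The bucket itself, plus every object inside it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {
                    "StringEquals": {"aws:PrincipalOrgID": org_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Both values are invented placeholders, not real identifiers.
    policy = build_org_bucket_policy("example-central-logs", "o-exampleorgid")
    print(json.dumps(policy, indent=2))
```

The resulting JSON string is what you would paste into the console, or pass as the Policy argument to boto3’s put_bucket_policy call if you script your bucket setup.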

As another example, the docs for sending Elastic Load Balancer logs to S3 specify the need to have a bucket policy on the bucket that enables access to the local account, as well as AWS’ log delivery service. This really can *not* be done without bucket policies. 

So, instead of having a single account that received all of our logs, we decided to use two. The Log Archive account is used for all of the Control Tower–related logs (including CloudTrail). Then we created a “Core Services” OU, and a “Core Services” account that holds buckets that need to be more permissive (this is also where the majority of our IT services live).

At Cribl, we make pretty extensive use of S3 buckets in our monitoring approach. We use our own product, Cribl LogStream, to facilitate that use. We tend to minimize our use of CloudWatch for logging, instead having AWS services deliver logs to S3 buckets wherever we can. (We pull all CloudWatch logs into a central bucket, to allow us to retain very little in each account’s CloudWatch logs instance.) We use LogStream to read from S3, and we process/filter/enrich in-flight, archiving to a long-term archive bucket.

We have a pretty simple guideline for which account we’d use for a logging bucket: If we can provide the access needed via an S3 ACL alone, go ahead and put it in the Log Archive account. Otherwise, use the Core Services account. 
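That guideline is small enough to write down as code. Here is a rough sketch, assuming the only input is whether an S3 ACL alone can grant the access the bucket needs; the function name and the example bucket names are hypothetical.

```python
LOG_ARCHIVE = "Log Archive"
CORE_SERVICES = "Core Services"


def choose_logging_account(acl_is_sufficient: bool) -> str:
    """Pick the account for a new logging bucket.

    If an S3 ACL alone can grant the required access, the bucket can live
    in the Log Archive account. Anything that needs a bucket policy goes
    to the Core Services account instead.
    """
    return LOG_ARCHIVE if acl_is_sufficient else CORE_SERVICES


if __name__ == "__main__":
    # Hypothetical buckets, mapped to whether an ACL alone is enough.
    buckets = {
        "control-tower-cloudtrail": True,   # Control Tower-managed logs
        "elb-access-logs": False,           # needs a bucket policy
        "org-wide-logstream": False,        # needs the organization-wide policy
    }
    for name, acl_ok in buckets.items():
        print(f"{name}: {choose_logging_account(acl_ok)}")
```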

There is no perfect approach for managing large-scale, multiple-account AWS setups. Every organization will have different needs and constraints. Control Tower, especially when combined with AWS SSO for access management, provides a powerful set of tools that will continue to improve over time. In future blog posts, I’ll cover some of the other areas where we’ve run into challenges, and how we’ve addressed them.

Resources

While I was setting up our environment, I found a number of great resources – some via AWS support, some via friends and colleagues, and some through brute-force Google searches. Here are a few that I found especially useful: