How to Manage Data Collection Across AWS Accounts with Stream

Last edited: July 27, 2021

Here at Cribl, we use several AWS accounts to accomplish different tasks: accounts for labs and testing, accounts for software development, and accounts for our newly launched Cribl.Cloud service. If you’re looking for a guide on how to manage data collection across AWS accounts, keep reading as we break down the steps you need to take.

While we built our account strategy from the ground up using the AWS Control Tower framework (see our Logging in a Multi-Account AWS Environment blog post), some organizations that we’ve worked with use a legacy account structure that doesn’t consolidate logs to a single account. This creates a challenge in collecting data or logs from peer accounts, as permission boundaries now need to be crossed. Whether you need to collect or write data on another account, the AssumeRole permission allows for cross-account access without the need to generate static IAM keys. Note this permission as you begin setting up your systems to manage data collection across AWS accounts.

A common use case is a central logging account that can consume logs from the organization's other accounts but cannot perform any other actions there. Let’s say I have two AWS accounts, A and B, and I want Account A to be able to access resources in Account B. To do that, I create a role in Account B with a policy that grants access to the target resources. I then specify, in Account B, that I trust Account A to assume this role. The diagram below illustrates how the AssumeRole action permits the Trusted Account (A) access to resources in the Trusting Account (B).

Here’s how LogStream would work with AssumeRole permissions inside AWS to facilitate managing data collection across AWS accounts:

1. The LogStream Worker has an EC2 instance role attached.
2. The IAM role in Account A permits the EC2 instance to assume the role in Account B (and Account B trusts Account A).
3. Temporary IAM credentials are returned to the EC2 instance.
4. LogStream uses the temporary IAM credentials to access the resources in Account B.
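
The flow above can be sketched in a few lines of Python. This is a minimal illustration, not LogStream's implementation: the function name `assume_role_credentials` and the default session name are hypothetical, and the `sts_client` argument is assumed to behave like a boto3 STS client (for example, `boto3.client("sts")` created on the EC2 instance that has the Account A role attached).

```python
def assume_role_credentials(sts_client, role_arn, external_id,
                            session_name="logstream-cross-account"):
    """Exchange the caller's identity for temporary credentials in another account.

    sts_client is assumed to expose assume_role() the way a boto3 STS client does.
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,              # the role to assume in Account B
        RoleSessionName=session_name,  # hypothetical label identifying this session
        ExternalId=external_id,        # must match the trust policy condition
    )
    creds = response["Credentials"]
    # Keys are named so the dict can be passed straight to boto3.client(...).
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

With boto3 installed, the returned dict could be used as `boto3.client("s3", **creds)` to read from Account B. The credentials are temporary, so a long-running process would need to call the function again before they expire.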

Configure IAM AssumeRole Permissions to Manage Data Collection Across AWS Accounts

In Account A, we build a policy that allows nothing except assuming the role inside Account B, so users cannot access any resources that they don’t need to see. We can also revoke this trust relationship at any time without having to worry about an account still holding keys, and therefore access to the data.

In our example, we want to access VPC Flow logs inside an S3 bucket in Account B (ID 222222222222) from Account A (ID 111111111111). We’ll start building the two policies in Account B, and then move to Account A.

Account B Configuration

To start, we build a policy inside Account B that grants access to the S3 bucket (vpc-flow-logs-for-cribl) with the least privileges required for LogStream. This is the policy that changes depending on what you need to accomplish (e.g., reading from an S3 bucket, writing to Kinesis Streams, etc.).

{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Action": "s3:GetObject",
     "Resource": "arn:aws:s3:::vpc-flow-logs-for-cribl/*"
   },
   {
     "Effect": "Allow",
     "Action": "s3:ListBucket",
     "Resource": "arn:aws:s3:::vpc-flow-logs-for-cribl"
   }
 ]
}
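
If you manage several log buckets, a policy like the one above can be generated instead of hand-edited. The sketch below is only an illustration: the helper name `s3_read_policy` is hypothetical. It also makes one detail explicit: s3:GetObject is an object-level action, so its resource is the bucket ARN followed by `/*`, while s3:ListBucket is a bucket-level action, so its resource is the bucket ARN itself.

```python
import json


def s3_read_policy(bucket_name):
    """Return a read-only policy for a single S3 bucket as a dict."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Object-level action: applies to the keys inside the bucket.
            {"Effect": "Allow", "Action": "s3:GetObject",
             "Resource": f"{bucket_arn}/*"},
            # Bucket-level action: applies to the bucket itself.
            {"Effect": "Allow", "Action": "s3:ListBucket",
             "Resource": bucket_arn},
        ],
    }


def policy_json(bucket_name):
    """Serialize the policy so it can be pasted into the IAM console."""
    return json.dumps(s3_read_policy(bucket_name), indent=1)
```

Calling `policy_json("vpc-flow-logs-for-cribl")` produces a document equivalent to the policy shown above.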

We need to attach a Trust Relationship to Account B’s IAM role to permit the AssumeRole action from Account A:

{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Principal": {
       "AWS": "arn:aws:iam::111111111111:role/account-a-logstream-assumerole-role"
     },
     "Action": "sts:AssumeRole",
     "Condition": {
       "StringEquals": {
         "sts:ExternalId": "cribl-s3cre3t"
       }
     }
   }
 ]
}

Why an External ID Condition in the Trust Policy?

It is important to configure an AWS External ID, especially if you have third parties accessing your AWS accounts. The External ID protects against the confused deputy problem, in which a party that lacks permission tricks a more privileged intermediary into accessing your resources on its behalf. The External ID is not a password or secret, but it should still be protected from accidental sharing.
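
To make the effect of the condition concrete, here is a toy model of how the trust policy above treats an AssumeRole request. It is a deliberate simplification and not how IAM is implemented: the function name `trust_allows` is hypothetical, and it handles only a single AWS principal string and a StringEquals condition on sts:ExternalId, ignoring explicit denies, wildcards, and every other condition type.

```python
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111111111111:role/account-a-logstream-assumerole-role"
            },
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": "cribl-s3cre3t"}},
        }
    ],
}


def trust_allows(trust_policy, caller_arn, external_id=None):
    """Toy check: does an Allow statement trust caller_arn with this External ID?"""
    statements = trust_policy["Statement"]
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow" or stmt.get("Action") != "sts:AssumeRole":
            continue
        if stmt.get("Principal", {}).get("AWS") != caller_arn:
            continue
        required = (
            stmt.get("Condition", {}).get("StringEquals", {}).get("sts:ExternalId")
        )
        # With no condition, the principal match is enough; otherwise the
        # caller must present exactly the expected External ID.
        if required is None or required == external_id:
            return True
    return False
```

In this model, the trusted role is rejected if it omits the External ID or presents a different one, which is the property that keeps a request made on behalf of someone else from succeeding.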

Account A Configuration

In Account A, configure a new IAM role with the following policy. For our example, we only want the role to be able to use the AssumeRole action, but you can add additional statements to meet your needs:

{
 "Version": "2012-10-17",
 "Statement": {
   "Effect": "Allow",
   "Action": "sts:AssumeRole",
   "Resource": "arn:aws:iam::222222222222:role/account-b-logstream-role-to-assume"
 }
}

Since we need our EC2 instance to be able to assume this role, we will configure the Trust Relationship as follows:

{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Principal": {
       "Service": "ec2.amazonaws.com"
     },
     "Action": "sts:AssumeRole"
   }
 ]
}
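
The two sides of this setup have to reference each other exactly: Account A's policy must name the Account B role as its Resource, and Account B's trust relationship must name the Account A role as its Principal. A mismatch in either ARN is an easy mistake to make, so a small pre-flight check can help. The sketch below is an assumption-laden illustration (the function names are hypothetical, and it compares plain strings only, with no wildcard or multi-principal support).

```python
def _as_list(statement):
    """IAM allows Statement to be one object or a list; normalize to a list."""
    return [statement] if isinstance(statement, dict) else list(statement)


def cross_account_mismatches(account_a_role_arn, account_a_policy,
                             account_b_role_arn, account_b_trust):
    """Return a list of human-readable problems; an empty list means the ARNs line up."""
    problems = []

    # Account A side: the policy must allow assuming the Account B role.
    if not any(
        s.get("Effect") == "Allow"
        and s.get("Action") == "sts:AssumeRole"
        and s.get("Resource") == account_b_role_arn
        for s in _as_list(account_a_policy["Statement"])
    ):
        problems.append("Account A policy does not allow assuming " + account_b_role_arn)

    # Account B side: the trust relationship must name the Account A role.
    if not any(
        s.get("Effect") == "Allow"
        and s.get("Action") == "sts:AssumeRole"
        and s.get("Principal", {}).get("AWS") == account_a_role_arn
        for s in _as_list(account_b_trust["Statement"])
    ):
        problems.append("Account B trust policy does not trust " + account_a_role_arn)

    return problems
```

Feeding it the policies from this post, loaded with `json.loads`, should return an empty list; any other result points at the ARN to double-check before touching LogStream.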

Configure LogStream to Manage Data Collection Across AWS Accounts

Now that our AssumeRole policies have been built, we can configure LogStream to assume the correct role to access the resources we need. Configure your Source, Collector, or Destination with the appropriate AssumeRole ARN and External ID. While the screenshot below shows an S3 Collector specifically, all AWS Sources and Destinations support AssumeRole.

[Screenshot: Collector configuration with AssumeRole]

And there you have it! Now you can manage data collection across AWS accounts in a secure manner. To get started with LogStream today, you can sign up here. For more info, you can visit our documentation on cross-account data collection with AWS.

Cribl, the Data Engine for IT and Security, empowers organizations to transform their data strategy. Customers use Cribl’s suite of products to collect, process, route, and analyze all IT and security data, delivering the flexibility, choice, and control required to adapt to their ever-changing needs.

We offer free training, certifications, and a free tier across our products. Our community Slack features Cribl engineers, partners, and customers who can answer your questions as you get started and continue to build and evolve. We also offer a variety of hands-on Sandboxes for those interested in how companies globally leverage our products for their data challenges.