Thou Shall Pass! Troubleshooting Common Amazon S3 Errors in Cribl Stream

Last edited: February 19, 2024

Explore common errors in AWS S3 data lakes with Cribl Stream, along with their potential causes and solutions.

Data lakes are everywhere! With data volumes increasing, cost-effective storage is becoming a greater need. With Cribl Stream, you can route data to an Amazon S3 data lake and replay or search that data at rest. But nothing is more frustrating than something not working and those blasted error logs that pop up. This blog highlights some common errors for your S3 sources and destinations, along with their potential root causes and solutions. This is not an exhaustive list, but it covers some of the more common issues you may encounter. That being said, each environment is different, so use these as a general guideline.

Authentication Options

You have two main authentication options for setting up S3 sources and destinations. You can leverage Assume Role, in which Cribl workers adopt an AWS role with permissions and policies attached. Alternatively, you can authenticate with an access key/secret key combination (also subject to restrictions and policies).

You may choose one over the other for various reasons, but Assume Role is the preferred method for cross-account access between Cribl.Cloud and your AWS account. It allows you to gain access without creating long-lived IAM keys. For anything not running in AWS (your on-premises workers and workers in other cloud providers fall into this category), the Access Key/Secret Key option lets you create a static set of user-associated IAM credentials for authentication.
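To make the Assume Role option more concrete, here is a minimal Python sketch of the two IAM policy documents such a setup typically involves: a trust policy that lets a principal assume the role, and a permissions policy scoped to one bucket. The account ID, role name, external ID, and bucket name are placeholders invented for illustration, and the helper function names are hypothetical. The S3 actions listed are a plausible starting set, not a statement of what your source or destination requires.

```python
import json


def build_trust_policy(trusted_principal_arn, external_id=None):
    """Build an IAM trust policy that lets one principal assume the role."""
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": trusted_principal_arn},
        "Action": "sts:AssumeRole",
    }
    # An external ID is optional; when given, the caller must present it.
    if external_id:
        statement["Condition"] = {
            "StringEquals": {"sts:ExternalId": external_id}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


def build_s3_permissions_policy(bucket_name):
    """Build a permissions policy for listing, reading, and writing one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Listing applies to the bucket itself.
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            # Reads and writes apply to the objects inside the bucket.
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values (assumptions): substitute your own ARN, ID, and bucket.
    trust = build_trust_policy(
        "arn:aws:iam::111122223333:role/example-worker-role",
        external_id="example-external-id",
    )
    permissions = build_s3_permissions_policy("example-data-lake-bucket")
    print(json.dumps(trust, indent=2))
    print(json.dumps(permissions, indent=2))
```

Note the split between the bucket ARN (for listing) and the `/*` object ARN (for reads and writes). Attaching object-level actions to the bucket ARN alone is an easy mistake to make and a frequent source of access-denied errors.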

For more information on cross-account access and configuration, visit this link.

Where to Look for a Problem

Whether you are troubleshooting an AWS S3 source or destination, you can start by navigating to the source or destination you suspect has an issue (Figure 4). If you are troubleshooting a collector, you will instead want to navigate to the Job Inspector (Figures 1-3) for the latest collector run (Monitoring > System > Job Inspector > Click on the relevant Job ID).

Within the source or destination pop-out, the “Logs” tab (Figure 5) includes all errors, warnings, and other messages, which you can search, and the “Status” tab (Figure 6) gives you a high-level view of errors at a worker level. Both will be handy in diagnosing your issue. Within the collector job pop-out, the “Logs” tab (Figure 2) is also relevant, as is the “Task Errors” tab (Figure 3), where you can drill deeper into the specific errors for the collection tasks at hand. The screenshots below show each of these pages.

Figure 1: Job inspector – Job stats page

Figure 2: Job inspector – Job logs page

Figure 3: Job inspector – Job Task errors page

Figure 4: S3 Destination – Configuration page

Figure 5: S3 Destination – Logs

Figure 6: S3 Destination – Status page

Embedded Log Hints

With the latest minor release (4.4), Cribl has integrated more hints into S3 sources and destinations to help with troubleshooting. When looking at the errors in the “Status” tab and logs, you will now find a “hint” field that offers a bit more context around the error message you are receiving. See the example below for a “Bucket does not exist” error and its corresponding hints. This lets you speed up your troubleshooting and focus on some of the more common fixes first.

Common Errors

From our experience working with customers sending and receiving data from S3, we have compiled a list of common error messages you may receive from one of your S3 sources or destinations. Below is the list of these common errors, why they may occur, and what a potential resolution might be. Once again, this is not an exhaustive list of either errors or resolutions, but it offers a starting point. Your environment may differ, and you must account for its intricacies in your troubleshooting.