
Scale Your Cribl Pipelines With This One Weird Trick

Written by Sondra Russell

September 23, 2021

Designing one pipeline for each source type is a good practice, but as you expand your Cribl footprint, you may find that some source types are so similar that it would be useful to send them all through a single pipeline, and enable or disable certain functions conditionally.

In fact, you may find that you are working with so many different source types that it would be even nicer to manage which source types trigger which functions using a mechanism outside of Cribl LogStream completely: say, a lookup table.

If that’s you, you’re in luck, because Cribl LogStream can reference lookup tables for more than just enrichment purposes. As we walk through this example, you’ll see how to leverage the Lookup function to create an internal field, and then use that field to filter subsequent functions.

Make the Lookup Table

This use case is designed to map index/sourcetype combinations to one or many masking functions in the given pipeline. First, let’s make the lookup table.

In plain terms, the rows of the lookup table say:

  • Index/sourcetype combo “syslog, sourcetypeb” should have the masks “ssn” and “auth_token” applied.
  • Index/sourcetype combo “weblog, sourcetypea” should have the masks “auth_token” and “bearer_token” applied.

Note that in the lookup file:

  • You may only have ONE line for each index/sourcetype combo.
  • The third column of that line is a pipe-delimited list of applicable masks.

In other words, when updating the lookup table, users may NOT simply append a new row to add masks for an existing index/sourcetype combination. Instead, they must find the existing row for that combination and modify only its third column.
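
Because the one-row-per-combination rule is easy to break when several people edit the file, a quick check outside LogStream can help. The following is a minimal sketch in plain JavaScript, not part of LogStream itself; the column names (index, sourcetype, masks) and the helper name findDuplicateCombos are assumptions made for illustration, and the sample rows simply mirror the two combinations described above.

```javascript
// Hypothetical lookup file contents; the header names are assumptions.
const lookupCsv = [
  'index,sourcetype,masks',
  'syslog,sourcetypeb,ssn|auth_token',
  'weblog,sourcetypea,auth_token|bearer_token',
].join('\n');

// Return every index/sourcetype combination that appears on more than one row.
function findDuplicateCombos(csvText) {
  const [, ...rows] = csvText.trim().split('\n'); // skip the header row
  const seen = new Set();
  const duplicates = new Set();
  for (const row of rows) {
    const [index, sourcetype] = row.split(',').map((cell) => cell.trim());
    const key = `${index},${sourcetype}`;
    if (seen.has(key)) duplicates.add(key);
    seen.add(key);
  }
  return [...duplicates];
}

console.log(findDuplicateCombos(lookupCsv)); // no duplicates in the sample file
```

If someone appended a second "syslog,sourcetypeb" row instead of editing the third column of the existing one, the helper would report that combination.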

Add a Lookup Function in the Pipeline

Next, add a Lookup function at the top of the pipeline. This function identifies the index/sourcetype combination for the given event, and then creates a new internal field called “__masks”. The “__masks” field will be used in the filters for the subsequent functions.

The screenshot below shows exactly how to configure the Lookup function, and the result you can expect.
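
To make the effect of this step concrete, here is a rough plain-JavaScript model of what the lookup does to an event. It is only a sketch of the idea, not how LogStream implements the Lookup function; the maskTable map, the addMasksField helper, and the sample event are hypothetical, and the field name __masks follows the filter expression used later in this post.

```javascript
// Hypothetical in-memory copy of the lookup table, keyed by "index,sourcetype".
const maskTable = new Map([
  ['syslog,sourcetypeb', 'ssn|auth_token'],
  ['weblog,sourcetypea', 'auth_token|bearer_token'],
]);

// Mimic the lookup step: copy the matching masks value onto the event as __masks.
// Events with no matching row are returned unchanged.
function addMasksField(event) {
  const masks = maskTable.get(`${event.index},${event.sourcetype}`);
  return masks === undefined ? event : { ...event, __masks: masks };
}

const enriched = addMasksField({
  index: 'syslog',
  sourcetype: 'sourcetypeb',
  _raw: 'example event',
});
console.log(enriched.__masks); // ssn|auth_token
```

Note that in this model an event whose index/sourcetype combination is not in the table ends up with no __masks field at all, which matters when writing the filters.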

Plug In the Filters

Finally, you can apply a filter to each subsequent function to determine which functions should apply to which matching events.

To translate a filter such as __masks.split('|').indexOf('ssn') > -1 into English:

__masks.split('|') = split the __masks field into an array of values, using the pipe character as the delimiter.
indexOf('ssn') > -1 = if the value 'ssn' is not one of the values in that array, indexOf returns -1. If it is in the array, indexOf returns its position, which is always greater than -1.
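
The same logic can be tried out in plain JavaScript. This is a sketch under assumptions: shouldApplyMask is a hypothetical helper, and the guard for a missing __masks field is an addition that is not part of the filter described above. It covers events that matched no lookup row, where calling split on an undefined value would throw in plain JavaScript.

```javascript
// Plain-JavaScript equivalent of the filter; in a pipeline, __masks is an event field.
function shouldApplyMask(event, maskName) {
  // Guard against events that matched no lookup row and so have no __masks field.
  const masks = (event.__masks || '').split('|');
  return masks.indexOf(maskName) > -1;
}

const sample = { __masks: 'ssn|auth_token' };
console.log(shouldApplyMask(sample, 'ssn'));          // true
console.log(shouldApplyMask(sample, 'bearer_token')); // false
console.log(shouldApplyMask({}, 'ssn'));              // false
```

Splitting before calling indexOf is what makes the match exact. Calling indexOf directly on the string 'auth_token|bearer_token' would also find partial names such as 'token', whereas the array version only matches whole mask names.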

Bonus Points

By driving the filters from Cribl LogStream’s Lookup function, you can create a flexible pipeline that scales with your organization, and makes it simple to add new source types without modifying the pipeline directly.

As a bonus, leveraging these filters improves the pipeline’s performance by enabling LogStream to bypass relatively expensive functions, like Mask and Encrypt, for events that don’t need them.

For more interesting and creative ways to improve the flexibility and performance of Cribl LogStream, consider joining us on the Cribl Community. If you want to try LogStream for yourself, launch a Sandbox in the cloud with ready-made data.
