How to Augment an Existing Data Lake with Exabeam and Cribl Stream

Written by Thien Huynh

December 6, 2022

Organizations search different data lakes, such as Splunk, QRadar, or Sumo Logic, to name a few. Exabeam (UEBA Advanced Analytics) sits on top of those existing data lakes and pulls specific sources into Exabeam by running continuous queries every few minutes.

Exabeam and Cribl

The image below shows a Splunk query that pulls Windows event logs into Exabeam Advanced Analytics over port 8089. The query is complex. It’s packed with Boolean operators that filter for the right Windows event codes, making it hard to read and even harder to maintain! But this is necessary work for Exabeam to build out its models.

Exabeam and Cribl
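To give a feel for how these filters grow, here is a minimal sketch that generates the Boolean part of such a search from a list of event codes. The function name, the index, and the sourcetype are assumptions made for illustration; this is not the actual query Exabeam runs.

```python
def build_event_code_search(index, sourcetype, event_codes):
    """Return an SPL search string that keeps only the given Windows event codes."""
    if not event_codes:
        raise ValueError("at least one event code is required")
    # Sort and de-duplicate so the generated query is stable and easy to diff.
    codes = sorted(set(event_codes))
    code_filter = " OR ".join(f"EventCode={code}" for code in codes)
    return f'search index={index} sourcetype="{sourcetype}" ({code_filter})'


# Hypothetical index and sourcetype. 4624, 4625, and 4634 are the Windows
# event codes for successful logon, failed logon, and logoff.
query = build_event_code_search(
    "wineventlog", "WinEventLog:Security", [4625, 4624, 4634, 4624]
)
print(query)
```

Every new detection use case adds more codes and more conditions to a string like this, which is why the real queries become so hard to maintain.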

Let’s look at another example. Below is a Splunk query pulling CrowdStrike Falcon Data Replicator (FDR) logs into Exabeam to build models for first-time or abnormal process execution by a user or asset.

At this point, you get the picture. More sources will require more queries with fine-grained filters. More queries mean a slower Splunk and, if you’re using workload-based pricing (SVC), much higher costs.
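A rough back-of-the-envelope calculation shows how quickly the search count adds up. The sketch below assumes one recurring query per source and a five-minute interval; both numbers are illustrative, not taken from a real Exabeam deployment.

```python
MINUTES_PER_DAY = 24 * 60


def scheduled_searches_per_day(num_sources, interval_minutes):
    """Number of recurring searches per day, assuming one query per source."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    runs_per_source = MINUTES_PER_DAY // interval_minutes
    return num_sources * runs_per_source


# Assumed interval of 5 minutes ("every few minutes" in the text above).
for sources in (5, 20, 50):
    print(sources, "sources ->", scheduled_searches_per_day(sources, 5), "searches/day")
```

Under these assumptions, 20 sources already means 5,760 searches a day competing with your analysts' own queries.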

Challenge:

  1. These queries run continuously, putting a lot of stress on your Splunk instance. As a result, your Splunk engineers may see poor performance or timeouts on their own new queries.
  2. Splunk is your single point of failure. If Splunk goes down, Exabeam goes down, causing a huge blind spot for investigation and alerting.
  3. Since Exabeam runs these queries in batches every few minutes, the best visibility you can hope for is near real-time.
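The third point can be made concrete with a small model of polling delay. This is a simplified sketch that assumes polls fire at fixed multiples of the interval and ignores indexing lag; the function name and the 300-second interval are assumptions for illustration.

```python
import math


def batch_delay_seconds(event_time, interval, query_runtime=0):
    """Seconds until an event that arrived at event_time is picked up by a poll.

    Polls are assumed to fire at 0, interval, 2 * interval, and so on; the
    event is seen by the first poll at or after its arrival time.
    """
    next_poll = math.ceil(event_time / interval) * interval
    return next_poll - event_time + query_runtime


interval = 300  # assumed 5-minute polling interval, in seconds

# An event arriving one second after a poll waits almost the full interval.
print(batch_delay_seconds(1, interval))

# Averaged over every arrival second in one interval, the wait is about half of it.
delays = [batch_delay_seconds(t, interval) for t in range(1, interval + 1)]
print(sum(delays) / len(delays))
```

In this model the worst-case wait approaches the full polling interval and the average sits near half of it, before the query's own run time is added.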

New Architecture:

With Cribl Stream, we can rearchitect how we feed Exabeam with two distinct routes. All of your data from the left side of the image will be pushed to Cribl Stream, and Stream will route it to multiple destinations. Let’s see what that looks like.

Exabeam and Cribl

The sources on the left side are the Splunk Universal Forwarder and Syslog. All you need to do is drag the dotted line from each source to each destination, Splunk and Exabeam.

Exabeam and Cribl

Please read my previous blog post, which shows you how to transform your data to match Exabeam parsers.

Solution:

  1. Cribl Stream makes copies of events and routes them to multiple destinations. This helps mitigate risk: if your data lake goes down, Exabeam will still receive logs.
  2. Deploying Stream is easy in any architecture, whether cloud or on-prem, including as a replacement for an intermediate application such as a heavy forwarder.
  3. Data streams directly from the source, so you get real-time detection rather than batches.
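The first point is essentially a fan-out with failure isolation. The sketch below is a conceptual model only, not how Cribl Stream is implemented; the `Destination` class, the `fan_out` function, and the sample event are all hypothetical.

```python
import copy


class Destination:
    """Hypothetical stand-in for a downstream system such as Splunk or Exabeam."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.received = []

    def send(self, event):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        self.received.append(event)


def fan_out(event, destinations):
    """Send an independent copy of the event to each destination.

    Returns the names of destinations that failed, so one outage does not
    stop delivery to the others.
    """
    failed = []
    for dest in destinations:
        try:
            dest.send(copy.deepcopy(event))
        except ConnectionError:
            failed.append(dest.name)
    return failed


splunk = Destination("splunk", healthy=False)  # simulate the data lake being down
exabeam = Destination("exabeam")

event = {"host": "dc01", "EventCode": 4625}
failed = fan_out(event, [splunk, exabeam])
print("failed:", failed)
print("exabeam received:", exabeam.received)
```

Because each destination gets its own copy and failures are handled per destination, the simulated Splunk outage does not keep the event from reaching Exabeam.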

Wrap up

SIEM and XDR vendors have features to forward or pull events from their platforms, but that does not always mean their capabilities will scale to the terabytes or petabytes used for today’s cybersecurity work. And let’s face it: these products weren’t designed to move that much data. On the other hand, moving high-volume data is exactly what Cribl Stream is designed to do.

Coupling your SIEM and XDR solution with Cribl Stream takes load off your security tooling and lets it do what it’s good at: correlation, building threat models, detecting abnormal behavior, and speeding investigations. Leave the data pipeline and routing to Cribl Stream.

If you want to explore SIEM and XDR optimization further, you can reach me in our Community Slack channel where I work with Exabeam and Cribl customers.
