
Reducing Data Ingestion with Cribl

Optimize data ingestion and storage costs

Reduce and optimize data

IT and Security data often contains null values, duplicate fields, and low-value information. This is especially true of metrics, events, logs, and traces (MELT). By gaining control of this data, you can save money on licensing and infrastructure, and enhance the performance of your analytics tools.

In this demo, we’ll show you how Cribl Stream's powerful functions optimize your data workflows.
You'll learn how to:

  • remove duplicate fields

  • filter out null values and low-value events

  • dynamically sample data

  • convert log data into metrics for significant volume reduction

  • retain a complete copy for compliance.

What’s all that mean? It means you can get the right data to the right spot, in the right format, right when you need it. Right?

Ready to start?

New to Cribl? Click the 'Tell me more' button below to explore the Cribl suite of products and discover how you can use all of them, or just a few, to build a powerful Data Engine tailored for IT and Security.

Not so new? Or just impatient? Click 'Start Demo' and let's reduce some logs!

Simple data routing

In our instance we’re using a mix of Syslog and datagen sources to send security events, SNMP data, and Apache error logs. We’re routing optimized event streams to our SIEM via Webhook, while keeping a full-fidelity copy in Cribl Lake for compliance purposes.
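
Under the hood, routing like this is just an ordered list of routes. Here's a minimal sketch (the route names, pipeline IDs, and destination IDs are invented for illustration, not the demo's actual config):

    # Hypothetical routes config (YAML) -- evaluated in order, top to bottom
    routes:
      - name: compliance_copy
        filter: "true"                 # match everything
        pipeline: passthru             # no reduction on this copy
        output: cribl_lake:compliance  # invented Lake destination ID
        final: false                   # keep evaluating later routes
      - name: siem
        filter: "true"
        pipeline: optimize_security    # the reduction pipeline in this demo
        output: webhook:siem           # invented Webhook destination ID
        final: true                    # stop here

Setting final: false on the compliance route is what lets the same events continue on to the SIEM route afterward.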

A pipeline for your data

Take note of the pipelines in our configuration. Pipelines are a sequence of functions designed to reduce, transform, and enrich your data.

Let’s jump into this pipeline to explore how these functions optimize the data before it's sent to our SIEM.

Drop it like a bad habit

The Drop function lets you filter and remove specific events from your data stream. In this case, we're using it to discard SNMP traps and low-severity Syslog events.
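
As a rough sketch, a Drop function is just a filter expression; everything it matches is discarded. The field names and severity cutoff below are assumptions, not the demo's exact config:

    # Hypothetical Drop function (YAML) -- events matching the filter are discarded
    - id: drop
      filter: "sourcetype=='snmp' || (sourcetype=='syslog' && severity >= 6)"
      # Syslog severity 6 (informational) and 7 (debug) are the low-severity levels
      conf: {}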

Limit sent fields

Next, we use an Eval function to keep only the small subset of fields we'll actually use in our tools.
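
Configured as code, that Eval might look roughly like this (the field list is an assumption; in Cribl's Eval, the keep list takes precedence over the remove wildcard):

    # Hypothetical Eval function (YAML) -- keep a small allowlist of fields
    - id: eval
      conf:
        keep:
          - _time
          - host
          - source
          - severity
          - message
        remove:
          - "*"   # everything not in the keep list is removed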

Now, let’s see what happened.

Review the results

Check out those numbers! That's a lot less data flowing into our tools, which cuts license costs and helps them perform better.

But what about our other pipeline? Let’s look.

Filter events

This pipeline handles flow logs. We're primarily concerned with getting REJECT action events into our SIEM, so we're dropping ACCEPT events from internal addresses.
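
The filter for that might look something like this (the field names follow the common VPC flow log schema, and the internal address ranges are assumptions):

    # Hypothetical Drop function for flow logs -- illustrative only
    - id: drop
      filter: "action=='ACCEPT' && (srcaddr.startsWith('10.') || srcaddr.startsWith('192.168.'))"
      # REJECT events never match, so they always pass through to the SIEM
      conf: {}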

Create aggregations

Another way to reduce data is to aggregate it. Set a time window, and the Aggregations function emits just the summary statistics you specify, greatly reducing the number of events sent to your tools.
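
Here's a sketch of an Aggregations function (the window, statistics, and group-by fields are invented for illustration):

    # Hypothetical Aggregations function (YAML) -- many events in, few summaries out
    - id: aggregation
      conf:
        timeWindow: 60s
        aggregations:
          - count().as(events)
          - sum(bytes).as(total_bytes)
        groupbys:
          - action
          - srcaddr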

See the results!

How much reduction? Look at the results. Nearly 30% fewer events, with next to no data lost. Impressive!

Still with us? Let's look at one more option for reducing data.

Review null values

In our sample data, we can see that this event contains several fields with null values, along with a multiValueHeaders object that includes duplicative data in a slightly different format. None of these fields are necessary to retain in our tools.
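
For context, the event looks something like this hypothetical, trimmed example (invented for illustration; it follows the common AWS Lambda proxy-event shape). Note how multiValueHeaders duplicates headers, while queryStringParameters and body carry nothing but nulls:

    {
      "path": "/login",
      "httpMethod": "GET",
      "headers": { "User-Agent": "curl/8.4.0" },
      "multiValueHeaders": { "User-Agent": ["curl/8.4.0"] },
      "queryStringParameters": null,
      "body": null
    }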

Parser

Let's add a Parser function filtered to a specific source type. Parser extracts event data into fields, making it easier to work with.

Filtering by the Sourcetype

We'll only parse and transform events matching sourcetype=='lambda', isolating the events we're modifying in flight.

Reserialize

The Parser function also lets you reserialize the event after you've performed operations on its fields.

Before reserializing, we’ll remove any duplicate multiValueHeaders, their associated fields, and any null values to ensure the data is clean.
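
Putting the last few steps together, the Parser function might look roughly like this sketch. The conf keys are my best approximation of the UI fields, and the null-filter expression key in particular is an assumption:

    # Hypothetical Parser function (YAML) -- parse JSON, clean, reserialize
    - id: serde
      filter: "sourcetype=='lambda'"
      conf:
        mode: reserialize
        type: json
        srcField: _raw
        remove:
          - "multiValueHeaders*"            # drop the duplicative headers object
        fieldFilterExpr: "value !== null"   # keep only non-null fields (key name assumed)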

Review our optimization

Once saved, we can see the changes made in our sample. It’s another success!

All this data dropping feels great. But what happens if we ever need that data back? Lucky for us, we've been storing it in Cribl Lake, where we can use Cribl Search to explore it.

Searching the lake

There's our data, in its full, unadulterated glory. Shall we take a moment to revel in our success?

Hey, what's this?

Looks suspicious. We better get this into our SIEM for a deeper investigation.

But how?

Send data back through our pipeline

Don't worry, Cribl Search has us covered. With the Send operator we can send data back through Cribl Stream at any time. Let’s take advantage of all the routing, transforming, and enriching Stream provides.
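
In query form, the idea is something like the snippet below (a sketch: the dataset name and where clause are invented, and the bare send stands in for however you point results at a Stream route in your environment):

    dataset="compliance_lake"        // hypothetical Lake dataset
    | where sourcetype == "lambda"   // narrow to the suspicious events
    | send                           // hand the results back to Cribl Stream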

Never be caught without the right data.

And just like that, our data has been sent back through Cribl Stream and into our SIEM destination.

The Data Engine to the rescue

What did we just see?

For one, it's very easy to reduce and optimize your data in Cribl Stream.

Second, we always know that the data is never truly lost with a Cribl Lake compliance copy and the power of Cribl Search.

It’s just another example of how the Data Engine for IT and Security can solve some of your biggest data problems.

Feel free to schedule a demo or try Cribl by clicking either option below.


See Cribl

See a custom demo tailored to your tools and data challenges, with one of our team.

Try Cribl

Get hands-on with a Sandbox or guided Cloud Trial.

Free Cribl

Process up to 1TB/day, no license required.