The Cribl Stream Power Hour, or How I Reduced a Customer’s Splunk License $75,000 in One Hour

Written by Brendan Dalpe

May 14, 2021

Recently, I had the opportunity to work with a customer who wanted to reduce their Splunk license cost. They hoped to expand their use of Splunk, but were constrained by growing data volumes and couldn’t spend more than their existing 500 GB license allowed. Separately, they were feeling another growing pain: the cost of retaining indexed logs for over a year while only searching the last 24 hours of data. That is a really expensive way to store logs. (Read our recent blog, “A Storage Unit for Observability Data,” for more on this.)

After a demo, we quickly agreed to a proof of concept of Cribl Stream focused on their top use cases: reduction, routing, and enrichment of their Palo Alto firewall logs and Windows Security event logs. Setup was seamless: it took roughly 10 minutes to install the Stream software on the master node and bootstrap the worker. Within another 5 minutes, we had configured the syslog source for the Palo Alto firewall data, set up the three routes, and configured Splunk and S3 as the two destinations for the logs.

The first route was configured as an archival route for their Palo Alto log data. Every event was passed through unaltered to S3 for long-term archival storage, separating the system of analysis from the system of retention.
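
To make the routing idea concrete, here is a minimal Python sketch of a route table, not Stream's actual configuration format. It assumes the archival route is non-final (a copy goes to S3 and the event keeps flowing to later routes); the route names, the sourcetype values, and the Route class are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

Event = dict


@dataclass
class Route:
    name: str
    matches: Callable[[Event], bool]  # filter deciding whether the route applies
    destination: str
    final: bool  # assumption: a non-final route delivers a copy and lets evaluation continue


# Hypothetical route table; sourcetype values are illustrative only.
ROUTES = [
    Route("pan_archive", lambda e: e.get("sourcetype") == "pan:traffic", "s3", final=False),
    Route("pan_reduce", lambda e: e.get("sourcetype") == "pan:traffic", "splunk", final=True),
    Route("windows", lambda e: e.get("sourcetype") == "WinEventLog:Security", "splunk", final=True),
]


def route(event: Event) -> list:
    """Return the (route name, destination) pairs an event is delivered to."""
    deliveries = []
    for r in ROUTES:
        if r.matches(event):
            deliveries.append((r.name, r.destination))
            if r.final:
                break  # a final route consumes the event
    return deliveries
```

In this sketch, a Palo Alto event lands in both S3 (unaltered) and Splunk (after reduction), while a Windows event goes only to Splunk.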

The second route took the Palo Alto logs and passed them through Stream’s out-of-the-box palo_alto_traffic pipeline. The magic of this pipeline resides in two functions:

  1. Dropping all log_subtype=='start' events. (Palo Alto firewalls log two events for a connection: the start and the end. The start event may not contain all information about a specific flow, whereas the log_subtype=='end' event provides that information.)
  2. Sampling events. The first filter below samples events where the bytes field is 0, keeping only 1 out of every 5. The second filter samples events where traffic is allowed (from trusted to trusted zones), keeping only 1 out of every 10.

[Image: sampling filters for Palo Alto log data]
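
The two functions described above can be sketched in a few lines of Python. This is an illustration of the logic only, not the palo_alto_traffic pipeline itself: the field names action, src_zone, and dest_zone are assumptions, as is the choice that the first matching sampling filter wins and that sampling is counter-based.

```python
class Sampler:
    """Keep 1 out of every `rate` events, using a simple counter."""

    def __init__(self, rate: int):
        self.rate = rate
        self.seen = 0

    def keep(self) -> bool:
        self.seen += 1
        return (self.seen - 1) % self.rate == 0  # keep the 1st, (rate+1)th, ...


def reduce_traffic(events):
    """Drop 'start' events, then sample low-value 'end' events."""
    zero_bytes = Sampler(5)   # 1 in 5 for events with bytes == 0
    trusted = Sampler(10)     # 1 in 10 for allowed trusted-to-trusted traffic
    for e in events:
        if e.get("log_subtype") == "start":
            continue  # the matching 'end' event carries the full flow details
        if e.get("bytes") == 0:
            if zero_bytes.keep():
                yield e
            continue
        # Hypothetical field names for the "allowed, trusted to trusted" filter.
        if (e.get("action") == "allow"
                and e.get("src_zone") == "trusted"
                and e.get("dest_zone") == "trusted"):
            if trusted.keep():
                yield e
            continue
        yield e  # everything else passes through untouched
```

Events that match neither sampling filter are passed along in full, so only the low-value traffic is thinned out.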


The third and final route was configured to process Windows events from multiple domain controllers. Stream also ships with a wineventlogs pipeline that includes many useful functions for reducing the volume of Windows Event Logs, like trimming the event description!
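
As a rough idea of what trimming the event description can look like, here is a small Python sketch. It is not the wineventlogs pipeline's implementation; it assumes the explanatory text starts at a known marker phrase, and both the marker list and the function name are hypothetical.

```python
# Hypothetical marker phrases that begin the explanatory boilerplate.
BOILERPLATE_MARKERS = ("This event is generated",)


def trim_description(message: str, markers=BOILERPLATE_MARKERS) -> str:
    """Cut the message at the earliest boilerplate marker, if any is present."""
    cut = len(message)
    for marker in markers:
        idx = message.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return message[:cut].rstrip()
```

The structured fields at the top of the event survive, while the static explanation repeated in every event is dropped before indexing.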



The proof is in the results. While the reduction in Windows Event Logs wasn’t as dramatic, the graph below shows the results of the hour’s work: Palo Alto firewall log ingestion dropped from approximately 160 GB/day to roughly 60 GB/day. That’s a 62.5% reduction!
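
For anyone who wants to check the math or plug in their own numbers, here is the arithmetic as a short Python sketch. The function name is just illustrative, and the last line assumes the 500 GB license is measured per day.

```python
def reduction_percent(before_gb: float, after_gb: float) -> float:
    """Percentage of daily ingest removed."""
    return (before_gb - after_gb) * 100 / before_gb


before, after = 160, 60          # approximate Palo Alto ingest, GB/day
saved = before - after           # 100 GB/day no longer indexed
pct = reduction_percent(before, after)

# Assumption: the 500 GB license is a daily ingest limit.
license_share = saved * 100 / 500

print(f"Saved {saved} GB/day ({pct}% of Palo Alto volume, {license_share}% of the license)")
```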

The customer also benefited from applying Stream’s Auto Timestamp function to their Palo Alto firewall logs, which came from devices configured in four different time zones. Cleaning this up standardized all of their logs in UTC, meaning the timestamps were now correct when searching!
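
To show why this matters, here is a minimal Python sketch of normalizing device-local timestamps to UTC. It is not how Auto Timestamp is implemented; the device names, their offsets, and the timestamp format are assumptions, and fixed offsets are used for simplicity, so daylight saving time is ignored.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical firewalls and their fixed UTC offsets (no DST handling).
DEVICE_OFFSETS = {
    "fw-nyc": timezone(timedelta(hours=-5)),
    "fw-lon": timezone.utc,
    "fw-ber": timezone(timedelta(hours=1)),
    "fw-tok": timezone(timedelta(hours=9)),
}


def to_utc(raw_time: str, device: str) -> datetime:
    """Interpret a device-local timestamp and convert it to UTC."""
    local = datetime.strptime(raw_time, "%Y/%m/%d %H:%M:%S")  # assumed log format
    return local.replace(tzinfo=DEVICE_OFFSETS[device]).astimezone(timezone.utc)
```

The same wall-clock string from two firewalls can be hours apart in reality; converting at ingest time means a search over "the last hour" returns the events that actually happened in the last hour.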

[Graph: Splunk log data reduction]


Here’s the takeaway: if you’re reading this and your license size is under 1 TB/day of ingestion, you could do this yourself for free! Yes, Cribl Stream is free for companies up to 1 TB/day, with unlimited destinations. Go home a champion today. Try Stream for free and save 30% or more on your Splunk license costs.

Questions about our technology? We’d love to chat with you.