Over the past year, I’ve noticed some interesting trends in my work with state and local governments. Across my conversations with organizations in this space, there’s a common thread: teams are getting creative about maximizing their limited resources. With budgets either flat or shrinking and operational demands increasing, these teams face tough choices. They’re being asked to maintain or improve services while working with the same, or in some cases, fewer resources than before.
These budget realities force tough decisions. Some organizations must choose between maintaining basic operational visibility and funding critical security tooling. Others are trying to make one tool serve multiple teams – not because it’s the best approach, but because it’s the only financially viable option. I’ve seen this play out in several ways in the field. One example is teams using their firewall logs to do double duty for SOC detections and IT network troubleshooting. It makes sense, given that both teams tend to need this data. But more and more, I’m seeing IT teams who can’t afford their own platform and wind up carving out space in the SOC’s SIEM for their priority use cases.
These resource constraints and compromises are creating real challenges:
There’s a smarter way to handle this – one that doesn’t involve compromising security or breaking the bank. Cribl’s platform helps solve these challenges by:
Let’s start with our log source, which, in this case, is a Fortigate Firewall log. As you can see, there is a wealth of information here that IT simply doesn’t care about or need.
The first thing we’re going to do is make this log easier to work with by parsing out all those lovely key-value pairs nested in our syslog message. The Parser function makes easy work of this: change the Type to K=V Pairs and hit Save.
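To make the parsing step concrete, here’s a rough sketch in plain JavaScript of what K=V extraction does. The log line below is made up but representative of FortiGate-style syslog; the Parser function handles edge cases (quoting, escaping) that this sketch glosses over.

```javascript
// Illustrative only: a simplified version of what Parser's K=V mode produces.
// The raw message is a hypothetical FortiGate-style event body.
const raw = 'devname=fw01 srcip=10.0.0.5 dstip=8.8.8.8 dstport=53 action=deny';

// Split on whitespace, then split each pair on '=' to get discrete fields.
const fields = Object.fromEntries(
  raw.split(/\s+/).map(pair => pair.split('='))
);

console.log(fields.srcip);  // '10.0.0.5'
console.log(fields.action); // 'deny'
```

Once the pairs are discrete fields, every downstream function can reference them by name instead of regex-ing the raw message.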
We now have discrete fields we can easily reference in the functions to come (the Parser created the fields highlighted in green):
Instead of tracking the destination port separately from the IP, let’s concatenate them based on our available fields. We’ll use an Eval function along with a simple JavaScript template literal to get it done:
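The Eval step boils down to a one-line template literal. Assuming the Parser produced fields named dstip and dstport (names vary by source), the expression looks something like this:

```javascript
// Simplified stand-in for an event after the Parser step.
// Field names (dstip, dstport) are assumptions based on typical firewall fields.
const event = { dstip: '8.8.8.8', dstport: '53' };

// The Eval function's value expression: a JavaScript template literal
// that joins IP and port into one trackable field.
event.dst_ip_and_port = `${event.dstip}:${event.dstport}`;

console.log(event.dst_ip_and_port); // '8.8.8.8:53'
```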
Next up is the MVP of our pipeline: Aggregations! In my opinion, this Stream function is underused; it’s amazing for many different use cases. Creating insightful metrics, rolling up metrics, and summarizing are great ways to reduce volume and increase performance. In this case, we’ll do a couple of aggregations over a 10-second tumbling window: count() and list().
This will give us a quick understanding of how many logs we’ve aggregated per new event, and the list will keep track of the new combined dst_ip_and_port field we created above. The other important setting here is our Group by fields. This tells our function what constitutes a unique aggregation across our source logs. We will aggregate based on action and the source IP in this case. This will allow us, for example, to chart and query based on denies from a particular user’s workstation or laptop.
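Here’s a hypothetical sketch of what happens inside one 10-second tumbling window: events sharing the same Group by fields (action and source IP) collapse into a single aggregated event carrying count() and list(). The sample events and field names are illustrative, not actual Cribl internals.

```javascript
// Three raw events landing in the same 10-second window (illustrative data).
const events = [
  { action: 'deny',  srcip: '10.0.0.5', dst_ip_and_port: '8.8.8.8:53'  },
  { action: 'deny',  srcip: '10.0.0.5', dst_ip_and_port: '1.1.1.1:443' },
  { action: 'allow', srcip: '10.0.0.9', dst_ip_and_port: '8.8.8.8:53'  },
];

const windows = {};
for (const e of events) {
  const key = `${e.action}|${e.srcip}`; // the Group by fields define uniqueness
  windows[key] ??= { action: e.action, srcip: e.srcip, count: 0, dst_ip_and_port: [] };
  windows[key].count += 1;                              // count()
  windows[key].dst_ip_and_port.push(e.dst_ip_and_port); // list()
}

// Two aggregated events now replace the three raw ones.
console.log(Object.values(windows));
```

On real firewall volumes, where a chatty host can emit thousands of near-identical denies per window, this is where the big reduction comes from.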
And our resulting log!
One of the best parts of this approach was that we dropped 60% of our logs and shrunk the overall volume by 93%. Given Stream’s shared-nothing architecture, the actual effective reduction is more likely somewhere around 65%. Sprinkle in some compression when writing to Cribl Lake, and you’ve given IT a heap of visibility incredibly cheaply.
For reference, if this had been a 1TB-a-day source, you’d be writing roughly 350GB uncompressed. And if we use a conservative 8:1 compression ratio (this can be much higher depending on the log source), that comes out to roughly $800 annually for storing this data source! How much would you guess full-fidelity logs at 1TB per day would cost sitting in your SIEM?
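The back-of-the-envelope math behind those numbers, assuming the ~65% effective reduction and 8:1 compression from above (actual dollar cost then depends on your Lake or object-storage pricing):

```javascript
// Assumptions: ~65% effective volume reduction, conservative 8:1 compression.
const rawPerDayGB = 1024;                     // 1TB/day source
const reducedGB   = rawPerDayGB * (1 - 0.65); // ~358 GB/day, uncompressed
const storedGB    = reducedGB / 8;            // ~45 GB/day landed after compression
const annualGB    = storedGB * 365;           // ~16 TB/year in cheap storage

console.log(Math.round(reducedGB), Math.round(storedGB), Math.round(annualGB));
```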
While the walkthrough highlighted a dual-purpose use case (similar to Windows event logs), this approach can, of course, also be used to cost-effectively onboard new sources required to monitor IT systems and apps. Putting everything we ran through together, with this approach you get:
Want to see how this works in real life? We’d love to show you – no sales pitch, just practical solutions for the real world of public sector IT.
Cribl, the Data Engine for IT and Security, empowers organizations to transform their data strategy. Customers use Cribl’s suite of products to collect, process, route, and analyze all IT and security data, delivering the flexibility, choice, and control required to adapt to their ever-changing needs.
We offer free training, certifications, and a free tier across our products. Our community Slack features Cribl engineers, partners, and customers who can answer your questions as you get started and continue to build and evolve. We also offer a variety of hands-on Sandboxes for those interested in how companies globally leverage our products for their data challenges.
Experience a full version of Cribl Stream and Cribl Edge in the cloud with pre-made sources and destinations.