Getting Better Sysmon Data Using Cribl Stream

By Ed Bailey

Last edited: May 5, 2022

System Monitor, better known as Sysmon, is one of my favorite security datasets. The data is crazy detailed and offers a great way to power security detection and response, since it gives cybersecurity teams a roadmap to understand exactly what systems or people are doing on any Windows operating system. The downside is the avalanche of data, which is why observability engineers need tools like Cribl Stream to manage and enrich Sysmon data, making it more useful and more cost-effective.

Standard Sysmon events are very dense XML. They’re hard to read and even harder to parse.

Why Sysmon Matters

Sysmon is a widely used way to instrument Windows for security observability. It uses a Windows system service and device driver that remain resident across system reboots to monitor and log system activity to the Windows event log. It provides detailed information about process creations, network connections, and changes to file creation time. The installation is simple, but the configuration is an art. If you’re new to Sysmon, start with the configs from legends like @SwiftOnSecurity and Olaf Hartong. Both give you a good starting place for getting the right data to your SIEM. The data is super detailed and makes it easy to see everything being done on the host. Sysmon data is an ideal complement to EDR data to make sure you get everything.

Challenges with Sysmon

Sysmon gives security teams visibility that is not otherwise available and, best of all, it is free. But as with everything in life, nothing is truly free:

Sysmon can produce enormous data volume
Security teams can struggle to update the Sysmon config on the endpoint
Sysmon data is typically XML-formatted and can be tough to shape and enrich

Security teams need tools like Cribl Stream that make it fairly easy not only to reduce data volume by dropping unnecessary fields, but also to reshape the data so it is more efficient and more easily searched. Finally, if security teams cannot easily update the config on the endpoint, they can use Cribl Stream to fix issues that would otherwise require Sysmon config updates. We will discuss in detail how to address these challenges with a pipeline using Cribl Stream.

How to Optimize Sysmon With Cribl Stream

I am going to assume you already have your Sysmon data flowing through Cribl Stream and into your analytics platform.

Let’s discuss the key functions in this pipeline. The most important function transforms Windows XML to JSON, which is a much better format for any analytics platform.

The function also removes fields whose value is '0x0' or '-', which tend to have little or no value.
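To make the idea concrete, here is a minimal Python sketch of the same kind of transformation done outside of Cribl Stream. It is not the pipeline's actual configuration. The sample event, its field values, and the helper name `sysmon_xml_to_dict` are made up for illustration, and the sketch assumes the standard Windows event XML namespace.

```python
import json
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML (assumed for this sketch).
NS = "{http://schemas.microsoft.com/win/2004/08/events/event}"

# Placeholder values that usually carry no information.
LOW_VALUE = {"0x0", "-"}

# Made-up sample event, trimmed to a handful of fields.
SAMPLE_XML = r"""
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-Sysmon"/>
    <EventID>1</EventID>
    <Keywords>0x8000000000000000</Keywords>
    <Computer>host01.example.com</Computer>
  </System>
  <EventData>
    <Data Name="Image">C:\Windows\System32\cmd.exe</Data>
    <Data Name="User">EXAMPLE\alice</Data>
    <Data Name="LogonId">0x0</Data>
    <Data Name="FileVersion">-</Data>
  </EventData>
</Event>
"""


def sysmon_xml_to_dict(xml_text):
    """Flatten a Windows event XML string into a dict and drop low-value fields."""
    root = ET.fromstring(xml_text.strip())
    event = {}

    # <System> children: keep elements that have text (attribute-only
    # elements such as <Provider> are skipped in this sketch).
    system = root.find(NS + "System")
    if system is not None:
        for child in system:
            text = (child.text or "").strip()
            if text:
                event[child.tag.replace(NS, "")] = text

    # <EventData> children: <Data Name="X">value</Data> becomes {"X": "value"}.
    for data in root.iter(NS + "Data"):
        name = data.get("Name")
        if name:
            event[name] = (data.text or "").strip()

    # Remove fields whose value is just a placeholder.
    return {k: v for k, v in event.items() if v not in LOW_VALUE}


if __name__ == "__main__":
    print(json.dumps(sysmon_xml_to_dict(SAMPLE_XML), indent=2))
```

Note that `Keywords` survives this step. Only the blanket placeholders are removed here, and fields that are noisy for one specific value are handled separately.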

Another important function is rename. It gives you the ability to update or fix field names to fit Splunk data models or to help align data to a dashboard. You can fix the data in flight, which can be really useful. For example, you can avoid having inexperienced Splunk users edit Data Models in Splunk, or you can shape your data to fit the default parsers from Exabeam.
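As a rough illustration of renaming in flight, the Python sketch below applies a lookup table of old and new field names to an event. The mapping is an assumption made for the example, so check the target names against your own data model or parser. `rename_fields` is a hypothetical helper, not a Cribl Stream API.

```python
# Illustrative mapping from Sysmon field names to the names a downstream
# data model might expect. These targets are assumptions for this sketch.
RENAME_MAP = {
    "Image": "process_path",
    "CommandLine": "process",
    "User": "user",
    "Computer": "dest",
}


def rename_fields(event, rename_map):
    """Return a copy of the event with keys renamed according to rename_map.

    Fields that are not listed in rename_map keep their original name.
    """
    return {rename_map.get(key, key): value for key, value in event.items()}


if __name__ == "__main__":
    sample = {"Image": r"C:\Windows\System32\cmd.exe", "EventID": "1"}
    print(rename_fields(sample, RENAME_MAP))
```

Keeping the mapping in one table makes it easy to review, and it means the analytics platform never has to be reconfigured to cope with the source field names.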

The next function gives you the ability to selectively drop a field when it holds a specific value, but keep the field when an interesting value occurs.

For example, Keywords=0x8000000000000000 is a very common field and value in Sysmon events and, as far as I am aware, it has no value. The issue is that you do not want to drop the field for all values, since it is possible you will get a value that you need. I love the ability to be this selective to get the best result and guard against dropping data you might need.

I would recommend creating a new eval for each field you want to manage so you do not get your signals crossed and fields confused.
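The selective drop described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the Cribl Stream function itself, and `NOISE_VALUES` and `drop_known_noise` are hypothetical names. Each field gets its own entry, in the same spirit as using one eval per field.

```python
# One entry per field: the values known to be noise for that field.
# Any other value for the same field is kept.
NOISE_VALUES = {
    "Keywords": {"0x8000000000000000"},
}


def drop_known_noise(event, noise_values):
    """Drop a field only when its value is a known noise value for that field."""
    return {
        key: value
        for key, value in event.items()
        if value not in noise_values.get(key, set())
    }


if __name__ == "__main__":
    common = {"EventID": "1", "Keywords": "0x8000000000000000"}
    unusual = {"EventID": "1", "Keywords": "0x8000000000000001"}
    print(drop_known_noise(common, NOISE_VALUES))
    print(drop_known_noise(unusual, NOISE_VALUES))
```

The field disappears only for the value you have decided is noise, so an unexpected `Keywords` value still reaches your analytics platform.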

Using this pipeline, your data goes from this:

[Image: Sysmon event before the pipeline]

To this:

[Image: the same event after the pipeline]

The event is smaller and easier to read and search.

Bottom Line

Get more value at lower cost from Sysmon data. Cribl Stream gives engineers choice and control over their data to overcome problems big and small. Spend less time overcoming bad data and more time producing better data to drive better and faster decision-making for the SOC and support teams. I’d love to hear your feedback, so after you try Cribl Stream in our Sandbox, connect with me on LinkedIn or join our Community Slack, and let’s talk about your experience!

Ed Bailey

Ed Bailey is a passionate engineering advocate with more than 20 years of experience in instrumenting a wide variety of applications, operating systems, and hardware for operations and security observability. He has spent his career working to empower users with the ability to understand their technical environment and make the right data-backed decisions quickly.
