Cribl Stream users have been successfully setting up security data lakes alongside, instead of, and underneath their SIEM solutions. Regardless of their architecture, they all want to reduce their latency and cut their costs. Snowflake, a popular choice for security data lakes due to its scalability and ease of use, recently released a new streaming ingest capability that Cribl Stream is ready to unlock.
Intuitively named Snowpipe Streaming, this new data loading feature bypasses the file staging that Snowpipe requires. Instead, events are written directly as queryable rows in the target Snowflake database. Kafka is required for now as an intermediary, and Amazon Managed Streaming for Apache Kafka (MSK) is supported.
As a result of the move from batch to stream, latency has dropped from minutes to seconds, while costs have fallen by more than a factor of three. Snowflake’s latest testing found that the ingest cost for 1 TB of data has fallen below $50 with Snowpipe Streaming.
How can you get started with the newly released integration option? Setting up streaming from Cribl to Snowflake is easy and well-documented. You can follow the Quick Start for Snowpipe Streaming with AWS MSK to take advantage of the fully-managed AWS Kafka service. Alternatively, you can use your own Kafka cluster or one managed by Confluent as described in the Snowflake Documentation.
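To give a sense of what the Snowflake side of that setup involves, here is a minimal sketch of a Kafka Connect configuration for the Snowflake sink connector in Snowpipe Streaming mode. The connector name, topic, table, account URL, user, role, database, and schema are all placeholder assumptions, not values from the Quick Start. The property names should be checked against the Snowflake documentation for your connector version.

```python
import json


def snowpipe_streaming_connector_config(name, topic, table, account_url,
                                        user, role, database, schema):
    """Build a Kafka Connect payload for the Snowflake sink connector.

    All argument values are supplied by the caller; nothing here talks to
    Kafka or Snowflake. The private key is left as a placeholder because it
    should come from a secret store, not from source code.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
            "topics": topic,
            # Map the Kafka topic to the Snowflake table that receives rows.
            "snowflake.topic2table.map": f"{topic}:{table}",
            # Select streaming ingest instead of file-based Snowpipe.
            "snowflake.ingestion.method": "SNOWPIPE_STREAMING",
            "snowflake.url.name": account_url,
            "snowflake.user.name": user,
            "snowflake.private.key": "<private key from your secret store>",
            "snowflake.role.name": role,
            "snowflake.database.name": database,
            "snowflake.schema.name": schema,
            # Plain JSON events, with no embedded Connect schema.
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "value.converter": "org.apache.kafka.connect.json.JsonConverter",
            "value.converter.schemas.enable": "false",
        },
    }


# Hypothetical example values, for illustration only.
config = snowpipe_streaming_connector_config(
    name="cribl_to_snowflake",
    topic="cribl_security_logs",
    table="SECURITY_LOGS",
    account_url="myaccount.snowflakecomputing.com:443",
    user="KAFKA_INGEST_USER",
    role="KAFKA_INGEST_ROLE",
    database="SECURITY_LAKE",
    schema="RAW",
)
print(json.dumps(config, indent=2))
```

The printed JSON has the name-plus-config shape that a Kafka Connect worker accepts when a connector is created through its REST interface. On MSK Connect, the same key-value pairs go into the connector configuration instead.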
Once you have Kafka and Snowpipe Streaming ready to receive data on the Snowflake side, you can add Kafka as a destination for Cribl Stream. In the Cribl Stream New Destination configuration page, specify the Kafka destination details, including the brokers and topic to be used for the pipeline between Cribl and Snowflake.
Once the new Kafka Destination is active in Cribl Stream, data should start arriving in your Snowflake account. To confirm that records are showing up as expected, open the Snowflake UI and navigate to the destination table. If the pipeline has been successfully configured, you will see the table populating with new rows.
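Before pasting the broker list into the Destination form, it can help to sanity-check it. The sketch below is not part of Cribl Stream. It is a small stand-alone helper, with hypothetical function names and hostnames, that assumes brokers are given as a comma-separated list of host:port entries and that a plain TCP connection is enough for a first reachability check.

```python
import socket


def parse_brokers(broker_list):
    """Split a comma-separated 'host:port' list into (host, port) tuples."""
    brokers = []
    for entry in broker_list.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and stray spaces
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        brokers.append((host, int(port)))
    if not brokers:
        raise ValueError("no brokers given")
    return brokers


def reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical broker names, for illustration only.
brokers = parse_brokers("b-1.example.internal:9092, b-2.example.internal:9092")
```

Calling `reachable(host, port)` for each parsed pair from a machine on the same network as the Cribl workers shows whether the brokers accept connections at all. It does not validate TLS or authentication settings, which still need to match what the cluster expects.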
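The same check can be done with a query instead of browsing the table. The sketch below builds one, assuming the destination table has the connector's default layout with RECORD_METADATA and RECORD_CONTENT columns; if schematization is enabled, the columns will differ. The database, schema, and table names are placeholders.

```python
import re

# Accept only simple unquoted identifiers so names can be interpolated safely.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def build_arrival_check(database, schema, table, limit=10):
    """Return SQL that lists the most recently produced records in a table.

    Assumes the default Snowflake Kafka connector columns RECORD_METADATA
    and RECORD_CONTENT; adjust the column list for other layouts.
    """
    for part in (database, schema, table):
        if not _IDENTIFIER.match(part):
            raise ValueError(f"not a simple identifier: {part!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT RECORD_METADATA['topic']::STRING AS topic,\n"
        "       RECORD_METADATA['partition']::NUMBER AS kafka_partition,\n"
        "       RECORD_METADATA['offset']::NUMBER AS kafka_offset,\n"
        "       RECORD_METADATA['CreateTime']::NUMBER AS create_time_ms,\n"
        "       RECORD_CONTENT\n"
        f"FROM {database}.{schema}.{table}\n"
        "ORDER BY create_time_ms DESC\n"
        f"LIMIT {limit};"
    )


# Hypothetical names, for illustration only.
query = build_arrival_check("SECURITY_LAKE", "RAW", "SECURITY_LOGS", limit=5)
print(query)
```

Running the printed statement in a Snowflake worksheet a few times in a row should show the newest offsets advancing while Cribl Stream is sending data.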
Now you are ready to stream your security logs from any environment into Snowflake using Cribl Stream and take control of your data. Some customers are using this integration for high-volume security sources like forensic endpoint logs and cloud activity. Snowflake doesn’t have a retention limit, so it can serve as a complement to the existing SIEM. Others have used Cribl Stream to ship their data to a SOC platform that runs on top of Snowflake, replacing the legacy SIEM entirely.
For example, a Fortune 500 consumer goods company was looking for ways to improve visibility and automation after its SIEM kept hitting ingest limits. The security team wanted to centralize its cloud, SaaS, and on-prem logs and apply data science to better detect threats, especially lateral movement across environments. They took control of their pipeline with Cribl and streamed the disparate source data to AWS. From there, a modern SIEM alternative picked up the events, normalized them, and stored them in the organization’s existing Snowflake Data Cloud. Detections could then run from the SOC platform across the normalized sources in Snowflake, catching threats that would have been missed when each source was analyzed separately.
After moving to this new architecture, the security team more than tripled the amount of security data being analyzed for threats. Retention went from 90 days to a full year of hot storage. And the total cost for all solutions involved was cut in half, saving over $1 million annually. These gains are not unusual when moving from a traditional, locked-in SIEM to an open security data lake architecture with Cribl and Snowflake.
You can find more examples on the Snowflake for Cybersecurity website, or try it out with your data by signing up for a free Cribl.Cloud account.