In this conversation, Sanjay Shrestha, Principal Detection Engineer at Bayer, and Raanan Dagan, Principal Sales Engineer from Cribl, talk about the integration of Git in Cribl Stream. They discuss how to manage configuration files and pipelines as code, simplifying their deployment. They also share a demo and give best practices for optimizing your GitOps workflow.
In the 10+ years that Bayer has worked with Splunk, they’ve gone from processing just 80 GB/day to more than 13 TB/day. An increased data ingestion workload after moving to Splunk Cloud has also caused a considerable surge in operational costs and DDAA (Dynamic Data Active Archive) costs.
Since introducing Cribl Stream one year ago, Bayer has reduced those operational costs by 30% and DDAA costs by 70%. They leverage Cribl-to-Cribl integration to avoid double licensing costs, and take advantage of the vendor-agnostic nature of Cribl Stream to avoid SIEM vendor lock-in — routing logs to both Splunk and Exabeam. Bayer is also leveraging the integration of Git in Cribl to manage the configurations and data pipelines in their architecture.
Bayer’s architecture consists of two distinct worker groups deployed across several AWS regions: one for the US and one for EMEA. These are further divided into public and private groups. The private worker group ingests logs from Bayer’s internal network, and the public worker group accepts audit logs from third-party vendors.
The diagram above also shows a third-party vendor in the US sending logs to Bayer’s Cribl instance, and illustrates the Cribl-to-Cribl integration. Logs are sent to Bayer via the data center’s Cribl Stream instance and then from Bayer’s Cribl instance on AWS to Splunk Cloud.
The continuous, efficient flow of Bayer’s data is made possible by the integration between Cribl and Git. The GitOps operational model encompasses methodologies like version control, CI/CD, monitoring, and infrastructure automation. The Git capabilities in Cribl Stream bring that model to the continuous addition of pipelines, Cribl Packs, sources, and destinations.
The integration provides an environment where you can continuously evolve the development-to-production cycle of your data pipelines and leverage a remote Git repository. Adjusting the Git settings in Cribl Stream will simplify and synchronize the dev → prod cycle, and create an environment where only approved changes make it into production.
Switch the GitOps workflow for your prod environment to Push, and it will become read-only, meaning that changes will only be permitted in the dev environment. Those changes get committed to a dev branch of your repository. Once they are approved for production, they are merged into a prod branch and deployed.
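To make the last step of that cycle concrete, here is a minimal sketch of the request a CI job might send to a read-only prod leader after a merge, asking it to sync to the prod branch and deploy. The endpoint path `/api/v1/version/sync` and the `ref` and `deploy` parameters are assumptions based on the sync call described in Cribl’s GitOps documentation, so check them against your version. The leader URL, the token, and the `build_sync_request` helper are hypothetical. The function only builds the request and does not send it.

```python
from urllib import parse, request


def build_sync_request(leader_url: str, token: str,
                       branch: str = "prod", deploy: bool = True) -> request.Request:
    """Build (but do not send) a POST asking a prod leader to sync to a branch.

    Assumption: the leader exposes /api/v1/version/sync and accepts form-encoded
    `ref` and `deploy` fields. Verify this against your Cribl Stream version.
    """
    body = parse.urlencode({"ref": branch, "deploy": str(deploy).lower()})
    return request.Request(
        url=leader_url.rstrip("/") + "/api/v1/version/sync",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Hypothetical leader address and placeholder token, for illustration only.
req = build_sync_request("https://prod-leader.example.com:9000", "REPLACE_ME")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

In a real pipeline, a job triggered by the merge into the prod branch would pass this request to `urllib.request.urlopen`, so nobody has to edit the production environment by hand.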
You can also incorporate the Git methodology into specific sources, destinations, and routes. An option in Cribl Stream allows you to designate environments as dev or prod, so everything in those environments behaves accordingly. If a source, destination, or route is designated for a different environment than the one it is running in, Cribl will automatically make it inactive. You can reactivate it by simply switching the designation.
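The matching rule can be modeled in a few lines. This is a rough sketch of the behavior described above, not Cribl’s implementation: `ConfigItem`, `is_active`, and the item names are hypothetical, and treating an item with no designation as active everywhere is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigItem:
    name: str
    kind: str                          # "source", "destination", or "route"
    environment: Optional[str] = None  # "dev", "prod", or None for no designation


def is_active(item: ConfigItem, deployment_env: str) -> bool:
    """Return True when the item should run in the given environment.

    Assumption: an item with no designation is active everywhere.
    """
    return item.environment is None or item.environment == deployment_env


# Hypothetical items, for illustration only.
items = [
    ConfigItem("vendor_audit_http", "source", "prod"),
    ConfigItem("test_syslog", "source", "dev"),
    ConfigItem("splunk_cloud", "destination"),
]

# Names of the items that would stay active on a leader designated "prod".
active_in_prod = [i.name for i in items if is_active(i, "prod")]
print(active_in_prod)
```

Under this model, `test_syslog` stays inactive in prod until its designation is switched, which mirrors the reactivation step described above.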
Here are some of the strategies and techniques Bayer has used to make the most of the GitOps methodology within Cribl Stream.
Be sure to watch the whole presentation for a demo of Cribl Stream and Git in action, and to learn more about how they can be used together to simplify the deployment and management of data pipelines.
Cribl, the Data Engine for IT and Security, empowers organizations to transform their data strategy. Customers use Cribl’s suite of products to collect, process, route, and analyze all IT and security data, delivering the flexibility, choice, and control required to adapt to their ever-changing needs.
We offer free training, certifications, and a free tier across our products. Our community Slack features Cribl engineers, partners, and customers who can answer your questions as you get started and continue to build and evolve. We also offer a variety of hands-on Sandboxes for those interested in how companies globally leverage our products for their data challenges.