In this live stream discussion, Eugene Katz and I explain why a quality reference architecture matters for successful software deployment and show viewers how to get started with the Cribl Stream Reference Architecture. We help users establish end-state goals, share different use cases, and help data administrators identify which parts of the reference architecture apply to their specific situation. It’s also available on our podcast feed if you want to listen on the go. To get every episode of the Stream Life podcast automatically, subscribe on your favorite podcast app.
The Cribl Stream Reference Architecture serves as a starting point for incorporating our vendor-agnostic observability pipeline into your existing IT and Security architecture. We know firsthand how difficult it can be to onboard and deploy new tools (mistakes were certainly made when we launched), so we designed this information to get you 70-80% of the way to a scalable deployment of our flagship product, Cribl Stream.
It’s impossible to account for all the variability in IT, but this framework should be a useful tool in helping set up your particular environment and avoid a lot of pain points as you grow. Keep in mind that applying the considerations here within the context of your network and security architecture is just as important as any of the technical guidance.
The most important thing you can do with any new deployment or takeover of an existing deployment is to define your end state at the beginning. For something mission-critical — like your logging, telemetry, or especially security logging — you have to decide on your business objective before anything else.
Let’s say you want a scalable platform that can survive failure to a certain level. What is that level? It’s good to know the average amount of data that gets processed on a good day, but what happens on a bad day? This is a very important discussion to have with your business leaders, because your telemetry and security tooling must keep working when everything else is going badly. You have to be able to work backward from that objective to how many cores, systems, load balancers, and so on you’ll need to have in place. Otherwise, you’re just picking a number out of thin air and rolling the dice. You could also miss an opportunity to align with your capacity team on the amount of hardware you’ll need.
We generally recommend allocating one physical core for each 400 GB/day of IN+OUT throughput. With virtual cores, allocate one vCPU for each 200 GB/day; the number of worker processes stays the same. Our Sizing and Scaling documentation has more detail on Graviton vs. Intel-based worker processes, as well as recommendations for which VMs to choose for AWS or Azure deployments.
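As a rough illustration of that rule of thumb, the sketch below turns daily volume into a core count. The function name and the `peak_multiplier` parameter are hypothetical, added here only to show how a "bad day" assumption might feed into the estimate; the 400 and 200 GB/day figures are the guidelines quoted above, and the Sizing and Scaling documentation remains the authority.

```python
import math

# Guideline figures quoted in this article (IN + OUT throughput per day).
GB_PER_DAY_PER_PHYSICAL_CORE = 400
GB_PER_DAY_PER_VCPU = 200


def cores_needed(in_gb_per_day, out_gb_per_day, virtual=True, peak_multiplier=1.0):
    """Estimate cores for a given daily volume.

    peak_multiplier is a hypothetical knob for sizing to a bad day
    (for example 2.0 to plan for twice the average volume).
    """
    total = (in_gb_per_day + out_gb_per_day) * peak_multiplier
    per_core = GB_PER_DAY_PER_VCPU if virtual else GB_PER_DAY_PER_PHYSICAL_CORE
    return math.ceil(total / per_core)


if __name__ == "__main__":
    # 1,000 GB/day in and 1,500 GB/day out is 2,500 GB/day of IN+OUT.
    print(cores_needed(1000, 1500, virtual=False))  # physical cores
    print(cores_needed(1000, 1500, virtual=True))   # vCPUs
    print(cores_needed(1000, 1500, virtual=True, peak_multiplier=2.0))
```

Rounding up rather than to the nearest core is a deliberate choice in this sketch: undersizing is the more expensive mistake for a logging pipeline.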
Headroom for handling data spikes is where a distributed deployment comes in. Load is spread across the worker processes within each worker node, and you add worker nodes to scale out horizontally.
With Stream, you don’t just pass data through; you can also process it along the way. That processing has a cost: heavier regex or converting Windows XML into JSON takes more CPU. Use the pipeline profiling feature to run a sample and see how long each expression takes. Results will vary with each user’s specific situation.
Big aggregations or large lookups get loaded into memory for each worker process and take up space, and each worker process gets about 2GB of memory by default. We learned about this the hard way — when we started loading in those giant lookups we suddenly started eating a whole lot more memory.
JSON processing is CPU-bound rather than memory-hungry, but as you expand your use cases, you’ve got to be ready to add more memory and resources as appropriate.
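To make the memory point concrete, here is a minimal sketch of a per-node estimate. It assumes that a lookup is held once per worker process and that its footprint is added on top of the roughly 2 GB default; both the function and that additive model are illustrative assumptions, not a description of how Stream accounts for memory internally.

```python
def node_memory_gb(worker_processes, lookup_gb_per_process=0.0,
                   base_gb_per_process=2.0):
    """Rough memory estimate for one worker node.

    base_gb_per_process: the ~2 GB default mentioned above.
    lookup_gb_per_process: in-memory size of lookups and aggregations,
    assumed here to be duplicated in every worker process.
    """
    return worker_processes * (base_gb_per_process + lookup_gb_per_process)


if __name__ == "__main__":
    print(node_memory_gb(14))                             # no large lookups
    print(node_memory_gb(14, lookup_gb_per_process=0.5))  # one 0.5 GB lookup
```

The takeaway is the multiplication: a lookup that looks modest on disk is paid for once per worker process, so its cost grows with the core count.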
Stream offers two options for writing to disk when one of your destinations is experiencing an outage or slowdown. Instead of losing that data or stopping its flow altogether, you can set up a source persistent queue or a destination persistent queue as a temporary buffer, and once the destination is ready, Stream will start sending the queued events to it.
Once the destination is restored, the data in a source persistent queue still has to go through your whole pipeline, so it will take up a lot of resources as it flows all the way through to the destination. A destination persistent queue requires fewer resources on recovery, because that data has already gone through the whole pipeline.
Destination queues are a great way to have a buffer in situations where you’re gathering data in a data center in another country and passing it into your security data lake before it’s processed. This leaves you with options in the case of failure. This is an area where your original business objectives come in — how will you size your persistent queue? Will you have an hour-long buffer, or maybe a 24-hour buffer? Be sure to think through these situations before they arise.
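One way to reason about the buffer question is to convert the outage window you want to survive into disk space. The sketch below is a simple back-of-the-envelope calculation with hypothetical names; it assumes a steady flow rate and ignores compression, so treat the result as an upper-level estimate rather than a sizing recommendation.

```python
def queue_disk_gb(gb_per_day, buffer_hours, peak_multiplier=1.0):
    """Disk needed to buffer `buffer_hours` of data headed to one destination.

    Assumes an even flow over the day; peak_multiplier is a hypothetical
    factor for outages that coincide with a traffic spike.
    """
    return gb_per_day / 24 * buffer_hours * peak_multiplier


if __name__ == "__main__":
    print(queue_disk_gb(1200, 1))    # one-hour buffer
    print(queue_disk_gb(1200, 24))   # 24-hour buffer
```

Running the numbers for both a one-hour and a 24-hour window makes the trade-off visible early, while disk is still a line item in the plan and not an emergency purchase.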
Managing connections is tough, especially when you’re working with thousands of data sources, universal forwarders, and pieces of network gear that need to be configured. We recommend always having load balancers available if you’re going to be working with agentless protocols like TCP or UDP syslog, HEC, and HTTP. Make sure you manage that connection overhead and don’t point everything at one server, or you’ll find yourself in a world of trouble.
Once you’re done balancing the load across the different workers, you have to account for the total number of connections. Up to about 400 per CPU core is manageable, but it will depend on your events per second (EPS). Once you have more than 250 connections per core, start testing what’s optimal for your architecture. What is your EPS and how sustained is it? How many forwarders do you have? How fast are they writing? Do you have big senders?
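The two thresholds above can be expressed as a quick check. This is a sketch only: the function and its labels are hypothetical, and reading "more than 400 per core" as a signal to add capacity is an interpretation of the guidance here, since the right limit depends on your EPS.

```python
TEST_THRESHOLD = 250        # above this, start testing (from the text above)
MANAGEABLE_THRESHOLD = 400  # described above as manageable, EPS permitting


def connection_status(total_connections, worker_cores):
    """Classify connection load per core against the two guideline figures."""
    per_core = total_connections / worker_cores
    if per_core <= TEST_THRESHOLD:
        return "ok"
    if per_core <= MANAGEABLE_THRESHOLD:
        return "test"   # measure EPS and sender behavior before growing
    return "over"       # assumption: treat as a cue to add cores or workers


if __name__ == "__main__":
    for connections in (3000, 5000, 8000):
        print(connections, connection_status(connections, worker_cores=16))
```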
A single, or all-in-one, worker group is appropriate for small to medium-sized enterprises working with less than or around 1 TB of data per day. If your volume is small enough that the group can absorb spikes, or your sources are unlikely to push it to capacity, then this type of architecture may be appropriate.
A setup involving multiple worker groups is necessary for larger organizations or if you have sensitive or complex data to process. The first thing that customers typically do is split pull and push sources into separate worker groups. Push sources, like syslog and universal forwarders, usually deliver data at a consistent rate, but the pull side of things can be a different story. Mixing in the data you’re pulling down from CrowdStrike, which arrives as a series of huge spikes followed by no data flow, might be problematic.
Your pull sources will also be managed by the leader in terms of scheduling, so you want to make sure that you have those sources fairly close to the leader to avoid running into network latency, and potentially having skipped pulls.
These are just some of the things to consider in the design of your enterprise’s architecture. Watch the live stream on Introducing the Cribl Stream Reference Architecture to get more detail and insights on integrating Cribl Stream into any environment, enabling faster value realization with minimal effort. This is the first of many discussions on the Cribl Stream Reference Architecture, tailored to SecOps and Observability data admins. Take advantage of this opportunity to empower your observability administration skills, and stay tuned for future conversations that will dive deeper into each of the topics discussed here.