Cribl Reference Architecture Series: Scaling Effectively for a High Volume of Agents

September 18, 2023

Categories: Learn

In this livestream, Cribl’s Ahmed Kira and I explore the challenges of scaling your Cribl Stream architecture to accommodate a large number of agents, providing valuable insights on what you need to consider when expanding your Cribl Stream deployment.

Managing data flows from a high volume of agents presents a unique set of challenges that need to be addressed. Organizations need to meet business resiliency requirements and ensure the reliable transmission of data from endpoints to their analytics systems.

The Cribl Stream Reference Architectures can help you set up your infrastructure to handle those high-volume sources. Architectural considerations for Cribl environments are typically centered around daily volumes of data — but if you have tens of thousands of agents communicating directly with Cribl workers, you also have to consider the ratio of agents to Cribl worker processes.

In today’s distributed world, data comes from everywhere — from servers to workstations, laptops, and IoT devices. Every one of those agents is establishing, or opening and closing, TCP connections. But a process on a Linux host can only handle so many TCP connections coming in, so keep a close eye on your connection overhead.

Partitioning Workloads

When collecting from a large number of agents, keep high-volume agents dedicated to their own worker group. Organizing these worker groups by location helps reduce latency: if you have a data center with tens of thousands of virtual machines that will be talking to Cribl, that location can be its own worker group.

By separating that workload and keeping Syslog and other high-volume sources off the same worker group, you ensure that any changes you make won’t affect other protocols. You also get the ability to fail small, in a single data center instead of in many of them. Separating workloads makes it easier to monitor, manage, update, and scale your deployment.

Be Careful Not to Overwhelm Your Destinations

With this kind of architecture, you want to consider the workload on your destinations. For example, if you have a worker group talking to a Splunk indexer cluster, it generates TCP connections from every worker process to each Splunk indexer. For these kinds of destinations, the max connection setting needs to be tuned so that you don’t overwhelm or create a bottleneck for your destination.

Cribl Stream improves throughput, so bottlenecks are likely to appear at, or move closer to, the indexers. Stream gives you plenty of options to manage that, but keep it in mind so you don’t just shift problems from one place to another.
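
To get a feel for the connection fan-out described in this section, here is a minimal sketch. It assumes the worst case from the text, where every worker process opens a connection to every indexer; the function name and the example numbers are hypothetical and not taken from Cribl's documentation.

```python
def destination_connections(workers: int, processes_per_worker: int, indexers: int) -> dict:
    """Estimate outbound TCP connections when every worker process
    connects to every indexer (assumed worst case)."""
    total_processes = workers * processes_per_worker
    return {
        # Inbound connections each individual indexer would see.
        "per_indexer": total_processes,
        # Connections across the whole indexer cluster.
        "total": total_processes * indexers,
    }


# Hypothetical example: 4 workers x 14 worker processes sending to 10 indexers.
estimate = destination_connections(workers=4, processes_per_worker=14, indexers=10)
print(estimate)
```

Comparing the per-indexer figure against what your destination can comfortably accept is one way to decide how far to tune the max connection setting.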

Send Data to Multiple Destinations to Comply With Privacy Requirements

With the breakout of different worker groups in Cribl Stream, you have the option to send data to multiple data lakes and to your analytics tools. This could be especially useful for international deployments with different data sovereignty and privacy requirements.

If you’re picking up endpoint data in the EU, then you’ll fall in scope under GDPR. In this scenario, having a worker group in the EU gives you way more options than you would have if you were trying to homerun that data back to the US.

If you’re handling other types of sensitive data like PHI or PII, have some guardrails in place. Put at least two workers in your worker group to account for HA, regardless of how little data is flowing. After you use your calculators to size your deployment, add an extra worker (N+1) to account for bursts in throughput.
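
One possible reading of these guardrails, as a small sketch: take the worker count your sizing calculator gives you, add one spare (N+1), and never go below two workers. Treating the N+1 spare as a single worker, and applying the two-worker floor after adding it, are assumptions made for illustration; the function name is hypothetical.

```python
def workers_with_guardrails(calculated_workers: int, ha_minimum: int = 2) -> int:
    """Add one spare worker (N+1) to a calculator-derived count,
    then enforce a minimum worker count for HA."""
    if calculated_workers < 1:
        raise ValueError("calculated_workers must be at least 1")
    return max(calculated_workers + 1, ha_minimum)


# A tiny deployment still ends up with two workers; a larger one gets one spare.
print(workers_with_guardrails(1))
print(workers_with_guardrails(4))
```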

Oversizing for Failure

When you work with the team at Cribl to set up your architecture, they’ll typically recommend sizing to handle 150% of your planned data. But if you have business resilience requirements that require you to sustain more than 1.5 times your daily average, then you need to consider scaling up even further.
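
The headroom rule above is simple arithmetic, sketched here under stated assumptions: the 1.5 baseline comes from the text, while the function name, the GB/day unit, and the example volumes are hypothetical.

```python
BASELINE_MULTIPLE = 1.5  # the 150% guideline mentioned above


def sizing_target_gb(planned_daily_gb: float, resilience_multiple: float = BASELINE_MULTIPLE) -> float:
    """Return the daily volume to size for: the larger of the 150% baseline
    and any stricter business resilience multiple."""
    multiple = max(BASELINE_MULTIPLE, resilience_multiple)
    return planned_daily_gb * multiple


# Hypothetical example: 2,000 GB/day planned.
print(sizing_target_gb(2000))        # baseline headroom only
print(sizing_target_gb(2000, 2.0))   # stricter requirement: sustain 2x the daily average
```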

We make that as easy as possible from an administrative standpoint, which is one of the reasons I fell in love with Cribl right away as a customer. If you have a worker group and want to add another server, it takes the same code as the other worker groups. Managing fleets and subfleets instead of individual pieces makes things simple — take advantage so you don’t end up pointing all your endpoints at one server.

Determining EPS and How Many Agents/vCPU

One Cribl worker process can handle as many as 5000 very low EPS agents. So if you have a Cribl worker with 14 worker processes on a 16 CPU system, that one Cribl worker can handle all 70,000 agents.

But let’s be honest: how many of your agents generate fewer than three events per second? Maybe some, but not many. For most deployments, a volume of 30 EPS and 250 agents per vCPU is more appropriate as a baseline. It’s a much lower but intentionally conservative starting point.

Here are the guidelines based on events per second from different senders — you can find more information in our Multiple Agents Reference Architecture Documentation.

We assume three tiers of “chatty” agents, based on events per second. You’ll probably recognize your senders from these definitions:

Chatty agents (100 EPS/agent) – Size 150 agents/vCPU. (Examples are domain controllers or intermediary agents.)

Medium-chatty agents (30 EPS/agent) – Size 250 agents/vCPU. (Most servers will fall into this medium category.)

Low-volume agents (3 EPS/agent) – Size 5000 agents/vCPU. (Examples are workstations.)

For the most accurate sizing, obtain EPS reports from your current observability tools.
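
As a rough illustration of how the three tiers above turn into a vCPU estimate, here is a minimal sketch. The agents-per-vCPU ratios come from the list above; the tier keys, the function name, the example fleet, and the choice to sum fractional vCPUs across tiers before rounding up are all assumptions made for illustration. Treat the result as a starting point, not a substitute for real EPS reports.

```python
import math
from fractions import Fraction

# Agents-per-vCPU guidelines from the tiers above (tier keys are hypothetical labels).
AGENTS_PER_VCPU = {"chatty": 150, "medium": 250, "low": 5000}


def vcpus_needed(agent_counts: dict) -> int:
    """Estimate vCPUs (worker processes) for a mix of agent tiers.

    Fractions keep the division exact so rounding up never overshoots
    because of floating-point error.
    """
    total = Fraction(0)
    for tier, count in agent_counts.items():
        total += Fraction(count, AGENTS_PER_VCPU[tier])
    return math.ceil(total)


# Hypothetical fleet: 300 domain controllers, 5,000 servers, 20,000 workstations.
fleet = {"chatty": 300, "medium": 5000, "low": 20000}
print(vcpus_needed(fleet))

# The best case from the text: 70,000 very low EPS agents.
print(vcpus_needed({"low": 70000}))
```

The second call reproduces the earlier arithmetic: 70,000 low-volume agents at 5000 per worker process works out to 14 worker processes.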

Load Balancing Considerations

Load balancer configuration is especially important for agents like Fluent Bit, Fluentd, and others that support HTTP or AGC delivery — because they send data to a load balancer in front of the Cribl workers. Cribl tools work better without sticky sessions, so that data is distributed across different workers.

For the most part, agents support auto load-balancing with their native protocols, so take advantage of that whenever you can. Don’t put data from a Splunk Universal Forwarder through a load balancer using S2S unless you’re completely against getting a nice distribution of data.

Cribl Edge also has a setting for load balancing that makes it easy to get the type of scale you’re looking for. You’ll be able to engage all your workers, get an even distribution, and have the ability to fail over if there’s a problem.

Cribl’s Reference Architectures are a starting point to get you 75% of the way towards deploying Cribl Stream. At that point, you can consult with the team at Cribl to adjust for your own unique requirements and make sure all other odds and ends are accounted for. Think about these things before you start so you can get as much value with as few problems as possible.

Watch the full livestream for more details on the keys to achieving a seamless, continuous flow of data from your endpoints to your destinations — including considerations for different amounts of throughput/agents and an example of how we would recommend deploying Stream as a retailer sending data to their SIEM.

The updated Cribl Reference Architecture equips administrators with the tools and guidance to tackle these issues proactively, helping to prevent potential disruptions to your business operations.
