March 10, 2021
We are increasingly asked if a Cribl Stream instance can be used to send to another Stream instance. The answer is a definite yes! While there are various reasons for wanting to send data from one Stream instance to another, let’s walk through just one example: collecting data in one AWS region using Stream, while sending (using compression to minimize cost) to another instance in a different region.
We will look at a comparison of various supported protocols for accomplishing this, as well as provide our recommendation. We will also discuss data flow architecture options that may be of interest for multiple Stream instances.
As with receiving data from any other source, the question of whether the data can be received at a destination is simply a matter of finding a common protocol that both the source and the destination support. In this scenario, Stream is both the sender and the receiver, so we have to find a protocol that Stream can use for both sending and receiving data. The following protocols currently satisfy this criterion: TCP JSON, syslog, Elastic API, Splunk HEC, and Splunk LB.
While any of these five protocols is acceptable for sending data from one Stream instance to another, one stands out as the best option when ranked against the features in the table below.
Name | TLS | Compression | Load Balancing | Persistent Queueing (PQ) | Cribl native | Lightweight |
---|---|---|---|---|---|---|
TCP JSON | Yes | Yes | No | Yes | Yes | Yes |
syslog | Yes | No | No | Yes | No | Yes |
Elastic API | Yes | Yes | No | Yes | No | No |
Splunk HEC | Yes | Yes | No | Yes | No | No |
Splunk LB | Yes | No | Yes | Yes | No | Yes |
If you need to transfer data from one Stream instance to another within AWS, TCP JSON is preferred over Splunk HEC and Elastic API because it is lighter-weight and a native Cribl protocol, while still saving on inter-region data transfer costs via compression. You can use AWS’s load balancing capabilities until load balancing is natively supported. (That feature is on the roadmap!) Without an external load balancer, the TCP JSON destination type will create a TCP socket and remain bound to that host and port until the connection is broken. So you’ll want a load balancer between the source and destination Stream instances to distribute data properly across the destination’s Stream workers.
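To get a feel for why compression matters for inter-region transfer, here is a rough sketch in Python. It assumes events are serialized as newline-delimited JSON and compressed with gzip. The sample events and field names are made up for illustration, and the ratio you see on real data will differ.

```python
import gzip
import json


def make_events(n):
    # Hypothetical, repetitive log-like events; real data will compress differently.
    return [
        {
            "_time": 1615334400 + i,
            "host": f"web-{i % 4}",
            "source": "/var/log/app.log",
            "_raw": f"GET /api/items/{i % 10} 200 {100 + i % 50}ms",
        }
        for i in range(n)
    ]


def to_ndjson(events):
    # One JSON object per line, encoded as UTF-8 bytes.
    return "".join(json.dumps(e) + "\n" for e in events).encode("utf-8")


payload = to_ndjson(make_events(1000))
compressed = gzip.compress(payload)
ratio = len(compressed) / len(payload)

print(f"raw bytes:        {len(payload)}")
print(f"compressed bytes: {len(compressed)}")
print(f"compressed size is {ratio:.1%} of the original")
```

Since cloud providers typically bill inter-region traffic by the byte, whatever ratio your own data achieves translates fairly directly into transfer savings.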
What about the other options?
Syslog is lightweight, but Stream does not currently support compression or load balancing with syslog. Of the six features used for comparison in the table, syslog provides three of them.
Elastic and Splunk HEC are essentially equal to each other in specific functionality using the six feature criteria and both rank equal to syslog by providing three features.
Splunk LB is the only protocol that supports native load balancing, but without compression, your inter-region AWS costs may be prohibitive. With four of the six features, it offers more than syslog, Elastic API, and Splunk HEC.
So what’s the takeaway here? TCP JSON is a Cribl protocol that supports TLS, PQ, and compression, but without the overhead of HTTP or the Splunk TCP protocol. So it stands out as the best choice, despite lacking a load-balancing capability for now.
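The feature counts quoted above can be re-derived from the comparison table. The following Python sketch transcribes the table into a dictionary and tallies the "Yes" cells per protocol. The function names are ours, and the tally treats every feature as equally important, which is a simplification.

```python
# Feature matrix transcribed from the comparison table above.
FEATURES = ["TLS", "Compression", "Load Balancing", "PQ", "Cribl native", "Lightweight"]

PROTOCOLS = {
    "TCP JSON":    [True, True, False, True, True, True],
    "syslog":      [True, False, False, True, False, True],
    "Elastic API": [True, True, False, True, False, False],
    "Splunk HEC":  [True, True, False, True, False, False],
    "Splunk LB":   [True, False, True, True, False, True],
}


def feature_counts(protocols):
    # Number of "Yes" cells per protocol.
    return {name: sum(flags) for name, flags in protocols.items()}


def rank(protocols):
    # Protocol names ordered by feature count, highest first.
    counts = feature_counts(protocols)
    return sorted(counts, key=lambda name: counts[name], reverse=True)


counts = feature_counts(PROTOCOLS)
ranking = rank(PROTOCOLS)

for name in ranking:
    print(f"{name}: {counts[name]} of {len(FEATURES)}")
```

A plain count is only a starting point. In the inter-region scenario discussed here, compression carries extra weight because it directly affects transfer cost, which is why Splunk LB's second-place count does not make it the recommendation.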
Astute readers may ask “But what about Amazon S3, Amazon Kinesis, Apache Kafka, or Azure Event Hubs?” Stream also supports those as both Sources and Destinations, and you are welcome to use those, if one or more of them suit your needs better than TCP JSON or another protocol mentioned above.
However, the protocols highlighted above are those for which Stream supports direct host-to-host communication between Stream Worker Nodes. The others involve an intermediary, which introduces one or more of the following: added latency, extra (hard and/or soft) costs, and additional complexity.
If you are considering the possibility of a Kafka variant as an acceptable intermediary because of queuing needs, keep in mind that all five protocols compared above provide dynamic PQ (i.e., PQ is only used when necessary and the need is automatically detected). By using PQ only when needed, you can save some latency by not using Kafka (even though Kafka is highly optimized for low latency) and also save the added cost of data stored on disk.
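The idea of queueing only when necessary can be sketched in a few lines of Python. This is an illustration of the concept, not Stream's implementation: the class name, the `send` callable, and the in-memory `deque` standing in for an on-disk queue are all assumptions made for the example.

```python
from collections import deque


class DynamicQueueSender:
    """Sends events directly; queues them only while the destination is unavailable."""

    def __init__(self, send):
        # `send` is any callable that raises ConnectionError when delivery fails.
        self._send = send
        # In-memory stand-in for a persistent (on-disk) queue.
        self.backlog = deque()

    def _drain(self):
        # Deliver queued events in order; stop at the first failure.
        while self.backlog:
            self._send(self.backlog[0])
            self.backlog.popleft()

    def submit(self, event):
        try:
            self._drain()      # older events go first to preserve ordering
            self._send(event)  # normal path: no queueing at all
        except ConnectionError:
            self.backlog.append(event)


# Simulated destination that can be switched on and off.
state = {"up": True}
delivered = []


def send(event):
    if not state["up"]:
        raise ConnectionError("destination unreachable")
    delivered.append(event)


sender = DynamicQueueSender(send)
sender.submit("a")     # delivered directly
state["up"] = False
sender.submit("b")     # queued
sender.submit("c")     # queued
state["up"] = True
sender.submit("d")     # backlog drains first, then "d" is delivered
print(delivered, list(sender.backlog))
```

While the destination is healthy, events never touch the queue, which is where the latency and disk savings over an always-on intermediary come from.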
When transmitting observability data from one Stream instance to another, a few additional considerations come to mind as one digs deeper into planning: When using Worker Groups, does it matter whether each Worker Group is controlled by its own Master Node? What about licensing costs that may be incurred? Let’s address those concerns.
I’ll tackle the latter question first because it’s a simpler discussion. Cribl Stream tracks data on ingest only for purposes of licensing. You can see a 90-day historical trend graph in Monitoring > Licensing that reflects this tracking. Both the source and destination Stream instances will reflect receiving inbound data on the Monitoring > Licensing page. You can ingest data with a standalone or distributed Stream instance to use for sending to one or more additional Stream instances, without incurring additional costs within the Stream environment.
When transmitting observability data from one Stream instance to another, the configuration is really simple when using standalone instances, but questions can arise when using a distributed environment. There can be many variations within user environments, but those variations are just some combination of the two scenarios shown here.
For various reasons, users may need to have multiple Master Nodes managing one or more Worker Groups. The Worker Groups may be organized by region, data center, prod vs test, etc. In the diagram below, these groups are deployed by region, with each region having its own Master Node.
In other situations, an organization may be able to better leverage a single Master Node across multiple Worker Groups, to provide easier management and monitoring of the entire Stream environment. This data flow architecture looks like the following diagram:
Wherever the master exists, and however many masters there are, the result is the same. The master has no effect on how these Worker Groups process their inbound and outbound data, and this holds true whether the inbound data is from another Stream instance or from a third-party tool, so long as the common-protocol rule is followed. As with any data flow architecture, be aware of the TCP/UDP ports being configured to ensure the corresponding ports in your firewalls are open.
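A quick way to confirm that a TCP port is reachable through your firewalls is to attempt a connection from the sending side. The Python sketch below uses only the standard library. The helper name `port_is_open` is ours, and the demo checks a temporary local listener rather than a real Stream worker. Note that this approach only verifies TCP; UDP reachability cannot be confirmed with a simple connect.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    # True if a TCP connection to host:port succeeds within `timeout` seconds.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Demo against a temporary local listener on an OS-assigned port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]

print(f"port {demo_port} reachable: {port_is_open('127.0.0.1', demo_port)}")
listener.close()
```

In practice you would run the check from a source Worker Node against the destination's host (or load balancer) and the port configured on the receiving Source.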
As you can see, Cribl Stream is quite versatile in how it can send and receive observability data. This versatility will only increase over time. To learn more about Stream to Stream operations, check out our video resource.
As always, we recommend trying Cribl Stream for yourself through one of our sandbox courses. Additionally, you can download Stream and process up to 5TB/day for free.