
How to Architect Your Cribl Stream to Stream Data Flows

Written by Brandon McCombs

March 10, 2021

We are increasingly asked if a Cribl Stream instance can be used to send to another Stream instance. The answer is a definite yes! While there are various reasons for wanting to send data from one Stream instance to another, let’s walk through just one example: collecting data in one AWS region using Stream, while sending (using compression to minimize cost) to another instance in a different region.
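To see why compression matters in this scenario, here's a quick back-of-the-envelope sketch. All numbers are illustrative assumptions: check current AWS inter-region transfer pricing and the compression ratio your own data achieves.

```python
# Back-of-the-envelope inter-region transfer cost estimate.
# All values are illustrative assumptions -- verify against current
# AWS pricing and your data's actual compression ratio.
daily_volume_gb = 1000        # ~1 TB/day of log data
transfer_cost_per_gb = 0.02   # assumed inter-region rate, USD/GB
compression_ratio = 8         # gzip often achieves roughly 5-10x on logs

uncompressed = daily_volume_gb * transfer_cost_per_gb
compressed = uncompressed / compression_ratio
print(f"Uncompressed: ${uncompressed:.2f}/day; compressed: ${compressed:.2f}/day")
```

Under these assumptions, compression cuts the transfer bill from about $20/day to roughly $2.50/day.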

We will compare the various supported protocols for accomplishing this and provide our recommendation. We will also discuss data flow architecture options that may be of interest when running multiple Stream instances.

Recommended protocol

As with receiving data from any other source, whether the data can be received at a destination is simply a matter of finding a common protocol that both the source and the destination support. In this scenario, Stream is both the sender and the receiver, so we need a protocol that Stream supports both for receiving data and for sending it. The following protocols currently satisfy this criterion:

  • TCP JSON
  • Syslog
  • Elastic API
  • Splunk HEC
  • Splunk TCP

While any of these five protocols will work for sending data from one Stream instance to another, one stands out as the best option when ranked against the feature criteria in the table below.

| Name        | TLS | Compression | Load Balancing | Persistent Queueing (PQ) | Cribl native | Lightweight |
| ----------- | --- | ----------- | -------------- | ------------------------ | ------------ | ----------- |
| TCP JSON    | Yes | Yes         | No             | Yes                      | Yes          | Yes         |
| Syslog      | Yes | No          | No             | Yes                      | No           | Yes         |
| Elastic API | Yes | Yes         | No             | Yes                      | No           | No          |
| Splunk HEC  | Yes | Yes         | No             | Yes                      | No           | No          |
| Splunk LB   | Yes | No          | Yes            | Yes                      | No           | Yes         |

If you need to transfer from one Stream instance to another within AWS, TCP JSON is preferred over Splunk HEC and Elastic because it is lighter-weight and a native Cribl protocol, while still saving on inter-region data transfer costs via compression. And you can use AWS's load-balancing capabilities until load balancing is natively supported. (That feature is on the roadmap!) Without an external load balancer, the TCP JSON Destination will create a TCP socket and remain bound to that host and port until the connection is broken. So you'll want a load balancer between the source and destination Stream instances, to properly distribute traffic across the destination's Stream Workers.
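To make the wire format concrete, here is a minimal Python sketch that sends a few events to a Stream TCP JSON Source as newline-delimited JSON objects. The host, port, and the commented-out auth-token header line are assumptions for illustration; verify them against your own TCP JSON Source configuration.

```python
import json
import socket

# Assumed values -- point these at the load balancer (or Worker Node)
# fronting your Stream TCP JSON Source.
STREAM_HOST = "stream-lb.example.com"
STREAM_PORT = 10070  # commonly used TCP JSON port; confirm in your Source config

events = [
    {"_raw": "user login succeeded", "host": "web-01", "source": "auth"},
    {"_raw": "user login failed", "host": "web-02", "source": "auth"},
]

with socket.create_connection((STREAM_HOST, STREAM_PORT), timeout=10) as sock:
    # If the Source requires an auth token, a JSON header line is sent first
    # (an assumption here -- check your Source's auth settings):
    # sock.sendall((json.dumps({"authToken": "<token>"}) + "\n").encode())
    for event in events:
        # Each event is a JSON object on its own line.
        sock.sendall((json.dumps(event) + "\n").encode("utf-8"))
```

In practice, the Stream-to-Stream case means simply pairing a TCP JSON Destination on the sender with a TCP JSON Source on the receiver; the sketch above just illustrates how lightweight the protocol itself is.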

What about the other options?

Syslog is lightweight, but Stream does not currently support compression or load balancing with syslog. Of the six features used for comparison in the table, syslog provides three of them.

Elastic API and Splunk HEC are essentially equivalent against the six feature criteria, and both tie with syslog by providing three of the features.

Splunk LB is the only protocol that supports native load balancing, but without compression, your inter-region AWS costs may be prohibitive. With four of the six features, it edges out syslog, Elastic API, and Splunk HEC.

So what’s the takeaway here? TCP JSON is a Cribl-native protocol that supports TLS, PQ, and compression, without the overhead of HTTP or the Splunk TCP protocol. It stands out as the best choice, despite lacking a load-balancing capability for now.

Did I forget some protocols?

Astute readers may ask “But what about Amazon S3, Amazon Kinesis, Apache Kafka, or Azure Event Hubs?” Stream also supports those as both Sources and Destinations, and you are welcome to use those, if one or more of them suit your needs better than TCP JSON or another protocol mentioned above.

However, the protocols highlighted above are those for which Stream supports direct, host-to-host communication between Stream Worker Nodes. The others require an intermediary, which introduces some combination of latency, extra (hard and/or soft) costs, and complexity.

If you are considering a Kafka variant as an intermediary because of queuing needs, keep in mind that all five protocols compared above provide dynamic PQ (i.e., PQ is engaged only when necessary, and the need is detected automatically). Because PQ is used only when needed, skipping Kafka saves some latency (even though Kafka is highly optimized for low latency), as well as the added cost of data stored on disk.

Other considerations

When transmitting observability data from one Stream instance to another, a few additional considerations come to mind as one digs deeper into planning: When using Worker Groups, does it matter whether each Worker Group is controlled by its own Master Node? What about licensing costs that may be incurred? Let’s address those concerns.

Licensing

I’ll tackle the latter question first because it’s the simpler discussion. For licensing purposes, Cribl Stream meters data on ingest only. You can see a 90-day historical trend graph under Monitoring > Licensing that reflects this tracking, and both the source and destination Stream instances will show inbound data on that page. You can ingest data with a standalone or distributed Stream instance and send it on to one or more additional Stream instances, without incurring additional costs within the Stream environment.

Worker Group Architecture

When transmitting observability data from one Stream instance to another, the configuration is quite simple with standalone instances, but questions can arise in a distributed environment. There can be many variations across user environments, but those variations are just combinations of the two scenarios shown here.

For various reasons, users may need to have multiple Master Nodes managing one or more Worker Groups. The Worker Groups may be organized by region, data center, prod vs test, etc. In the diagram below, these groups are deployed by region, with each region having its own Master Node.

In other situations, an organization may be able to better leverage a single Master Node across multiple Worker Groups, to provide easier management and monitoring of the entire Stream environment. This data flow architecture looks like the following diagram:

Regardless of where the Master Nodes live, or how many there are, the result is the same: the Master has no effect on how these Worker Groups process their inbound and outbound data. This holds true whether the inbound data comes from another Stream instance or from a third-party tool, so long as the common-protocol rule is followed. As with any data flow architecture, be aware of the TCP/UDP ports being configured, and ensure the corresponding ports in your firewalls are open.
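Before pointing a Destination at a new receiver, a quick connectivity check can confirm a firewall isn't silently dropping the port. A minimal sketch, with placeholder host and port values:

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Attempt a TCP connection to verify the path through any firewalls."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder values -- use the receiving Stream instance's host and Source port.
print(port_is_open("stream-worker.example.com", 10070))
```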

Conclusion

As you can see, Cribl Stream is quite versatile in how it can send and receive observability data. This versatility will only increase over time. To learn more about Stream-to-Stream operations, check out our video resource.

As always, we recommend trying Cribl Stream for yourself through one of our sandbox courses. Additionally, you can download Stream and process up to 5TB/day for free.

Questions about our technology? We’d love to chat with you.