February 2, 2021
Building your observability pipeline requires tools with awareness of your environment, data, and priorities.
Today’s ITOps and SecOps teams struggle to select the right technologies when implementing their observability pipeline. Many teams default to open source options, believing they can build out the capabilities they need. Others may lean on tooling offered by an incumbent log analytics vendor.
We’ve seen our customers try both of these approaches. Here’s why we believe Cribl Stream is the right choice for your observability platform.
We love open source at Cribl (if you do too, we’re hiring!), but it’s not always the right fit for ITOps and SecOps teams. Open source projects, like Kafka, Pulsar, and Flink, provide a foundation for you to build upon. That’s great for engineering teams crafting new products, or for data managers creating yet another big data environment. It’s less than ideal for busy operations teams.
The critical challenge in building on open source is awareness of the data types operations teams have to collect, refine, and manage. Data floods in from firewalls, containers, SNMP traps, and HTTP sources. You also need to fetch data from object stores, event hubs, Kafka, and other messaging sources. No open source project supports the variety and volumes of data required in a modern observability pipeline.
Using open source means building every element of the data processing pipeline yourself. Adding to the complexity, projects like Kafka and Pulsar push around raw bytes. Converting those bytes to events requires code written for each source. You’ll also need essential features like per-source backpressure, support for a range of protocols, and role and permission management. That massive investment of people and time has little hope of showing meaningful returns.
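To give a sense of the per-source work involved, here’s a minimal sketch of turning Kafka’s raw bytes into structured events, assuming the kafka-python client and a syslog-style source; the topic name and message layout are hypothetical, and backpressure, protocol support, and permissions would all still be left to build.

```python
# Minimal sketch: Kafka hands you opaque bytes; parsing them into events
# is code you write per source. Topic and message format are hypothetical.
import json
import re

from kafka import KafkaConsumer  # pip install kafka-python

# A crude pattern for one syslog-style source; every other source
# (firewall, container, SNMP, HTTP) needs its own parser.
SYSLOG = re.compile(r"<(?P<pri>\d+)>(?P<ts>\w{3} +\d+ [\d:]+) (?P<host>\S+) (?P<msg>.*)")

consumer = KafkaConsumer("firewall-logs", bootstrap_servers="localhost:9092")

for record in consumer:
    raw = record.value.decode("utf-8", errors="replace")
    match = SYSLOG.match(raw)
    if match is None:
        continue  # malformed input: error handling is also on you
    print(json.dumps(match.groupdict()))
```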
Cribl Stream, on the other hand, knows the data you’re struggling with. It knows events. It understands every field and allows your operations team to enrich, refine, and place that data wherever you need it. And if you want to replay that data to ask new questions, Cribl Stream lets you do that too.
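To make that pattern concrete, here’s a toy illustration of the enrich-refine-route flow described above, written as plain Python rather than Cribl Stream’s actual configuration; the field names, enrichment table, and routing rule are all hypothetical.

```python
# Toy enrich/refine/route flow; not Cribl Stream's real configuration.
# Field names, the lookup table, and the routing rule are hypothetical.
BU_BY_HOST = {"fw-edge-01": "networking"}     # hypothetical enrichment lookup
NOISY_FIELDS = {"debug_blob", "raw_payload"}  # fields nobody ever queries

def process(event: dict) -> tuple[str, dict]:
    # Enrich: tag the event with a business unit from the lookup table.
    event["business_unit"] = BU_BY_HOST.get(event.get("host"), "unknown")
    # Refine: drop noisy fields to cut ingest volume downstream.
    for field in NOISY_FIELDS:
        event.pop(field, None)
    # Route: security-relevant sources go to the SIEM, the rest to cheap storage.
    destination = "siem" if event.get("source") == "firewall" else "object_store"
    return destination, event

print(process({"host": "fw-edge-01", "source": "firewall", "debug_blob": "..."}))
```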
Operations teams experimenting with open source tend to follow a common pattern. At roughly the six-month mark, most teams haven’t made much progress. Some data-related challenges have been solved, but bigger challenges around manageability, operations, governance, and security remain unresolved.
Open source projects don’t ship with these capabilities for two reasons. First, they’re hard to build. Implementing management, security and governance is tedious. Second, open source commercializers build these features as proprietary add-ons because enterprises willingly pay for them.
The “open core” companies wrapping projects like Flink or Spark Streaming might seem to ease that initial adoption phase. While initially appealing, their tools aren’t designed with operations teams in mind, leaving all of the data challenges described above unresolved.
The talent and skills poured into creating an enterprise solution are expensive. Depending on the market, data engineers with a few years of experience can command salaries ranging from $90,000 to $180,000. DevOps engineers or infrastructure experts with experience in Apache Kafka, Flink, NiFi, or Spark are often more expensive. Salaries of $200,000 or more are common, and that’s if you can even find that talent in your local market.
Three to four engineers, at a minimum, spending six months creating an internal product that isn’t competitively differentiating is a poor use of time. Add in hardware costs or cloud infrastructure expenses, and you’re easily looking at $300,000 to $400,000 for your open source observability pipeline. Factor in ongoing maintenance, and the financial costs keep climbing.
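Using the figures above, a rough back-of-the-envelope calculation shows how quickly the build lands in that range; the team size, blended salary, and infrastructure number are illustrative assumptions, not quotes.

```python
# Back-of-the-envelope build cost, using the ranges cited above.
# All inputs are illustrative assumptions.
engineers = 3.5            # "three to four engineers, at a minimum"
blended_salary = 150_000   # midpoint-ish of the $90k-$200k+ ranges
build_years = 0.5          # six months
infrastructure = 75_000    # assumed hardware/cloud spend for the build

labor = engineers * blended_salary * build_years  # $262,500
total = labor + infrastructure                    # $337,500
print(f"Initial build: ${total:,.0f}")            # inside the $300-400k range
```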
Money aside, the real impact on your company is the opportunity cost of building an observability pipeline. That time is better spent on competitively differentiating work for your enterprise, not on infrastructure.
Cribl Stream is built by a team with over 30 years of combined experience creating products for operations teams. It provides visibility into your data and supports role-based access control and fine-grained management. If you can’t observe your observability pipeline, what value can it really offer?
With the abundance of log analytics tools available in the market, another option for operations teams is using a pipeline product from their incumbent log analytics vendor. This can be a great option if your log volumes aren’t growing, or if your budget is growing as fast as your data.
It’s unlikely either of those is true. If your log volumes aren’t growing, neither is your business (and you have bigger problems). Every one of our customers faces uncontrolled growth of logging data, but many turn off log aggregation because budgets aren’t keeping pace with logging expansion.
Processing observability data for insight is a critical priority, and processing options come in a variety of shapes and sizes. The days of dropping every log, metric, and trace into a single data store are over. Today’s enterprises want to use multiple data and observability platforms to serve a growing number of data consumers. Data warehouses provide targeted, optimized analytics for the broadest possible set of data consumers, while data lakes support exploration and what-if scenario planning.
A common refrain from log analytics vendors is that their pipelines use machine learning to determine which logs are trash and which are treasure, saving you money on ingest volume costs. Those machine learning algorithms are a black box: you have no influence over what they keep or reject. The data you were counting on may suddenly stop flowing, crippling analysis.
And since ingest volume is how these companies make money, it’s unlikely they’ll reject much data. This keeps infrastructure and licensing costs high.
Contrast that with Cribl Stream, which puts you in charge of what to keep or reject, and what to enrich or reduce. Cribl Stream works with your existing log analytics platform but lets you design and implement your observability pipeline your way, for your data.
Data is the single largest untapped asset in most companies. Being data-driven means having accessible data. Using the same platform as both your observability pipeline and your log analytics tool locks your data into a single silo. That data might be accessible to the operations team, but it is cut off from other data consumers, like DevOps teams analyzing application performance or finance teams planning budgets.
Cribl Stream manages log, metric and trace data for the operations team and any other team that needs access to it – without creating another impenetrable data silo.