
The Stream Life Episode 002: How does Cribl fit into the Observability ecosystem?

Written by Abby Strong

November 16, 2020

Observability is the ability to create a view into systems so you can monitor their state from outside the system itself – or in other words, the ability to answer questions you may not have planned for in advance – everything from “What did this user do yesterday?” to “How did the application perform?” and “Was the user experience good?” In this episode, we’ll explain how Cribl fits into the Observability ecosystem to get you the most out of your data.

If you want to get every episode of the Stream Life podcast automatically, you can subscribe on Apple Podcasts, Spotify, Pocket Casts, Overcast, Castro, RSS, or wherever you get your podcasts.

Transcript

Abby Strong:          

Well, hello, everybody. Welcome to Cribl’s The Stream Life podcast. This is Abby Strong, VP of marketing here at Cribl, and I am excited to be joined by Clint Sharp, CEO and co-founder of Cribl, today. And we’re here to talk about observability. So broadly, observability is the ability to create a view into systems so you can monitor their state from outside the system itself. Or in other words, the ability to answer questions you may not have planned for in advance, everything from “What did this user do yesterday?” to “How did the application perform?” or “Was the user experience good?” So Clint, can you help us understand how Cribl fits into the observability ecosystem?

Clint Sharp:

Yeah, absolutely. It takes a lot of different ways of storing data to answer all of those questions. So you may want to be able to, for example, have a dashboard full of metrics that people are looking at to just get a quick overview of system health. You may need to go from there to start diving through all of your data to understand why one of those metrics is going awry, and you may need to go back in time and try to understand longer-term trends of questions that you had not really planned for when you built that dashboard the first time. And all of those different motions require us to store data in different ways. If I want to be able to have a dashboard full of metrics, I want to be able to have lots and lots and lots of metrics and I want to be able to have them retrieved quickly because I have a user in front of a dashboard.

Clint Sharp: 

And so you tend to see data stores like a time series database for that. If I want to investigate end user performance, I need the ability to dive through high cardinality data. So you might be ending up in a log storage system or an event analytics system. If I need to be able to go back in time and do fundamentally different aggregations or ask different questions over large corpuses of data, then you’re probably trying to optimize for cost and you’re trying to just have a repository or a data lakehouse where you can dump all of this data. And unfortunately, we gather all of this data from the same places. So if I want data to go to three or four different places, I don’t want to have to throw three or four different ways of collecting the data out to the endpoint.

Clint Sharp:  

I fundamentally want to be able to get the right data to the right place at the right time in the right shape. And what the Cribl founders observed is that everybody has got all these tools, but nothing really helps connect them together. And you can think of the observability space as being at a similar point in its evolution to the BI space. When I started in this industry, we just had one database. And when we wanted to do reporting, we went into the Oracle database to do reporting. When the application needed to write stuff, you wrote stuff into that same Oracle database or SQL Server database or whatever, and that created performance challenges because people were running big queries against these databases. And so, “Oh, hey, let’s create a reporting database, a separate analytical processing database, separate from our transaction processing database,” which then created, “Oh, crap, I’ve got to have the same data in two places. So what do I do for that?”

Clint Sharp:

And that’s really where you started to see the emergence of ETL as a core of data movement inside of the business intelligence space. And observability has come a long way. Now we have a bunch of different data stores that are optimized for particular use cases, but we need to get the same data to multiple places in multiple different shapes, and Cribl’s flagship product, the observability pipeline, is really the connective tissue that we need in this market to make sure that I don’t have to install four or five or 10 different collectors on the endpoint to get fundamentally the same data that’s going to end up in a ton of different data stores.

Abby Strong: 

That is great. And I think you’ve touched on this a little bit, but I’d really like to be very clear on this because obviously there are a lot of pipelines out there already for each of these tools. Why is Cribl different than the ones that already exist for the individual tool?

Clint Sharp: 

Sure. People have been getting by for a long time. If you’re a Splunk customer, for example, they have an ingestion pipeline that they built, where you roll forwarders out to your infrastructure, and it works really, really well. Same thing with the time series database guys, they have their agents, nothing wrong with that. If you were only using that one tool, then it’s absolutely great. But what we’ve observed from our customers is that they want the flexibility, at the time the data is moving, to pick and choose, and maybe even after the data is laid to rest, to reprocess the data and replay the data and really have that infinite flexibility. Single-vendor pipelines are phenomenal on day one, because when I’m starting to play around with a logging product or a metrics product, the thing I want to do is just get data in. I want to see if that tool is valuable, and then I want to expand my usage and roll it out throughout the enterprise and start to get value.

Clint Sharp:

Day two is really where we start to see struggles, because, okay, well now I’m starting to expand my use and I’m starting to look at, okay, well now I’m going to have to roll new agents out to all of the endpoints, and that’s a huge, huge struggle. And so having a vendor-neutral pipeline means that, for Cribl in particular, we’re really the first place where we’re saying, “Look, we don’t even really care how that data got produced, whether it came out of your application, whether it was forwarded by an agent, whether it was one vendor’s agent or another vendor’s agent.” Really, we are agnostic to the source and we’re agnostic to the destination, and so we’re solving a problem of allowing people to reuse anything that they have. It should not matter whether they made a choice to go with one vendor’s particular agent. If they want to get that data into a completely different data store, even if it wasn’t originally intended to go there, we’re giving them that flexibility that just really hasn’t existed in the market before.

Abby Strong:

And I’m just going to push on that a little bit because, obviously I’m relatively new to this space, but I’m sure that I’ve heard of data streaming tools like Kafka or Kinesis. So can you touch on that a little bit? How does LogStream really differentiate from existing data streaming tools that are on the market?

Clint Sharp:

There’s been a lot of great innovation in event streaming over the last decade. No one can deny the force that is Kafka, and Confluent, the company behind the broader technology. But those technologies tend to be built and sold to developers, whereas the persona that we’re going after is really IT and security, operations, and people who are securing the contents of this data. And they’re not looking to build a stream processing engine, they’re looking for a solution that allows them to route and process observability data and security data in a no-code way. It seems like, well, couldn’t I just build that? You can. We see a lot of build shops successful with their own pipelines. We tend not to sell to them, because a lot of those people have already solved this challenge by integrating open source componentry and maybe building data processing in Spark Streaming or things of that nature to get the data reshaped.

Clint Sharp:

But it’s a big investment. And when you look at the people who have built it, they will have teams of people working on their data ingestion pipeline, and fundamentally, most of the shops that we’re talking to, the tools team has maybe two to five people, and they have to run their logging tool, they have to run an incident management tool, they have to run an event correlation tool, they have to run security tools like SIEM and UEBA. So getting data processed, getting a stream processor running, is just one part of their job. It’s one aspect of a job to be done. So we really focus on providing an end-to-end solution. Kafka is phenomenal at getting bits from point A to point B, but it’s also not particularly … most of the other stream processing tools in the market are also not very good with log data and gritty, ugly observability data.

Clint Sharp:

So in order, for example, to take a byte stream that’s typically going into a log tool, which is really just a bunch of text describing things that have happened, and get it well structured in a way that’s going to work well in a schema-on-write system, like a data lakehouse or even Elasticsearch for that matter, you’re going to have to do some parsing in the stream. You’re going to have to do some enrichment and reshaping, and really get the data in a shape that’s going to work really well for a system it wasn’t originally intended to go into. So both from a user experience perspective, we make this super, super easy to do for the persona that we’re going after, for the IT and security persona. It’s an end-to-end solution. It’s also a hyper-scalable solution that works on significantly lower amounts of infrastructure than is typical in data processing or more generic event streaming types of systems. In summary, it is a solution for IT and security professionals to integrate data sources, as opposed to a set of widget kits that you kind of have to assemble yourself.
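To make that parse-and-reshape idea concrete, here is a minimal Python sketch of the kind of in-stream work Clint describes: turning a raw log line into a structured, enriched event that a schema-on-write destination can accept. The sample log line, field names, and destination layout are illustrative assumptions, not LogStream’s actual configuration or code.

```python
import re
from datetime import datetime, timezone

# A raw, text-only log line, roughly what an agent might forward as a byte stream.
raw = '192.0.2.10 - alice [16/Nov/2020:13:55:36 +0000] "GET /checkout HTTP/1.1" 200 2326'

# Parse the text into named fields while it is still in the stream.
pattern = re.compile(
    r'(?P<client_ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d+) (?P<size>\d+)'
)
fields = pattern.match(raw).groupdict()

# Enrich and reshape so a schema-on-write destination (a lakehouse table,
# an Elasticsearch index) receives typed columns instead of a blob of text.
structured_event = {
    "@timestamp": datetime.strptime(fields["ts"], "%d/%b/%Y:%H:%M:%S %z")
                          .astimezone(timezone.utc).isoformat(),
    "client_ip": fields["client_ip"],
    "user": fields["user"],
    "http": {
        "method": fields["method"],
        "path": fields["path"],
        "status": int(fields["status"]),
        "bytes": int(fields["size"]),
    },
    "env": "production",  # example enrichment applied in-stream
}
print(structured_event)
```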

Abby Strong:

That’s great. So it’s a very typical build versus buy conversation where … when you choose open source, you’ve got to deal with the custom development and the lack of really dedicated enterprise support and things like that. In addition to understanding that folks may want to invest in a buy-type tool like LogStream, is there additional value that LogStream can provide on top of what you could get from one of these open source options?

Clint Sharp:

The first thing I would say is protocol support. The job to be done is working with the systems you already have. And so that means working with agents like Elastic’s agents or Fluentd or Logstash, or Splunk’s agents like the universal forwarder. You really just want to be able to have something that you can point all of your data collectors at and just have it work out of the box. And that’s just not something you’re going to get in a roll-your-own type of solution. The second key aspect that’s above and beyond is that it is a full solution. So we’ve been talking about observability, but the other key part of doing this job is that I need to be able to actually observe the pipeline itself: what data is flowing through the pipeline, what shape is it in, and monitoring of that pipeline. Is data flowing right now? Is it flowing to the right places? Is it flowing at the right volumes? And having a full solution allows for management as well. How do we make changes? How do we see who made the changes? How do we deploy those changes in a way that does not disrupt the operations of the existing pipeline? All of that is super critical to actually solving the full job to be done, which is streaming this observability data from point A to point B.
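As a rough illustration of what “observing the pipeline itself” can mean, here is a small Python sketch that tracks events and bytes per destination so you can ask whether data is flowing, where, and at roughly what volume. The class and destination names are hypothetical, invented for the example rather than taken from LogStream.

```python
import time
from collections import defaultdict

class PipelineStats:
    """Hypothetical counters answering: is data flowing, where, and at what volume?"""

    def __init__(self) -> None:
        self.events = defaultdict(int)   # events seen per destination
        self.bytes = defaultdict(int)    # bytes seen per destination
        self.started = time.monotonic()

    def record(self, destination: str, payload: bytes) -> None:
        self.events[destination] += 1
        self.bytes[destination] += len(payload)

    def report(self) -> dict:
        elapsed = max(time.monotonic() - self.started, 1e-9)
        return {
            dest: {
                "events_per_sec": count / elapsed,
                "bytes_per_sec": self.bytes[dest] / elapsed,
            }
            for dest, count in self.events.items()
        }

stats = PipelineStats()
stats.record("splunk", b'{"msg": "login ok"}')
stats.record("s3_archive", b'{"msg": "login ok"}')
print(stats.report())
```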

Abby Strong:

It’s interesting, just in listening to you talk. One of the things that you touched on a couple of minutes ago was about LogStream really being purpose-built for log data, for machine data and logs and metrics and traces and things. So if you’re comparing it back to Kafka or StreamSets or one of those other existing data streaming tools, can it still support the other types of data or is it only for machine data?

Clint Sharp:  

Yeah, that’s a really great question, because no, it’s not designed to replace a general-purpose stream processing engine. So at the core of LogStream is the concept of an event, and that’s a bag of key-value pairs. We don’t allow you to just move bytes. Now, general-purpose stream processing engines do allow you to just move bytes. A system like NiFi, for example, can be used to pick up an image file at the edge and deposit it in an S3 bucket, and that’s not a use case that we’re going to go after. That’s not what we do. But the trade-off is that, because we’re focused on that concept of an event, it pervades the whole user experience. So it’s super easy to understand, and events are a concept that people who work on log and metric systems innately understand. An event looks exactly like it looks in the log system.

Clint Sharp:

So I know what I’m working on. I’m not working on just a generic set of data. I have an opinionated view that an event is the thing that I’m going to be working on. And an event has a timestamp, and an event has metadata associated with it, like a source and a host that it came from, and I can make decisions based off of that metadata about where that data is supposed to go. And the other thing about being an opinionated solution is that we don’t, for example, default to going to disk. And this is a key problem that something like Kafka is solving, and it’s a wonderful technology. If you are integrating dozens and dozens of different applications, all of which need to be pulling from the same data stream, in an architecture with a lot of integrations, a lot of times what happened before Kafka and systems like it is that you could only go as fast as your slowest consumer.
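Here is a minimal Python sketch of that mental model: an event as a bag of key-value pairs with a timestamp and metadata, and a routing decision driven by that metadata. The field names and destination names are assumptions made up for the illustration, not LogStream’s actual schema or routing syntax.

```python
from typing import Any

Event = dict[str, Any]

def route(event: Event) -> list[str]:
    """Choose destinations from event metadata; the names are made up for this sketch."""
    destinations = []
    if event.get("sourcetype") == "metrics":
        destinations.append("time_series_db")
    if event.get("source", "").startswith("/var/log/auth"):
        destinations.append("siem")
    destinations.append("object_store_archive")  # keep a cheap full-fidelity copy
    return destinations

# An event: a bag of key-value pairs with a timestamp and metadata about its origin.
event: Event = {
    "_time": 1605535236,
    "host": "web-01",
    "source": "/var/log/auth.log",
    "sourcetype": "syslog",
    "_raw": "Failed password for invalid user admin from 203.0.113.7",
}

print(route(event))  # ['siem', 'object_store_archive']
```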

Clint Sharp: 

So if you have eight or 10 consumers, it’s a really hard problem to solve, like which one is slowing me down, and I really want people to be able to have different performance across consumers. Kafka is wonderful for solving that problem, but it solves it by writing all data to disk and then having different consumers understand where they are in the stream, they call it a cursor, like where am I in that byte stream, and am I current or am I behind. That solves that key challenge, but it also adds a real performance cost, because the customers we’re selling to are often moving orders of magnitude larger amounts of data than they’re moving in their business systems.

Clint Sharp:

They’re moving terabytes or potentially petabytes of data per day. And so it also has a real ROI aspect of, “Hey, by choosing an architecture that doesn’t default to disk, the ROI on our use cases is significantly better. We can go to disk if the destination is offline, we have a queuing feature that’ll make sure that we don’t lose data and will queue to disk when the system is not available, but by defaulting to not going to disk, we require a lot less hardware than a NiFi or a Kafka or whatever else.” And when you’re getting into the terabytes of data per day, this potentially means hundreds of thousands of dollars a year in cost difference between those architectures. So being fit for purpose, what that means is looking at first principles and saying, “What problem are we actually trying to solve?”
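A tiny Python sketch of the queuing behavior described here, under the assumption that the normal path stays in memory and disk is only touched when the destination is unreachable. The file name and delivery function are placeholders for the illustration; this is not LogStream’s actual persistent-queue implementation.

```python
import json
import os

QUEUE_PATH = "offline_queue.jsonl"  # placeholder path for the sketch

def send(event: dict, deliver) -> None:
    """Normal path stays in memory; disk is only touched when delivery fails."""
    try:
        deliver(event)
    except ConnectionError:
        with open(QUEUE_PATH, "a") as queue:
            queue.write(json.dumps(event) + "\n")

def drain(deliver) -> None:
    """Replay anything queued to disk once the destination is reachable again."""
    if not os.path.exists(QUEUE_PATH):
        return
    with open(QUEUE_PATH) as queue:
        for line in queue:
            deliver(json.loads(line))
    os.remove(QUEUE_PATH)

# Demo: a destination that is offline for the first event, then recovers.
calls = {"n": 0}
def flaky_destination(event: dict) -> None:
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("destination offline")

send({"_raw": "event 1"}, flaky_destination)  # queued to disk, not lost
drain(flaky_destination)                      # delivered on recovery
```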

Clint Sharp:

Let’s not just say, “Oh, I need to move data, so I’m going to go get Kafka.” Kafka’s a really great solution. It’s just not the right solution for this particular problem, and being able to move a byte stream is a really great capability, but in this case, since the users think in events, we should give them a data model and a mental model and a way to think about this that is innate to the particular problem that they have. And I think that’s why, when we go into the observability space and the security space, our customers look at it and say, “You got it. You really understand what I’m trying to do here.” And yes, we do. We do really understand what you’re trying to do, and we’re not going after those other problems. We’re only going after this one.

Abby Strong:

Yeah. It’s interesting. With the network background that I bring to this team, of course I have heard for a long time about how a firmware upgrade or something on a piece of infrastructure can change the output format and then completely change whether or not you’re even able to see what’s going on in the log information that shows up. And so I know one of the words I heard you use in early days here at Cribl was poly-structured. Can you talk to me a little bit about what poly-structured data is and what that means in our machine data world?

Clint Sharp:

Yeah, absolutely. It comes down to this fundamental tension in data between schema-on-write and schema-on-read. When do I make decisions about how this data needs to be stored and processed? In observability and security, fundamentally we just want to get the data somewhere and store it in case we end up needing it. So we really don’t want to do a whole bunch of upfront planning. As a developer, I really want to just write a log. Just give me the information, I don’t care what shape it’s in, better to have the information than not have the information. But we’re not being really rigid about schema, and we’re not saying, “This is exactly the shape I want this data in,” because we don’t want to burden people with all that thinking in advance. We just want to get the data out.

Clint Sharp:

What that means is that the data tends to change a lot. From release to release, we add new fields, we remove fields, we change the structure of the log message, we add new logging messages, we add new metrics that are being emitted and stored. And so when we say poly-structured, what we really mean is that the data has a lot of different shapes. Coming out of a given application in a normal BI type of world, you’ll be working with tens to hundreds of data shapes; maybe a thousand data shapes is a pretty big data warehouse. You can find a thousand data shapes on one Linux box just by looking at the logs, each individual log has a different shape. So we really need to be agnostic. We call ourselves schema-agnostic.

Clint Sharp:

We don’t care what shape the data is in, we’re going to bring it in and we’re going to get it out. You can work with it in whatever shape it is. And poly-structured really just means that the data comes in many, many, many different shapes. It’s not unstructured, it’s not documents, it’s not a Word doc. When you say unstructured data, that’s really Word docs, that’s human-generated data. And it’s not really structured data either, in the sense that it doesn’t have a really rigid schema and we’re not trying to fit it into a certain set of schema-on-write tables. So poly-structured is kind of that thing in the middle, where the data has many, many, many different shapes. It is likely to change. It is likely to have variability and emergent properties that are going to be changing.
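As a rough sketch of what schema-agnostic handling of poly-structured data can look like in practice, the Python below keeps whatever fields each event happens to carry and discovers each event’s shape at read time. The sample events and the shape function are hypothetical, invented purely for illustration.

```python
# Three events, three different shapes, all flowing through the same pipeline.
events = [
    {"_raw": "user=alice action=login status=ok", "host": "web-01"},
    {"level": "ERROR", "msg": "disk full", "disk": {"mount": "/var", "pct_used": 98}},
    {"metric": "cpu.idle", "value": 87.5, "tags": {"dc": "us-east-1"}},
]

def shape(event: dict, prefix: str = "") -> set:
    """Return the set of (possibly nested) field names: the event's 'shape'."""
    fields = set()
    for key, value in event.items():
        name = prefix + key
        if isinstance(value, dict):
            fields |= shape(value, name + ".")
        else:
            fields.add(name)
    return fields

# No table definition up front; the shape is discovered per event, at read time.
for event in events:
    print(sorted(shape(event)))
```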

Clint Sharp:

And that also drives a lot of design decisions about the product. Working with this class of data is different than working with very structured data. It’s also very different than working with document-type data. So that’s back to that fit-for-purpose point: observability and security systems need their own tooling because fundamentally these are different challenges. And so we as engineers and product designers and people going after these problems, when we start with first principles, we come out with a solution that actually does that job really, really well and, for our personas, ultimately ends up being a solution that trumps trying to take any general-purpose solution and go after the problem.

Abby Strong:

So if I understand you correctly, what I just heard is that LogStream is pretty much a universal receiver for all types of poly-structured machine data. It’s a universal router to any of the observability tools that users might have in their environment and it can take anything that it receives and adapt and send it in the right format to the right tool at any time. That true?

Clint Sharp: 

It is all of that. It is all of that. And the one thing I’ll add is it’s also a universal collector, which is an often overlooked problem in that, “Hey, the data that I want to get is actually in that API,” or, “I need to SSH into this box every five minutes and go collect this data.” And those are often little scripts that our customers have written that they have to run and they run on some little machine. So yes, universal receiver, universal router, but also universal collector, which is another important aspect.

Abby Strong:

Wow. That sounds like magic. I’m excited. If you guys are listening to the podcast, it’s probably time to take a look at LogStream, which you can do by visiting us at www.cribl.io and taking a look at the sandbox, where you can see this in action. Thank you so much, Clint, for joining us again today. And thanks, everybody, for listening to The Stream Life. Have a great day.

 

Additional Reading
Announcing LogStream Cloud

Clint Sharp Oct 15, 2020
