February 24, 2022
In Cribl Stream 3.0, we introduced Packs, a framework that lets Stream customers build, reuse, and share configuration modules, including pipelines, lookups, data samples, and knowledge objects. While each Pack has its own “context” containing custom pipelines, routes, lookups, variables, etc., it still retains access to the built-in Stream configuration that is shipped with the product. This allows you to, for example, reference built-in Grok patterns in a Pack pipeline function.
Designing and implementing Packs has been one of our largest endeavors yet, and it wasn’t without challenges. In this post, we will share one specific problem we faced and how we resolved it.
One of the unspoken tenets of our philosophy at Cribl is simplicity. We go a long way to avoid complexity wherever possible, so that we can focus on solving our customers’ problems instead of problems we manufactured for ourselves. The way we manage configuration follows this philosophy. Firstly, it’s based on simple YAML files, divided into two directories: default and local. Default config is shipped with the product, and depends on the Stream version currently in use. Local config holds customer-specific configuration. These directories function as layers, i.e., local content overrides default content wherever it is defined. Want to know more about Stream configuration files? Sorry, we didn’t think documenting this would be important. Just kidding, click here.
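To make the layering idea concrete, here is a minimal sketch in Python. It is not Stream's actual code: the `resolve` function, the file names, and the assumption that overrides apply per top-level key within a file are all hypothetical, chosen only to illustrate "local wins over default".

```python
from typing import Any, Dict

# Hypothetical model: a layer maps a config file name to its parsed settings.
Layer = Dict[str, Dict[str, Any]]


def resolve(default: Layer, local: Layer) -> Layer:
    """Return the effective config: local entries win over default ones."""
    merged: Layer = {}
    for name in default.keys() | local.keys():
        # Start from the shipped defaults, then apply customer overrides on top.
        merged[name] = {**default.get(name, {}), **local.get(name, {})}
    return merged


# Illustrative file names and keys, not real Stream settings.
default = {"limits.yml": {"maxEvents": 1000, "timeoutSec": 30}}
local = {"limits.yml": {"timeoutSec": 60}, "custom.yml": {"enabled": True}}

effective = resolve(default, local)
```

In this model, `effective["limits.yml"]` keeps `maxEvents` from the default layer and takes `timeoutSec` from the local one, while `custom.yml` exists only because the customer defined it.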
Why am I telling you this? Well, because to implement Packs we needed to expand on this mechanism a bit – we decided that Packs should have their own default and local layer, and fall back to the global “cribl” layer only if neither of them is present.
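The lookup order described above can be sketched as a simple chain of layers. This is an illustration under assumptions, not Stream's implementation: the `lookup` function and the keys and values below are made up for the example.

```python
from typing import Any, Dict, Optional, Sequence

Layer = Dict[str, Any]


def lookup(key: str, layers: Sequence[Layer]) -> Optional[Any]:
    """Return the value from the first layer that defines the key."""
    for layer in layers:
        if key in layer:
            return layer[key]
    return None


# Hypothetical contents; real layers would be directories of YAML files.
global_cribl = {"grok/common": "built-in patterns", "pipelines/main": "global main"}
pack_default = {"pipelines/main": "pack main"}
pack_local = {"pipelines/main": "customized pack main"}

# Pack local first, then Pack default, then the global "cribl" layer.
chain = [pack_local, pack_default, global_cribl]

pipeline = lookup("pipelines/main", chain)
patterns = lookup("grok/common", chain)
```

Here `pipeline` resolves to the Pack's customized value, while `patterns` falls through to the global layer because the Pack defines no Grok patterns of its own.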
The main benefit of this approach is that it doesn’t introduce any behavioral changes into the existing codebase; instead, we only build upon existing behavior. That means not only less work for us, but also a smaller regression risk. Yay! What could go wrong?
The key feature of Packs is the ability to share them. We want customers to be able to export Packs they created, and to install and import Packs created by others. In particular, we needed a way to upgrade existing Packs to newer versions. To some extent, upgrading a Pack can follow the generic case of upgrading Stream: the default configuration gets overwritten with the newer one. But if the existing Pack has local changes, and so does the newer one, what should happen then?
On one hand, we wanted customers to end up with exactly the same configuration if they installed the same Pack. On the other, we should respect any local changes a customer made to their Pack prior to upgrading. And since Stream has never “magically” overwritten local changes, doing so now would go against the expectations we wanted customers to retain while using the product!
Before we tackle the more complicated cases, let’s provide solutions for the simpler ones. For example, the “our local” vs “their local” conflict doesn’t exist if the package we’re upgrading with has only default changes. So if we created a way to export Packs with only the default layer, we would address part of the problem without adding any complexity. This is how default only export mode was born.
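A rough sketch of why this mode sidesteps the conflict, assuming a Pack can be modeled as a pair of flat `default` and `local` dictionaries. The function names `export_default_only` and `upgrade` are hypothetical and exist only for this illustration.

```python
from typing import Any, Dict

Layer = Dict[str, Any]
Pack = Dict[str, Layer]  # {"default": {...}, "local": {...}}


def export_default_only(pack: Pack) -> Pack:
    """Export only the default layer; the author's local changes stay behind."""
    return {"default": dict(pack["default"]), "local": {}}


def upgrade(installed: Pack, incoming: Pack) -> Pack:
    """Replace the default layer and keep the installed local layer untouched.

    Assumes the incoming package carries no local layer of its own.
    """
    return {"default": dict(incoming["default"]), "local": dict(installed["local"])}


installed = {"default": {"pipeline": "v1"}, "local": {"pipeline": "mine"}}
author = {
    "default": {"pipeline": "v2", "lookup": "new"},
    "local": {"pipeline": "author tweak"},
}

package = export_default_only(author)
upgraded = upgrade(installed, package)
```

Because `package` has an empty local layer, the upgrade never has to choose between the author's local changes and the customer's.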
Another piece of low-hanging fruit comes with the realization that if the Pack’s local and default layers do not intersect, exporting them together as a single default layer boils down to just copying files. That’s what merge safe export mode does – it checks if the Pack’s local config overrides any defaults, and fails if it does. If there are no overrides, we export a nice Pack with an “interwoven” configuration.
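The check behind this mode might look roughly like the following. Treat it as a sketch: `export_merge_safe`, `MergeConflictError`, and the flat key model are assumptions made for the example, and the real check may work at a different granularity.

```python
from typing import Any, Dict

Layer = Dict[str, Any]


class MergeConflictError(Exception):
    """Raised when the local layer overrides something in the default layer."""


def export_merge_safe(default: Layer, local: Layer) -> Layer:
    """Combine both layers into one default layer, but only if they are disjoint."""
    overlap = sorted(default.keys() & local.keys())
    if overlap:
        raise MergeConflictError(
            "local overrides default for: " + ", ".join(overlap)
        )
    # No key appears in both layers, so the merge is a plain union.
    return {**default, **local}


# Hypothetical entries standing in for config files.
exported = export_merge_safe(
    {"pipelines/parse": "shipped"},
    {"lookups/hosts": "added by customer"},
)
```

With disjoint layers the result simply contains both entries. If the customer had also defined `pipelines/parse` locally, the call would raise `MergeConflictError` instead of silently picking a winner.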
So far so good, right? But now we have a harder nut to crack! What to do when local configuration overrides default? This will actually be a fairly common case: import a Pack, customize it, and then export. Initially, we considered a full export mode, which would take both layers as is, but while this seemed the simplest approach, it led to some serious complications down the line. Should upgrading a Pack with local content overwrite your existing local content? What about installing the same version over a customized Pack: Should the behavior be the same? We couldn’t find good answers to these questions, so in the end, we decided to just scrap full export mode. Thankfully, there was a sensible alternative.
Ideally, we wanted export to merge the layers together, using local whenever it was defined and falling back to default otherwise. We were initially skeptical about this approach, because we thought the implementation would introduce complexity and be bug-prone. However, having exhausted the alternatives, we thought about it some more and realized that our existing codebase already merges these layers without issues! After all, that’s how Stream has always read configuration files: by merging local changes into the default. So, to implement merge export mode, all we had to do was load the configuration and save it in the exported directory. Naturally, there were some hiccups along the way, but we avoided major pitfalls.
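The idea of reusing the read path for export can be sketched in a few lines. This is a simplified model under assumptions, not the actual implementation: `load_effective` stands in for the existing config loader, and `export_merge` and the flat dictionaries are hypothetical.

```python
from typing import Any, Dict

Layer = Dict[str, Any]
Pack = Dict[str, Layer]  # {"default": {...}, "local": {...}}


def load_effective(pack: Pack) -> Layer:
    """Stand-in for the existing read path: local values override defaults."""
    return {**pack["default"], **pack["local"]}


def export_merge(pack: Pack) -> Pack:
    """Export by loading the merged config and saving it as the default layer."""
    return {"default": load_effective(pack), "local": {}}


customized = {
    "default": {"pipeline": "shipped", "lookup": "shipped"},
    "local": {"pipeline": "customized"},
}

exported = export_merge(customized)
```

The useful property is that the exported Pack behaves like the customized one: loading `exported` gives the same effective configuration as loading `customized`, yet the package contains only a default layer.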
In the end, we solved the exporting problem the same way we solved the more general configuration layering problem: by reusing, and slightly expanding, existing and battle-tested behavior.
I hope you enjoyed this article. If you’re still not bored and want to know more about how to export Packs, check out our documentation!
The fastest way to get started with Cribl Stream is to sign up at Cribl.Cloud. You can process up to 1 TB of throughput per day at no cost. Sign up and start using Stream within a few minutes.