
Cribl Search: The Most Powerful Tool for Querying Data at Its Source

Written by Perry Correll

November 29, 2022

One of the most useful features of Stream, Cribl’s flagship solution, is its ability to separate the wheat from the chaff in your data’s journey from source to destination: Stream allows you to control what data goes to what system. Cribl Search takes this to the next level by controlling what data should be collected before it is ever put in motion.

Eliminate the Need to Move Data: Query It at Rest With Cribl Search

With Search, you’re able to query data at the source before moving it, or after it’s been collected and is at rest at a destination.

Typically, a host generates the data, and a collection of agents collects and routes it to some holding location where it will be queried, whether that’s an AWS bucket or an indexer from any other vendor out there. This process is time-consuming, can take a big chunk out of an organization’s budget, and usually collects a lot of useless information. So we thought we’d take another page out of one of the oldest playbooks out there to tackle the problem.

Using Cribl Search, You Collect, Store, and Index Only Critical Data

Whether you’re in the wheat business or the observability data space, you’ll inevitably end up with some nonessentials during processing that you’ll end up tossing out. When wheat is harvested, it’s separated from the stems and the chaff in the field before it’s transported anywhere, getting rid of the burden associated with collecting and transporting anything that would eventually end up getting the boot anyway.

With Search, we’re giving you that same option in the form of a search tool that can actually query data at its source, so you can filter, shape, aggregate, and route only the critical data to the appropriate storage or system of analysis. Search gives you a unique kind of access to system, host, or container data that was not originally instrumented or engineered to support a search function. With this newfound power, the applications and use cases are endless.
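
To make the idea concrete, here is a minimal sketch in plain Python of what "filter, shape, and aggregate at the source" can look like. This is not Cribl Search syntax or its API; the sample log lines, the line format, and the function names (query_at_source, aggregate_by_component) are all hypothetical and exist only to illustrate the concept.

```python
import re
from collections import Counter

# Hypothetical raw log lines as they might sit on a host (illustrative only).
RAW_LINES = [
    "2022-11-29T10:00:01 INFO auth login ok user=alice",
    "2022-11-29T10:00:02 DEBUG cache hit key=42",
    "2022-11-29T10:00:03 ERROR auth login failed user=bob",
    "2022-11-29T10:00:04 ERROR db timeout after=30s",
    "2022-11-29T10:00:05 DEBUG cache miss key=43",
]

# Assumed line layout: timestamp, level, component, free-text message.
LINE_PATTERN = re.compile(
    r"^(?P<ts>\S+) (?P<level>\S+) (?P<component>\S+) (?P<message>.*)$"
)


def query_at_source(lines, wanted_levels):
    """Filter and shape lines where they live; return only matching events."""
    events = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match and match.group("level") in wanted_levels:
            # Shape: turn the raw line into a structured record.
            events.append(match.groupdict())
    return events


def aggregate_by_component(events):
    """Aggregate: count the selected events per component."""
    return Counter(event["component"] for event in events)


# Only the ERROR events would ever need to leave the host.
critical = query_at_source(RAW_LINES, {"ERROR"})
summary = aggregate_by_component(critical)
```

In this sketch, the debug and info lines never leave the host. Only the structured error events, or just the per-component summary, would be routed onward.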

The Advantages of Querying Data at the Host

Imagine wanting to examine data on Linux or Windows systems that were never instrumented to support a search function, whether it’s one such system or a thousand. You can use your favorite observability pipeline — Cribl Stream, of course — to instantly create a script that can automatically launch an Edge node on a targeted system in minutes, providing you the ability to search, locate, shape, and route any targeted data to your destination of choice.

You also have easy access to application and system logs, system status information, metrics, system files, and configuration files — which is all possible without any data movement at all. If you want to pull even more functionality out of Search, you can use Edge capabilities to automatically execute commands on these systems and record the results.

Want to actually get into the host without dealing with SSH? Search allows you to teleport directly into systems to gather and view all application processes and system calls, all from a single UI. Imagine how much this capability could increase the scope of your analysis. How much more efficient would your data collection system and your searches be?

Coming up in the series, we’ll talk about using Search to query data where it’s being stored, and then while it’s in motion in your observability pipeline. If you want to get into the details and see it in action, come to our on-demand webinar.
