Cribl offers a suite of tools designed to optimize data pipelines, with components tailored for managing and orchestrating data flows at scale across teams and data sources. One of the biggest problems in building and running a multi-team data engine is isolation. This blog covers how we at Cribl have handled the challenge of data, configuration, and access isolation for growing teams.
Specifically, it covers three features (Cloud Workspaces, Worker Groups, and Stream Projects) and how each provides isolation and a tailored experience for different levels of data access and control.
I started at Cribl a few months after the concept of Worker Groups was introduced into what was then called Cribl LogStream (I’m dating myself here, as we dropped the “Log” back in 2022). Before that, a single node processed our customers’ data. As with all product investments at Cribl, we started with the job to be done. For the remainder of this blog, I will use this model to explain each capability in the portfolio in terms of the problem it solves and the job it is intended to do for our customers.
The job was: “Give me a way to isolate data at the network, compute, and storage layer to provide secure access to data for my different teams.” With Worker Groups, we also introduced a new persona into the Cribl world: the Group Admin. This user is responsible for everything within a specific group but is not an administrator of Cribl as a whole, or of any other group. Isolation achieved!
A few years later, another problem statement emerged: “I want to give access to members of my teams who don’t need to know or don’t have time to learn all of the ins and outs of the Cribl infrastructure. All they need to do is build and test pipelines.” The job to be done was: “Give Cribl Admins a way to give access to streams of data to their data experts without revealing or giving access to superuser/admin capabilities at the group level.” This provides more isolation and a tailored experience for the Data Expert persona.
The data expert just needs access to their part of the data. You could spin up a new Worker Group and grant them Group Admin access, but then they would have to learn every part of the solution just to manage their part of the data. Additionally, many data sources are multiplexed, meaning that a single source can carry multiple data types: an S3 pull alone can contain data that is useful to several teams. Projects allow all of that data to be managed by one Worker Group while being routed to separate Projects within that group for use by the appropriate data experts.
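To make the multiplexing point concrete, here is a minimal Python sketch of splitting one mixed feed into per-Project streams. It is a conceptual model only, not Cribl configuration or a Cribl API: the `datatype` field, the project names, and the `route_to_projects` helper are all hypothetical, chosen for illustration.

```python
from collections import defaultdict

# Hypothetical mapping of data type -> owning Project (illustrative names only).
PROJECT_FOR_DATATYPE = {
    "firewall": "security-project",
    "app_log": "appdev-project",
}


def route_to_projects(events, default_project="unassigned"):
    """Group events from one multiplexed source by the Project that owns their data type."""
    routed = defaultdict(list)
    for event in events:
        # Events with an unknown or missing data type fall back to the default bucket.
        project = PROJECT_FOR_DATATYPE.get(event.get("datatype"), default_project)
        routed[project].append(event)
    return dict(routed)


# One S3 pull carrying data that is useful to two different teams.
s3_pull = [
    {"datatype": "firewall", "msg": "deny tcp"},
    {"datatype": "app_log", "msg": "user login"},
    {"datatype": "firewall", "msg": "allow udp"},
]
routed = route_to_projects(s3_pull)
```

In this sketch the source is still collected once, by one group, and each team only ever sees the list stored under its own Project name.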
Lately, another problem statement has been coming up, specifically in the Cloud: “I need to have multiple unique environments with their own configurations, access rules, team members, and network isolation for providing full data management to unique parts of my organization.”
This last model is very common in state and federal governments as well as multinational corporations, where business units may have completely separate teams of Super Admins, Group Admins, and Data Experts but would still like one place where billing and metering are combined into a single shared spend with Cribl. The job here is to provide an on-demand way of building new environments that still sit under the banner of the main organization.
Cribl.Cloud introduced Workspaces to give customers on-demand, full-environment isolation that provides:
Cribl.Cloud Workspaces are ideal for organizations looking to leverage the power of Cribl products across dedicated environments while maintaining the benefits of centralized management, administration, and billing. They offer the benefits of isolation, security, and scalability for customers who require multiple unique enterprise environments.
Worker Groups in Cribl Stream are clusters of worker nodes that process your data. Worker Groups are fundamental to isolating and securing data collection and processing.
These groups provide:
Stream Projects is a relatively new feature in Cribl Stream that allows for the most granular and isolated organization and management of end-to-end data flows for teams working with similar data sources. Projects enable:
Cribl’s product architecture is designed to facilitate multi-team engagement through varying degrees of data and configuration isolation, with the added benefits of stronger data security and scalability.
You can isolate entire environments with Workspaces in Cribl.Cloud, isolate teams within environments with Stream Worker Groups, and finally, isolate specific data feeds and workflows with Stream Projects.
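The three levels nest, and one way to picture that nesting is as a scoped access check. The Python sketch below is a deliberately simplified mental model, assuming that each persona's reach stops at its own level. It is not Cribl's actual permission system; the `Grant` type, role strings, and `can_access` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grant:
    """A hypothetical access grant scoped to a level of the hierarchy."""
    role: str                      # "super_admin", "group_admin", or "data_expert"
    workspace: str                 # every grant lives inside exactly one Workspace
    group: Optional[str] = None    # set for group_admin and data_expert
    project: Optional[str] = None  # set for data_expert only


def can_access(grant, workspace, group, project):
    """Return True if the grant reaches the given Workspace / Worker Group / Project."""
    if grant.workspace != workspace:
        return False               # Workspaces are fully isolated from each other
    if grant.role == "super_admin":
        return True                # whole Workspace
    if grant.group != group:
        return False               # other Worker Groups are out of reach
    if grant.role == "group_admin":
        return True                # everything inside their own group
    # Data experts only see their own Project within the group.
    return grant.role == "data_expert" and grant.project == project


admin = Grant("group_admin", "emea", group="security")
expert = Grant("data_expert", "emea", group="security", project="firewall-feeds")
```

Here `admin` reaches any Project inside the `security` group but nothing in other groups, while `expert` reaches only `firewall-feeds`, and neither reaches anything in a different Workspace.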