
Cribl Self-Guided Trials

Enhance Data Analytics Tool Performance

1. Understand the Problem

Goal:
Enhance data analytics tool performance

Challenge:
Run your searches faster, with fewer resources

Example:
Your data analytics searches take too long, are skipping, or are not completing at all. These searches are important and you wish they took seconds. You’ve been asked to add additional data sources for more visibility and troubleshooting or for security investigations, but your analytics tool’s resources are near capacity. Adding additional data is just going to slow your searches even more — to the point that they are not useful.

How Can Cribl Help?

Cribl Stream can transform your data, reduce its volume, and extract relevant fields, accelerating data ingest and dramatically increasing performance while using less compute and storage.

To do this, you will test and deploy several Cribl Stream technical use cases:

  • Reduction: Send only the relevant data to your system of analysis, giving you up to 30% additional license headroom and lower compute and storage costs.
  • Optimization: Enrich data from third-party sources for improved context and accelerated analysis (while negating the need to ingest some sources, like DNS). Normalization prepares the data for the expected destination schema (e.g., Splunk Common Information Model or Elastic Common Schema), reducing the overhead of preparing the data.
  • Logs to Metrics: Conversion and aggregation can send only the important metrics to the system of analysis and route the noise to low-cost storage, giving you faster searches and additional room to onboard new data sources.
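To put rough numbers on the reduction claim above, a back-of-the-envelope sizing calculation looks like this (all figures here are illustrative assumptions, not measurements from your environment):

```python
# Hypothetical sizing math: estimate license headroom gained by reduction.
daily_ingest_gb = 500          # assumed current daily ingest (example value)
reduction_rate = 0.30          # up to ~30% reduction is typical for many Packs

reduced_gb = daily_ingest_gb * (1 - reduction_rate)
headroom_gb = daily_ingest_gb - reduced_gb
print(f"New daily ingest: {reduced_gb:.0f} GB, headroom freed: {headroom_gb:.0f} GB")
```

Swap in your own daily volume and the reduction rate you measure during the trial to estimate your headroom.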

Before You Begin:

  • Review the relevant Cribl Sandboxes.
  • You’ll use Cribl.Cloud (https://docs.cribl.io/stream/deploy-cloud) for your QuickStart, so you might want to note the following:
    • Make sure you choose the correct region, either US West (Oregon) or US East (Virginia), to ensure the Cribl Stream workers are closest to the point of egress to lower costs. (It’s also wicked hard to change it later.)
    • Cribl.Cloud Free/Standard does not include SSO.
    • Cribl.Cloud Free/Standard does not support hybrid deployments. If you need to test on-premises workers, please request an Enterprise trial using the Chatbot below.
    • Cribl Packs are out-of-the-box solutions for given technologies. You can combine Cribl.Cloud with sample data from a Pack. This combination may be enough to prove that Cribl Stream will help you route data to analytics tools and low-cost storage. The following Packs might be helpful:
      • AWS VPC Flow for Security Teams
      • Cisco ASA
      • Cribl-AWS-Cloudtrail-logs
      • Cribl Pack for Nix
      • Microsoft Windows Events
      • Palo Alto Networks
      • Splunk UF Internal Pack

What You’ll Achieve:

  • You’ll complete 3 technical use cases to support your business case.
    • A business case is the business outcome you’re trying to achieve. The technical use cases you create will illustrate how Cribl features will work in your environment. Typically, you will need multiple technical use cases to achieve your business case.
  • You’ll connect 1-2 sources to 1-2 destinations.
  • You’ll show that in your environment, with your data sources, you can:
    • Eliminate duplicate or unnecessary fields from your data before ingestion
    • Convert logs to metrics in the stream to speed dashboard and search performance
    • Preprocess data before ingestion to onboard new data sources more quickly and easily

2. Implementation Overview

Identify the data being sent to each destination. For each type of data you will accomplish the following:

  • Create a route with a suitable filter to match that data set
  • If data modifications are needed, create a pipeline and configure functions as necessary.
  • Set the routes to send the data to the appropriate destination

For data that requires modification or reduction, create a pipeline or use the out-of-the-box Packs. (Most Packs help to reduce data volumes by up to 30%.)

  • Capture sample data to be used with the pipeline
  • Create a pipeline to reduce and manipulate the data. Use the Eval, Drop, Suppress, or Sample functions for data reduction
  • Keep modifying, and compare the savings you achieve against your target using Cribl’s basic statistics UI (See: A Second Look at Our Data)
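Conceptually, the reduction steps above behave like this Python sketch. It illustrates what an Eval field-removal plus a Drop filter accomplish; it is not Cribl's implementation, and the field names and events are made up:

```python
# Illustrative sketch of a reduction pipeline: drop noisy events, remove
# unneeded fields, then compare before/after sizes.
import json

def reduce_event(event, drop_fields=("dvc_host", "extra_payload")):
    """Remove fields not needed downstream (like Eval's 'Remove fields')."""
    return {k: v for k, v in event.items() if k not in drop_fields}

def keep_event(event):
    """Filter out debug noise (like a Drop function with a filter expression)."""
    return event.get("severity") != "DEBUG"

events = [
    {"severity": "ERROR", "msg": "disk full", "dvc_host": "h1", "extra_payload": "x" * 50},
    {"severity": "DEBUG", "msg": "heartbeat", "dvc_host": "h1", "extra_payload": "x" * 50},
]

before = sum(len(json.dumps(e)) for e in events)
out = [reduce_event(e) for e in events if keep_event(e)]
after = sum(len(json.dumps(e)) for e in out)
print(f"bytes in={before}, bytes out={after}")
```

The before/after byte counts play the same role here as the IN/OUT comparison in Cribl's basic statistics UI.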

For data that can be converted from Logs to Metrics, create a pipeline or use the out-of-the-box Packs to generate metrics:

  • Capture sample data to be used with the pipeline
  • Create a pipeline to transform and manipulate the data. Use the Publish Metrics, Aggregation, or Numerify functions
  • Keep modifying, and compare the savings you achieve against your target using Cribl’s basic statistics UI (See: A Second Look at Our Data)
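The logs-to-metrics idea can be sketched the same way. This is illustrative Python, not Cribl's Aggregations or Publish Metrics functions, and the log records are invented:

```python
# Sketch of logs-to-metrics: aggregate raw access logs into a count and an
# average, then forward only the (much smaller) metric events.
from collections import defaultdict
from statistics import mean

logs = [
    {"status": 200, "resp_ms": 12},
    {"status": 200, "resp_ms": 30},
    {"status": 500, "resp_ms": 400},
]

by_status = defaultdict(list)
for e in logs:
    by_status[e["status"]].append(e["resp_ms"])

metrics = [
    {"metric": "http.requests", "status": s, "count": len(v), "avg_resp_ms": mean(v)}
    for s, v in sorted(by_status.items())
]
print(metrics)
```

Two metric events replace three raw logs here; at production volumes the same pattern is what frees up search and storage capacity.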

For data you want to Enrich from third-party sources for better context and faster analysis:

  • Capture sample data to be used with the pipeline
  • Create a Lookup Table
  • Create a pipeline to Enrich data. Use the Eval and Lookup functions. Optionally, use the out-of-the-box Packs to see examples of Enrichment
  • Keep modifying until the data OUT view matches what you’d expect to see
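Enrichment via a Lookup Table amounts to a keyed merge, roughly as below. The asset table and field names are hypothetical, and this only illustrates the idea behind Cribl's Lookup function:

```python
# Sketch of lookup enrichment: add owner/context fields keyed on source IP.
asset_table = {                      # stand-in for a CSV Lookup Table
    "10.0.0.5": {"owner": "payments-team", "criticality": "high"},
}

def enrich(event, table):
    """Merge any matching lookup fields into the event; pass through otherwise."""
    extra = table.get(event.get("src_ip"), {})
    return {**event, **extra}

event = {"src_ip": "10.0.0.5", "action": "login"}
print(enrich(event, asset_table))
```

Events with no match pass through unchanged, which is also the safe default behavior to aim for in your pipeline.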

For data you want to Transform with Common Information Model fields (CIM for Splunk) or Elastic Common Schema fields (ECS for Elastic), reducing the overhead of preparing the data:

  • Capture sample data to be used with the pipeline
  • Create a pipeline to Transform data. Use the Eval, Rename, Auto Timestamp, Parser, Serialize, or Mask functions.
  • Keep on modifying until the data OUT view matches what you’d expect to see
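Schema normalization is essentially field renaming. A minimal sketch follows; the mapping below is an illustrative fragment of my own invention, not the full CIM or ECS:

```python
# Sketch of schema normalization (what Rename/Eval do for CIM or ECS mapping).
FIELD_MAP = {"srcip": "src_ip", "dstip": "dest_ip", "act": "action"}

def normalize(event, field_map=FIELD_MAP):
    """Rename known fields to the destination schema; leave the rest as-is."""
    return {field_map.get(k, k): v for k, v in event.items()}

raw = {"srcip": "10.1.1.1", "dstip": "10.2.2.2", "act": "allowed"}
print(normalize(raw))
```

In practice you would build the mapping from your destination's schema reference, then verify the result in the OUT view.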

3. Select Your Data Sources

  • For ease of setup, we recommend you choose from the list of sources supported by the Packs listed in Before You Begin. For the full list of supported sources, see: https://docs.cribl.io/stream/sources/. For the categories of vendors Cribl works with (Cloud, Analytics tools, SIEM, Object Store, and more), see: https://cribl.io/integrations/
  • Choose from supported formats and source types: JSON, Key-Value, CSV, Extended Log File Format, Common Log Format, and many more out-of-the-box options. See our library for the full list.
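For intuition, here is how a Key-Value line breaks into fields. This is a deliberately simplified sketch; Cribl's Parser function handles these formats out of the box:

```python
# Simplified key-value parsing: "k1=v1 k2=v2" -> {"k1": "v1", "k2": "v2"}.
def parse_kv(line, pair_sep=" ", kv_sep="="):
    return dict(
        pair.split(kv_sep, 1)            # split each pair on the first '='
        for pair in line.split(pair_sep)
        if kv_sep in pair                # ignore tokens with no key/value
    )

line = "user=alice action=login status=success"
print(parse_kv(line))
```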

Spec out each source:

  1. What’s the volume of that data source per day? (Find in Splunk | Find in Elastic)
  2. Is it a cloud source or an on-prem source?
  3. Do you need TLS, Certificates, or Keys to connect to the sources?
  4. What protocols are supported by both the source and by Cribl Stream?
  5. During your QuickStart, do you have access to the source from production, from a test environment, or not at all?

For your QuickStart, we recommend no more than 3 Sources. Track each one in a worksheet like this:

Source | Source Collection Method | Volume | Cribl Worker Node Host Name / IPs / Load Balancer | Configuration Notes, TLS, Certificates

4. Select Your Destinations

Where does your data need to go?

  1. The list of supported destinations: https://docs.cribl.io/stream/destinations
  2. The category of vendors Cribl works with: Search Engine, Object Store, SIEM, UEBA, EDR, and many more options: https://cribl.io/integrations/
  3. What volume of data do you expect to send to that destination per day?
  4. Is it a cloud or on-prem destination?
  5. Do you need TLS, Certificates, or Keys to connect to the destination?
  6. What protocols are supported by both the destination and by Cribl Stream?
  7. During your QuickStart, will you be sending data to production environments, test environments, or not at all?

For your QuickStart, we recommend no more than 3 Destinations. Track each one in a worksheet like this:

Destination | Destination Sending Method | Volume | Destination Host Name / IPs / Load Balancers | Configuration Notes, TLS, Certificates

5. Prepare Your QuickStart Environment

  1. Regardless of where your data resides, the fastest way to prove success with Cribl is with Cribl.Cloud. With Cribl.Cloud, there is no infrastructure to spin up and manage and no hardware to deal with. You can get straight to your observability pipeline planning, giving you greater control and choice over your data in our SOC 2-compliant Cribl.Cloud.
  2. Cribl.Cloud supports up to 1 TB/day, the perfect capacity for a Cribl QuickStart. Once the evaluation is done and you are ready for production, you can increase the volume and allocate more capacity.

6. Launch the QuickStart by Registering for Your Free Cribl.Cloud Instance

  1. Once you’ve registered on the portal, sign in to Cribl.Cloud.
  2. Select the Organization you want to work with.
  3. From the portal page, select Manage Stream.
  4. The Cribl Stream interface will open in a new tab or window – and you’re ready to go!
  5. Notice the Cribl.Cloud link in the upper left of the Cribl.Cloud home page, under the Welcome message. Click this link at any time to reopen the Cribl.Cloud portal page and all its resources.
    1. Follow the getting started Cribl.Cloud-hosted instance documentation
      https://docs.cribl.io/stream/deploy-cloud#getting-started
    2. Examine the available out-of-the-box cloud ports
      https://docs.cribl.io/stream/deploy-cloud/#ports-certs

7. Configure Cribl Sources and Destinations

As part of the exercise to prove your use case, we recommend you limit your evaluation to 3 sources and 3 destinations (or fewer). Note: as an alternative to setting up Sources and Destinations, you can use Cribl Packs and sample data for your evaluation. See steps 8 and 9 to use Packs and the included sample data.
  1. Configure Destinations first.
    1. Configure destinations one at a time. For each specified destination:
      1. After configuring the Cribl destination, reopen its config modal, select the Test tab, and click Run Test. Look for Success in the Test Results.
      2. At the destination itself (AWS S3, Azure Blob Storage, Google Cloud Storage), validate that the sample events sent through Cribl have arrived. For example, in AWS you can log into the management console and navigate through the S3 interface to find the Cribl generated files. https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html
  2. Configure Sources. Configure sources one at a time. For each specified source:
    1. Configure the sources in Cribl https://docs.cribl.io/stream/sources (For Distributed Deployment, remember to click Commit / Deploy button after you configure each Source to ensure it’s ready to use in your pipeline.)
    2. Note: If you need to test a hybrid environment, you will need to request an Enterprise trial entitlement. Use the chatbot below to make your request. Also note: there is no automated way to transfer configurations across Free instances and Enterprise trials. Once you’re squared away with an Enterprise entitlement, you can test hybrid deployments.
    3. Hybrid Workers (meaning, Workers that you deploy on-premises, or in cloud instances that you yourself manage) must be assigned to a different Worker Group than the Cribl-managed default Group – which can contain its own Workers:
      1. On all Workers’ hosts, port 4200 must be open for management by the Leader.
      2. On all Workers’ hosts, firewalls must allow outbound communication on port 443 to the Cribl.Cloud Leader, and on port 443 to https://cdn.cribl.io.
      3. If this traffic must go through a proxy, see System Proxy Configuration for configuration details.
      4. Note that you are responsible for data encryption and other security measures on Worker instances that you manage.
      5. See the available Source ports under Available Ports and TLS Configurations here
    4. For some Sources, you’ll see example configurations to send data to Cribl Worker nodes (on-prem and/or Cloud) at the bottom.
    5. Test that Cribl Stream is receiving data from your source.
      1. After configuring the Cribl Source and configuring the source itself (Syslog, Splunk Universal Forwarder, Elastic Beats, etc.), go to the Live Data tab and ensure that your results are coming into Cribl.Cloud.
      2. In some cases, you may want to change the time period for collecting data. Go to the Live Data tab, click Stop, then change Capture Time to 600 (seconds). Click Start. This gives you more time to test sending data into Cribl.

8. Configure Cribl QuickConnect or Routes

Another way you can get started quickly with Cribl is with QuickConnect or Routes.

Cribl QuickConnect lets you visually connect Cribl Stream Sources to Destinations using a simple drag-and-drop interface. If all you need are independent connections that link parallel Source/Destination pairs, Cribl Stream’s QuickConnect rapid visual configuration tool is a useful alternative to configuring Routes.

For maximum control, you can use Routes to filter, clone, and cascade incoming data across a related set of Pipelines and Destinations. If you simply need to get data flowing fast, use QuickConnect.

  1. Use QuickConnect to route your Source to your Destination.
    1. Configure QuickConnect.
    2. Initially, you may want to use the passthrough pipeline. This pipeline does not manipulate any data.
    3. Test your end-to-end connectivity by selecting Source -> Cribl Source -> Cribl QuickConnect -> Cribl Destination -> Destination.
  2. Alternatively, use Routes to route your Source to a Destination.
    1. Configure Routes with a suitable filter to match that data set
    2. Initially, you may want to use the passthrough pipeline. This pipeline does not manipulate any data.
    3. If data modifications are needed, create a pipeline and configure functions (like Eval, Drop, Suppress, Rename, Parser, Aggregations, etc.) as necessary, following the docs to configure appropriate Output Settings (ie Metrics Mode).
    4. Test your end-to-end connectivity by selecting Source -> Cribl Source -> Cribl Routes -> Cribl Destination -> Destination.
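A Route's filter is a per-event boolean expression. Conceptually it works like the Python below, though Cribl filters are actually written as JavaScript expressions, and the sourcetype value here is only an example:

```python
# Conceptual sketch of a Route filter: a predicate applied to each event.
def route_filter(event):
    # e.g. match only Palo Alto traffic logs on this Route
    return event.get("sourcetype") == "pan:traffic"

events = [
    {"sourcetype": "pan:traffic", "msg": "traffic log"},
    {"sourcetype": "wineventlog", "msg": "windows event"},
]
matched = [e for e in events if route_filter(e)]
print(len(matched))
```

Events that do not match fall through to the next Route, which is why Route ordering matters when you build out more than one.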

9. Capture Sample Files

Capture a sample data set for each Sourcetype.

Capturing a sample data set allows Cribl Pipelines and Packs to validate their logic against your sample data and show a before-and-after view, proving that your Reduction and Enrichment use cases are working.

  1. At the Source, go to Live Data -> Save a sample file (lower left corner).
  2. Rename the file and add the Sourcetype as part of the name.
  3. Select Remove the Expiration in 24 hours (this keeps the sample file indefinitely).
  4. Click Save.

As an alternative to capturing sample data at the Source, use QuickConnect to capture a sample dataset.

In the QuickConnect UI, the Capture button on the right captures a sample of the data flowing through the Source.

  1. Click on Capture sample data.
  2. Rename the file and add the Sourcetype as part of the name.
  3. Select Remove the Expiration in 24 hours (this keeps the sample file indefinitely).
  4. Click Save. (For Distributed Deployment, remember to click Commit / Deploy.)

As an alternative to capturing sample data at the Source, use Routes to capture a sample dataset:

  1. At the Route, click a Route’s Options (…) menu.
  2. Click Capture sample data.
  3. Rename the file and add the Sourcetype as part of the name.
  4. Remove Expiration in 24 hours (this keeps the sample file indefinitely).
  5. Click Save.

10. Create the Pipeline

For your use cases you will:

Streamline the number of fields or volume of data you send to your analysis tool:

  • Capture sample data to be used with the Pipeline
  • Create a Pipeline to reduce and manipulate the data. Use the Eval, Drop, Suppress, or Sample functions to streamline your data
  • Keep modifying, and compare the savings you achieve against your target using Cribl’s basic statistics UI (See: A Second Look at Our Data)

Modify Logs to Metrics:

  • Capture sample data to be used with the pipeline
  • Create a pipeline to transform and manipulate the data. Use the Publish Metrics, Aggregation, or Numerify functions
  • Keep on modifying and compare using the IN and OUT view

Enrich data with third-party sources:

  • Capture sample data to be used with the pipeline
  • Potentially create a Lookup Table or run a Rest API collector to populate a Lookup Table
  • Create a pipeline to Enrich data. Use the Eval, Lookup Table, or Redis functions. Potentially use the out-of-the-box Packs to see examples of Enrichment or creating Common Information Model in Cribl
  • Keep on modifying until the data OUT view matches what you’d expect to see

Transform data to prepare it with Common Information Model fields (for Splunk) or Elastic Common Schema (ECS for Elastic):

  • Capture sample data to be used with the pipeline
  • Create a pipeline to Transform data. Use the Eval, Rename, Auto Timestamp, or Mask functions.
  • Keep on modifying until the data OUT view matches what you’d expect to see
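Masking is a regex substitution applied before the event leaves the pipeline. Here is a minimal sketch of the idea behind the Mask function; the pattern and replacement are examples, not a complete PII policy:

```python
# Sketch of masking: redact sensitive values (here, a US SSN pattern) in the
# raw event text before it is sent downstream.
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def mask(raw):
    """Replace any SSN-shaped value with a fixed redaction string."""
    return SSN_RE.sub("***-**-****", raw)

print(mask("user=jdoe ssn=123-45-6789 action=update"))
```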

Packs enable Cribl Stream administrators to pack up and share Pipelines and Functions across organizations, and include sample data for testing. The following Packs might be helpful:

  • AWS VPC Flow for Security Teams
  • Cisco ASA
  • Cribl-AWS-Cloudtrail-logs
  • Cribl-Carbon-Black
  • Cribl-Fortinet-Fortigate-Firewall
  • Cribl Pack for Nix
  • CrowdStrike Pack
  • Microsoft Office Activity
  • Microsoft Windows Events
  • Palo Alto Networks
  • Splunk UF Internal Pack
  1. Select any relevant Packs.
    1. Choose up to 3 Packs for your data sources and add each Pack. Click the Pack listing in the Dispensary, or consult the Pack’s internal README.md, to see the benefits each Pack provides.
  2. Use Pack sample data
    1. Take any necessary additional steps to configure the Pack. Note that “Built by Cribl” Packs are supported via Cribl Community Slack; community-contributed Packs receive best-effort support. All Packs are validated prior to being publicly listed.
    2. Take a look at the Pack’s Pipelines, each of which has a corresponding sample file.
      1. With a Pipeline selected, open the sample file in Simple Preview mode.
      2. To see the before and after for that dataset, click Sample Data In / Sample Data Out.
      3. To view information about volume reduction, event reduction, and more, click Basic Statistics.
    3. Modify your QuickConnect or Route to include the Pack you want to add to Cribl.Cloud.

As an alternative to Packs and the out-of-the-box Pipelines that are part of the Packs, you can create your own Pipeline. Pipelines are Cribl’s main way to manipulate events. Examine Cribl Tips and Tricks for additional examples and best practices. Look for all the sections that have Try This at Home for Pipeline examples https://docs.cribl.io/stream/usecase-lookups-regex/#try-this-at-home

  1. Download the Cribl Knowledge Learning Pack from the Pack Dispensary for more cool Pipeline samples.
  2. Examine best practice links (for example, the Syslog best practices).
  3. Add a new pipeline, named after the dataset it will be used to process.
  4. Use the sample dataset you captured as you build your Pipeline.
  5. Add or edit functions to reduce, enrich, redact, aggregate, or shape your data as needed. Confirm the desired results in the Out view and basic statistics UI.

11. Review Your Results

  1. Repeat any of the above steps until the Source, QuickConnect, Routes, Pipelines, and Destinations are supporting your business case.
  2. Determine if you achieved your testing goals for this dataset, and note your results.
  3. Finally, summarize your findings. Common value areas our customers see include:
    1. Cost savings (infrastructure, license, cloud egress).
    2. Optimized analytics tools (prioritizing relevant data accelerates search and dashboard performance).
    3. Future-proofing (enabling choice and mitigating vendor-lock).

Please note: If you are working with an existing data source that is already being sent to your downstream systems, routing it through Cribl Stream may break existing dependencies on the original format of the data. Be sure to consult this Best Practices blog, or the users and owners of your downstream systems, before committing any data source to a destination from within Cribl Stream.

Technical Use Cases Tested:

  • Routing: Here’s how to see your results: To confirm data flow through the whole system we’ve built, select Monitoring > Data > Routes and examine demo.

    Also select Monitoring > Data > Pipelines and examine slicendice.

For additional examples, see:

When you’re convinced that Stream is right for you, reach out to your Cribl team and we can work with you on advanced topics like architecture, sizing, pricing, and anything else you need to get started!