Cribl QuickStart: Accelerate Cloud Migration

  • 1. Understand the Problem
  • 2. Implementation Overview
  • 3. Select Your Data Sources
  • 4. Select Your Destinations
  • 5. Prepare Your QuickStart Environment
  • 6. Launch the QuickStart by Registering for Your Free Cribl.Cloud Instance
  • 7. Configure Cribl Sources and Destinations
  • 8. Configure Cribl QuickConnect or Routes
  • 9. Capture Sample Files
  • 10. Create the Pipeline
  • 11. Review Your Results

1. Understand the Problem

Goal: Confidently migrate existing applications and tooling to the cloud (or to multiple clouds) on time and under budget.
Challenge: Reconfiguring architectures and data flows to ensure parity and visibility in the cloud (or multiple clouds), while keeping a handle on ingress and egress charges.

Example:
You are migrating a widely deployed application from an on-premises deployment to a cloud deployment, with the primary goals of optimized performance, reduced management overhead, and streamlined costs. This is your opportunity to address some ongoing challenges of your on-premises deployment, and to ensure parity between your old deployment and your new cloud deployment before fully switching over.

How Can Cribl Help?

By routing your data from existing sources to multiple destinations, you can verify data parity in your new cloud destinations before turning off your on-premises (or legacy) analytics, monitoring, storage, or database products and tooling. Further, Cribl can significantly reduce your costs by placing Worker Nodes inside your cloud to compress and move the data cheaply and effectively, reducing egress charges.

To do this, you will test and deploy several Cribl Stream technical use cases:

  • Routing: Route data to multiple destinations for analysis and/or storage. This gives teams confidence that they can ensure parity between on-prem and cloud deployments and reduce egress charges across zones and clouds, with the added bonus of accelerated data onboarding via in-stream normalization and enrichment.
  • Normalization: Prepare the data for the destination’s expected schema, e.g., Splunk Common Information Model (CIM) or Elastic Common Schema (ECS), to reduce the overhead of preparing and tagging the data.
  • Reduction: Send only the relevant data to your cloud tools to free up license headroom and reduce required infrastructure (many Cribl customers report 30%+ reductions on both counts). As an added benefit, with only relevant data reaching your destinations, you’ll see faster searches, dashboard loads, and more.
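To put rough numbers on the Reduction use case before you start, here’s a back-of-the-envelope sketch in Python. The ingest volume, the 30% reduction, and the $0.09/GB egress rate are all illustrative assumptions; substitute your own figures.

```python
# Back-of-the-envelope egress/license savings from in-stream reduction.
# All inputs are illustrative assumptions -- substitute your own numbers.

def daily_savings_gb(ingest_gb_per_day: float, reduction_pct: float) -> float:
    """GB/day that never leaves the zone (or never hits the license meter)."""
    return ingest_gb_per_day * reduction_pct / 100.0

def monthly_egress_savings_usd(ingest_gb_per_day: float,
                               reduction_pct: float,
                               egress_usd_per_gb: float = 0.09) -> float:
    """Approximate monthly egress savings; $0.09/GB is a placeholder rate."""
    return daily_savings_gb(ingest_gb_per_day, reduction_pct) * 30 * egress_usd_per_gb

# Example: 500 GB/day with the ~30% reduction many customers report.
saved = monthly_egress_savings_usd(500, 30)  # 500 * 0.30 * 30 days * $0.09
```

Even at placeholder rates, the arithmetic makes it easy to see why reduction before egress pays off.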

Before You Begin:

  • Review the relevant Cribl Sandboxes.
  • You’ll use Cribl.Cloud (https://docs.cribl.io/stream/deploy-cloud) for your QuickStart, so note the following:
    • Make sure you choose the correct region, either US West (Oregon) or US East (Virginia), so the Cribl Stream Workers sit closest to the point of egress and costs stay low. (It’s also wicked hard to change later.)
    • Cribl.Cloud Free/Standard does not include SSO.
    • Cribl.Cloud Free/Standard does not support hybrid deployments. If you need to test on-premises Workers, please request an Enterprise trial using the Chatbot below. Also note: there is no automated way to transfer configurations from Enterprise trials to Free instances.
    • Cribl Packs are out-of-the-box solutions for given technologies. You can combine Cribl.Cloud with sample data from a Pack. This combination may be enough for you to prove Cribl Stream will help you route data to analytics tools and low-cost storage. The following Packs might be helpful:
      • AWS VPC Flow for Security Teams
      • Cisco ASA
      • Cribl-AWS-Cloudtrail-logs
      • Cribl-Carbon-Black
      • Cribl-Fortinet-Fortigate-Firewall
      • Cribl Pack for Nix
      • CrowdStrike Pack
      • Microsoft Office Activity
      • Microsoft Windows Events
      • Palo Alto Networks
      • Splunk UF Internal Pack

What You’ll Achieve:

  • You’ll complete 3 technical use cases to support your business case.
    • A business case is the business outcome you’re trying to achieve. The technical use cases you create will illustrate how Cribl features will work in your environment. Typically, you will need multiple technical use cases to achieve your business case.
  • You’ll connect 1-2 sources to 1-2 destinations.
  • You’ll show that in your environment, with your data sources, you can:
    • Route data to multiple locations to ensure data flow and parity between your existing on-prem implementation and your new cloud implementation
    • Parse out duplicate and unnecessary fields from data sources before they are ingested in your analytics tools
    • Preprocess data before ingestion to onboard new data sources more quickly and easily

2. Implementation Overview

From your existing collectors and agents, set up destinations and pipelines for your new cloud destinations. (If you need new collectors or agents, learn more about Cribl Edge, a vendor-neutral, small-footprint agent that lets you configure which data to send from the edge to your destination. Edge also provides a clean UI to ease fleet management.)

    Identify the data being sent to each destination. For each type of data, you will:
    • Create a Route with a suitable filter to match that data set
    • If data modifications are needed, create a pipeline and configure functions as necessary
    • Set the Routes to send the data to the appropriate destination

For data that requires shaping or normalization, create a pipeline or use the out-of-the-box Packs:

    • Capture sample data to be used with the pipeline
    • Create a pipeline to shape the data. Use the Auto Timestamp, Event Breaker, Flatten, or Rename functions to modify fields
    • Keep modifying until the data OUT view matches what you’d expect to see

For data that requires reduction, create a pipeline or use the out-of-the-box Packs. (Most Packs help reduce data volumes by up to 30%.)

    • Capture sample data to be used with the pipeline
    • Create a pipeline to reduce and manipulate the data. Use the Eval, Drop, Suppress, or Sample functions for data reduction
    • Keep modifying and compare the savings you achieve in Cribl’s basic statistics UI (See: A Second Look at Our Data)

3. Select Your Data Sources

  • For ease of setup, we recommend choosing from the sources supported by the Packs listed in Before You Begin. For the full list of supported sources, see: https://docs.cribl.io/stream/sources/ . Cribl works with vendors across categories including Cloud, Analytics tools, SIEM, Object Store, and many others: https://cribl.io/integrations/
  • Choose from supported formats and source types: JSON, Key-Value, CSV, Extended Log File Format, Common Log Format, and many more out-of-the-box options. See our library for the full list.

Spec out each source:

  1. What’s the volume of that data source per day? (Find in Splunk | Find in Elastic)
  2. Is it a cloud source or an on-prem source?
  3. Do you need TLS, Certificates, or Keys to connect to the sources?
  4. What protocols are supported by both the source and by Cribl Stream?
  5. During your QuickStart, do you have access to the source from production, from a test environment, or not at all?
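If you can’t easily run the Splunk or Elastic volume queries linked above, a quick extrapolation from a raw log sample gives a rough estimate. This sketch assumes a newline-delimited sample file and a capture window that’s representative of a full day.

```python
# Rough daily-volume estimate from a raw log sample (an alternative to the
# Splunk/Elastic volume queries). Assumes the sample window is representative.

def estimate_gb_per_day(sample_bytes: int, sample_window_minutes: float) -> float:
    """Extrapolate a sample's size to GB/day."""
    minutes_per_day = 24 * 60
    bytes_per_day = sample_bytes * (minutes_per_day / sample_window_minutes)
    return bytes_per_day / 1024 ** 3

# Example: a 10-minute capture weighing 150 MB extrapolates to ~21 GB/day.
est = estimate_gb_per_day(150 * 1024 ** 2, 10)
```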

For your QuickStart, we recommend no more than 2 Sources.

Record the following for each Source:

  • Source
  • Collection method
  • Volume
  • Cribl Worker Node host name / IPs / load balancer
  • Configuration notes, TLS, certificates

4. Select Your Destinations

Where does your data need to go?

  1. Choose from the list of supported destinations: https://docs.cribl.io/stream/destinations
  2. Cribl works with the following vendor categories: Cloud, Search Engine, Object Store, SIEM, APM, and many more options: https://cribl.io/integrations/
  3. What volume of data do you expect to send to that destination per day?
  4. Is it a cloud or on-prem destination?
  5. Do you need TLS, Certificates, or Keys to connect to the destination?
  6. What protocols are supported by both the destination and by Cribl Stream?
  7. During your QuickStart, will you be sending data to production environments, test environments, or not at all?

For your QuickStart, we recommend no more than 2 Destinations.

Record the following for each Destination:

  • Destination
  • Sending method
  • Volume
  • Destination host name / IPs / load balancers
  • Configuration notes, TLS, certificates

5. Prepare Your QuickStart Environment

  1. Regardless of where your data resides, the fastest way to prove success with Cribl is with Cribl.Cloud. With Cribl.Cloud, there is no infrastructure to spin up and manage and no hardware to deal with. You can get straight to your observability pipeline planning, giving you greater control and choice over your data in our SOC 2-compliant Cribl.Cloud.
  2. Cribl.Cloud supports up to 1TB/day–the perfect capacity for a Cribl QuickStart. Once the evaluation is done and you are ready for production, you can increase the volume and allocate more capacity.

6. Launch the QuickStart by Registering for Your Free Cribl.Cloud Instance

  1. Once you’ve registered on the portal, sign in to Cribl.Cloud.
  2. Select the Organization you want to work with.
  3. From the portal page, select Manage Stream.
  4. The Cribl Stream interface will open in a new tab or window – and you’re ready to go!
  5. Notice the Cribl.Cloud link in the upper left of the Cribl.Cloud home page, under the Welcome message. Click this link at any time to reopen the Cribl.Cloud portal page and all its resources.
    1. Follow the getting started Cribl.Cloud-hosted instance documentation https://docs.cribl.io/stream/deploy-cloud#getting-started
    2. Examine the available out-of-the-box cloud ports https://docs.cribl.io/stream/deploy-cloud/#ports-certs

7. Configure Cribl Sources and Destinations

As part of the exercise to prove your use case, we recommend limiting your evaluation to one or two sources and one or two destinations.

Note: As an alternative to setting up Sources and Destinations, you can use Cribl Packs and their included sample data for your evaluation. See step 10 for using Packs and sample data.

  1. Configure Destinations first. Configure destinations one at a time. For each specified destination:
    1. Test that Cribl Stream is sending data to your destination:
      1. After configuring the Cribl Destination, reopen its config modal, select the Test tab, and click Run Test. Look for Success in the Test Results.
      2. At the destination itself (Splunk UI, Elasticsearch, Exabeam, etc.), validate that the sample events sent through Cribl have arrived. (For example, in the Splunk search bar you can run index=main cribl_test="*".)
  2. Configure Sources. Configure sources one at a time. For each specified source:
    1. Configure the sources in Cribl https://docs.cribl.io/stream/sources (For Distributed Deployment, remember to click Commit / Deploy button after you configure each Source to ensure it’s ready to use in your pipeline.)
    2. Note: If you need to test a hybrid environment, you will need to request an Enterprise trial entitlement. Use the chatbot below to make your request. Once you’re squared away with an Enterprise entitlement, you can test hybrid deployments.
    3. Hybrid Workers (meaning, Workers that you deploy on-premises, or in cloud instances that you yourself manage) must be assigned to a different Worker Group than the Cribl-managed default Group – which can contain its own Workers:
      1. On all Workers’ hosts, port 4200 must be open for management by the Leader.
      2. On all Workers’ hosts, firewalls must allow outbound communication on port 443 to the Cribl.Cloud Leader, and on port 443 to https://cdn.cribl.io.
      3. If this traffic must go through a proxy, see System Proxy Configuration for configuration details.
      4. Note that you are responsible for data encryption and other security measures on Worker instances that you manage.
      5. See the available Source ports under Available Ports and TLS Configurations here.
    4. For some Sources, you’ll see example configurations to send data to Cribl Worker nodes (on-prem and/or Cloud) at the bottom.
    5. Test that Cribl Stream is receiving data from your source.
      1. After configuring the Cribl Source and configuring the source itself (Syslog, Splunk Universal Forwarder, Elastic Beats, etc.), go to the Live Data tab and ensure that your results are coming into Cribl.Cloud.
      2. In some cases, you may want to extend the capture window. On the Live Data tab, click Stop, change Capture Time to 600 (seconds), then click Start. This gives you more time to test sending data into Cribl.
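If you’d rather not point a production forwarder at Cribl while testing, a short script can push tagged test events at a TCP JSON Source. The host and port below are placeholders (copy the real values from your instance’s Data Sources page), and this sketch assumes the Source has no auth token configured.

```python
import json
import socket
import time

# Placeholder endpoint -- replace with the ingest host and TCP JSON port
# shown on your Cribl.Cloud instance's Data Sources page.
CRIBL_HOST = "default.main.your-org.cribl.cloud"
CRIBL_PORT = 10070

def encode_events(events):
    """Serialize events as newline-delimited JSON, the framing a TCP JSON
    Source expects (one JSON object per line)."""
    return "".join(json.dumps(e) + "\n" for e in events).encode()

def send_test_events(host=CRIBL_HOST, port=CRIBL_PORT, count=5):
    """Open a TCP connection and push a handful of tagged test events."""
    events = [{"cribl_test": "quickstart", "seq": i, "_time": time.time(),
               "message": f"cribl_test event {i}"} for i in range(count)]
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(encode_events(events))

# send_test_events()  # then watch the Source's Live Data tab for the events
```

Tagging events with a field like cribl_test makes them easy to find again at the destination.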

8. Configure Cribl QuickConnect or Routes

Another way you can get started quickly with Cribl is with QuickConnect or Routes.

Cribl QuickConnect lets you visually connect Cribl Stream Sources to Destinations using a simple drag-and-drop interface. If all you need are independent connections that link parallel Source/Destination pairs, Cribl Stream’s QuickConnect rapid visual configuration tool is a useful alternative to configuring Routes.

For maximum control, you can use Routes to filter, clone, and cascade incoming data across a related set of Pipelines and Destinations. If you simply need to get data flowing fast, use QuickConnect.

  1. Use QuickConnect to route your Source to your Destination.
    1. Configure QuickConnect.
    2. Initially, you may want to use the passthrough pipeline. This pipeline does not manipulate any data.
    3. Test your end-to-end connectivity by selecting Source -> Cribl Source -> Cribl QuickConnect -> Cribl Destination -> Destination.
  2. Alternatively, use Routes to route your Source to a Destination or Devnull.
    1. Configure Routes.
    2. Initially, you may want to use the passthrough pipeline. This pipeline does not manipulate any data.
    3. If data modifications are needed, create a pipeline and configure functions (like Eval, Drop, Suppress, Rename, etc.) as necessary.
    4. Test your end-to-end connectivity by selecting Source -> Cribl Source -> Cribl Routes -> Cribl Destination -> Destination.

9. Capture Sample Files

Capture a sample data set for each Sourcetype.

Capturing a sample data set lets Cribl Pipelines and Packs validate their logic against your data, and shows a before-and-after view to prove that your Reduction and Enrichment use cases are working.

  1. At the Source, go to Live Data -> Save a sample file (lower left corner).
  2. Rename the file and add the Sourcetype as part of the name.
  3. Remove the 24-hour expiration (so the sample file is retained indefinitely).
  4. Click Save.

As an alternative to capturing sample data at the Source, use QuickConnect to capture a sample dataset.

In the QuickConnect UI, hover over the Destination and click Capture. The Capture button captures a sample of the data flowing from the Source.

  1. Click Capture sample data.
  2. Rename the file and add the Sourcetype as part of the name.
  3. Remove the 24-hour expiration (so the sample file is retained indefinitely).
  4. Click Save.

As an alternative to capturing sample data at the Source, use Routes to capture a sample dataset:

  1. Click the Route’s Options (…) menu.
  2. Click Capture sample data.
  3. Rename the file and add the Sourcetype as part of the name.
  4. Remove the 24-hour expiration (so the sample file is retained indefinitely).
  5. Click Save.

10. Create the Pipeline

For your use cases you will: 

Streamline the number of fields, or the volume of data, you send to your analysis tool:

    • Capture sample data to be used with the Pipeline
    • Create a Pipeline to reduce and manipulate the data. Use the Eval, Drop, Suppress, or Sample functions to streamline your data
    • Keep modifying and compare the savings you achieve in Cribl’s basic statistics UI (See: A Second Look at Our Data)
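Before you build the Pipeline, it can help to prototype the reduction offline against a saved sample. This Python sketch mimics what Drop- and Sample-style functions do (the field names and rules are invented for illustration) and reports the kind of percentage the basic statistics view will show.

```python
import json
import random

def reduce_events(events, drop_fields=("punct", "host_fqdn"), sample_rate=10):
    """Mimic Drop/Sample functions: strip noisy fields from every event, and
    keep only 1-in-N DEBUG events (field names and rules are illustrative)."""
    kept = []
    for ev in events:
        if ev.get("severity") == "DEBUG" and random.randrange(sample_rate) != 0:
            continue  # sampled out, as a Sample function would do
        kept.append({k: v for k, v in ev.items() if k not in drop_fields})
    return kept

def volume_reduction_pct(before, after):
    """Approximate the basic-statistics volume saving by serialized size."""
    b = sum(len(json.dumps(e)) for e in before)
    a = sum(len(json.dumps(e)) for e in after)
    return 100.0 * (b - a) / b

random.seed(7)  # deterministic demo
noisy = [{"severity": "DEBUG", "msg": f"probe {i}", "punct": "__--"}
         for i in range(100)]
slim = reduce_events(noisy)
```

Running a sketch like this over your own exported sample gives you a target figure to aim for in the real Pipeline.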

Modify Logs to Metrics:

    • Capture sample data to be used with the pipeline
    • Create a pipeline to transform and manipulate the data. Use the Publish Metrics, Aggregations, or Numerify functions
    • Keep modifying and compare using the IN and OUT views
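To preview what the aggregation step will emit, this sketch collapses raw events into per-group counts and averages, the general shape an Aggregations function produces. The status and response_ms field names are assumptions for illustration.

```python
from collections import defaultdict

def logs_to_metrics(events, group_by="status", value_field="response_ms"):
    """Collapse raw events into per-group count and average -- the general
    shape an aggregation step emits (field names here are illustrative)."""
    totals = defaultdict(lambda: [0, 0.0])  # group -> [count, sum]
    for ev in events:
        c = totals[ev[group_by]]
        c[0] += 1
        c[1] += float(ev[value_field])
    return {g: {"count": n, "avg": s / n} for g, (n, s) in totals.items()}

events = [
    {"status": 200, "response_ms": 12},
    {"status": 200, "response_ms": 18},
    {"status": 500, "response_ms": 103},
]
metrics = logs_to_metrics(events)
# Three raw events become two metric rows -- that collapse is the savings.
```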

Enrich data with third-party sources:

    • Capture sample data to be used with the pipeline
    • If needed, create a Lookup table, or run a REST API Collector to populate one
    • Create a pipeline to enrich the data. Use the Eval, Lookup, or Redis functions. Optionally, use the out-of-the-box Packs to see examples of enrichment, or of creating Common Information Model fields in Cribl
    • Keep modifying until the data OUT view matches what you’d expect to see
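Here’s an offline sketch of the enrichment step: a small CSV lookup table keyed on an event field, merged the way a Lookup function would merge it. The table contents and field names are invented for illustration.

```python
import csv
import io

# Illustrative lookup table -- in Stream this would be a Lookup file or Redis.
LOOKUP_CSV = """ip,owner,env
10.0.0.1,payments,prod
10.0.0.2,checkout,staging
"""

def load_lookup(text, key="ip"):
    """Parse a CSV lookup table into a dict keyed on one column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}

def enrich(event, table, key="src_ip"):
    """Merge matching lookup columns into the event, as a Lookup function does."""
    match = table.get(event.get(key), {})
    return {**event, **{k: v for k, v in match.items() if k != "ip"}}

table = load_lookup(LOOKUP_CSV)
enriched = enrich({"src_ip": "10.0.0.1", "msg": "login"}, table)
# enriched now carries owner and env columns from the table
```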

Transform data to prepare it with Common Information Model (CIM) fields (for Splunk) or Elastic Common Schema (ECS) fields (for Elastic):

    • Capture sample data to be used with the pipeline
    • Create a pipeline to transform the data. Use the Eval, Rename, Auto Timestamp, or Mask functions
    • Keep modifying until the data OUT view matches what you’d expect to see
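A minimal sketch of the transform step: renaming source fields toward a target schema and masking sensitive values, roughly what the Rename and Mask functions do. The field map and regex are illustrative, not real CIM/ECS mappings.

```python
import re

# Illustrative source-field -> ECS-field map; real mappings depend on your data.
ECS_RENAMES = {"src": "source.ip", "dst": "destination.ip", "user": "user.name"}

def to_ecs(event, renames=ECS_RENAMES):
    """Rename fields toward a target schema, like the Rename function."""
    return {renames.get(k, k): v for k, v in event.items()}

def mask_card_numbers(text):
    """Redact 16-digit card-like sequences, like a Mask function rule."""
    return re.sub(r"\b\d{16}\b", "####-MASKED-####", text)

shaped = to_ecs({"src": "10.0.0.1", "user": "alice"})
safe = mask_card_numbers("charged card 4111111111111111 ok")
```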

Packs enable Cribl Stream administrators to pack up and share Pipelines and Functions across organizations, and include sample data for testing. The following Packs might be helpful:

    • AWS VPC Flow for Security Teams
    • Cisco ASA
    • Cribl-AWS-Cloudtrail-logs
    • Cribl-Carbon-Black
    • Cribl-Fortinet-Fortigate-Firewall
    • Cribl Pack for Nix
    • CrowdStrike Pack
    • Microsoft Office Activity
    • Microsoft Windows Events
    • Palo Alto Networks
    • Splunk UF Internal Pack
  1. Select any relevant Packs.
    1. Choose up to 3 Packs for your data sources and add each Pack. Click the Pack listing in the Dispensary, or consult the Pack’s README.md, to see the benefits each Pack provides.
  2. Use Pack sample data
    1. Take any necessary additional steps to configure the Pack.
      Note that “Built by Cribl” Packs are supported via Cribl Community Slack.
      Community-contributed Packs are supported on a best-effort basis. All Packs are validated before being publicly listed.
    2. Take a look at the Pack’s Pipelines, each of which has a corresponding sample file.
      1. With a Pipeline selected, open the sample file in Simple Preview mode.
      2. To see the before and after for that dataset, click Sample Data In / Sample Data Out.
      3. To view information about volume reduction, event reduction, and more, click Basic Statistics.
    3. Modify your QuickConnect or Route to include the Pack you want to add to Cribl.Cloud.

As an alternative to Packs and the out-of-the-box Pipelines they include, you can create your own Pipeline. Pipelines are Cribl’s main way to manipulate events. Examine Cribl Tips and Tricks for additional examples and best practices, and look for the sections titled Try This at Home for Pipeline examples: https://docs.cribl.io/stream/usecase-lookups-regex/#try-this-at-home

  1. Download the Cribl Knowledge Learning Pack from the Pack Dispensary for more Pipeline samples, and examine the best-practice links (for example, the Syslog best practices).
  2. Add a new Pipeline, named after the dataset it will be used to process.
  3. Use the sample dataset you captured as you build your Pipeline.
  4. Add or edit functions to reduce, enrich, redact, aggregate, or shape your data as needed. Confirm the desired results in the OUT view and the basic statistics UI.

11. Review Your Results

  1. Repeat any of the above steps until the Source, QuickConnect, Routes, Pipelines, and Destinations are supporting your business case.
  2. Determine if you achieved your testing goals for this dataset, and note your results.
  3. Finally, summarize your findings. Common areas of value our customers see include:
    1. Cost savings (infrastructure, license, cloud egress).
    2. Optimized analytics tools (prioritizing relevant data accelerates search and dashboard performance).
    3. Future-proofing (enabling choice and mitigating vendor-lock).

Please note: If an existing data source is already being sent to your downstream systems, modifying its output in Cribl Stream may break existing dependencies on the original format of the data. Be sure to consult this Best Practices blog, or the users and owners of your downstream systems, before committing any data source to a destination from within Cribl Stream.

 

Technical Use Cases Tested:

  • Routing: Here’s how to see your results:
    To confirm data flow through the whole system you’ve built, select Monitoring > Data > Routes and examine your Routes.
    Also select Monitoring > Data > Pipelines and examine your Pipelines.

For additional examples, see:

When you’re convinced that Stream is right for you, reach out to your Cribl team and we can work with you on advanced topics like architecture, sizing, pricing, and anything else you need to get started!
