Cribl Search’s Secret Weapon: Sample Events Made Easy

August 19, 2024

Written by David Maislin

Categories: Cribl Search

Cribl Search empowers users to explore and analyze data directly at its source. However, finding sample data to test your queries against can be time-consuming. To overcome this, Cribl Search provides $vt_dummy, a built-in virtual table designed to generate dummy events on demand.

What is $vt_dummy?

Think of $vt_dummy as a virtual data faucet for Cribl Search. It eliminates the need to gather and prepare external log data, allowing you to construct test scenarios directly within your queries and verify that they behave as expected in various situations. For a deeper dive into $vt_dummy functionality and syntax, refer to the Cribl documentation for $vt_dummy.

Example 1: Accessing the $vt_dummy Table

The $vt_dummy table acts as a virtual data source within Cribl Search. This means it doesn’t store real data; instead, it generates events when you request them. To access it, simply use the following syntax in your search query:

// Create a single event using the "$vt_dummy" virtual table
dataset="$vt_dummy"

Adding // comments to searches for readability is a great practice. This base query will return a single event containing two predefined fields:

_time: This field represents the timestamp associated with the event.
dataset: $vt_dummy in this context.

This query returns a single event with two predefined fields. While this provides a starting point, $vt_dummy’s true power lies in its ability to generate controlled sample data sets. In the upcoming sections, we’ll explore how to use parameters to customize the number of events, simulate long-running queries, and even create bursts of events within a specific timeframe.

Example 2: Generating Multiple Events

While the basic dataset="$vt_dummy" query returns a single event, testing often requires controlled sets. This is where parameters come into play. Additional parameters can be added to the dataset="$vt_dummy" search to emulate real-world scenarios with your data.

In this example, we’ll use the event<NumberOfEvents parameter to specify the number of events to generate. Let’s look at the following query:

// Create five events using "$vt_dummy"
dataset="$vt_dummy" event<5

Here, event<5 is the parameter instructing $vt_dummy to generate five events. Each event will also contain the two predefined fields (_time and dataset). However, when using the event parameter, $vt_dummy automatically includes a third field named event. This event field provides a basic numbering system within your generated events, assigning a sequential number starting from 1 for each event. This numbering can help differentiate between events in your test scenarios.

This query will return five dummy events, each containing the following fields:

_time: Timestamp associated with the event.
dataset: $vt_dummy in this context.
event: Sequential number of the events.

By using the event<NumberOfEvents parameter, you can easily generate controlled sample data sets with automatic numbering for your Cribl Search query testing needs. In the upcoming sections, we’ll explore even more advanced functionalities of $vt_dummy to create useful test scenarios.

Example 3: Simulating Events Over Time

So far, we’ve seen how to generate a single event and control the number of events with automatic numbering. But what if you need to test queries that handle data arriving over a specific time interval? This is where the second<SearchRuntime parameter comes into play.

The second<SearchRuntime parameter instructs $vt_dummy to generate events with timestamps spread over the specified number of seconds, as shown in the following query:

// Create one event per second using "$vt_dummy"
dataset="$vt_dummy" second<5

As before, the // line is a comment that helps others understand what the search is doing. The second<5 parameter tells $vt_dummy to generate events with a one-second gap between their timestamps, simulating events arriving over a five-second timeframe. Each event contains the two predefined fields (_time and dataset), plus a field named second that $vt_dummy adds automatically whenever you use the second<SearchRuntime parameter. This field holds a sequential number starting from 0 that indicates the order of the event within the simulated timeframe, which helps you track the event sequence during testing.

This query will return five dummy events, each spaced one second apart, containing the following fields:

_time: Timestamp associated with the event, reflecting the one-second intervals.
dataset: $vt_dummy in this context.
second: Sequential number assigned based on the event order within the five-second timeframe (0-4 in this case).

By using the second<SearchRuntime parameter, you can simulate real-world scenarios where events are generated over a specific time period. This is valuable for testing how Cribl Search queries handle data streams. In the next section, we’ll explore another way to create bursts of events within a timeframe using both event and second in the search.

Example 4: Simulating Event Bursts

Building on the concepts from the previous examples, let’s explore how to create bursts of events within a specific timeframe, while also capturing the order of events within that time frame. This can be helpful for simulating scenarios where you receive a surge of logs in a short period, and the order of those logs matters. In this example, we’ll combine the event and second parameters to achieve this. Let’s look at the following query:

dataset="$vt_dummy" event<3 second<5

Here, event<3 instructs $vt_dummy to generate three events, while second<5 specifies a five-second timeframe. Together they produce a burst of three events in each of the five seconds, for a total of approximately fifteen events (3 × 5).

Cribl Search will generate both the event and second fields. The second field identifies which second of the timeframe an event belongs to (0-4 in the case of a five-second timeframe). The automatic event field will still be present, numbering the events within each second (1-3 in this case).

The exact timestamps and distribution of events may vary slightly due to Cribl Search’s internal processing. However, you can expect to see three events per second, each containing the following fields:

_time: Timestamp associated with the event.
dataset: $vt_dummy in this context.
event: Sequential number assigned to the event (1-3 in this case).
second: Sequential number assigned based on the event within the five-second timeframe (0-4 in this case).

By combining event and second parameters, you can create realistic scenarios with bursts of events distributed over a timeframe.
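To make the arithmetic concrete, here is a minimal Python sketch (not Cribl Search code) that models the result set described above. The helper name dummy_events and the fixed start time are hypothetical, and the numbering (event from 1, second from 0) simply follows the field list in this post rather than the engine's actual implementation.

```python
from datetime import datetime, timedelta, timezone


def dummy_events(events_per_second, seconds, start):
    """Model the rows described for: dataset="$vt_dummy" event<N second<M.

    Hypothetical helper: numbering follows this post's description
    (event counts from 1, second counts from 0).
    """
    rows = []
    for second in range(seconds):
        for event in range(1, events_per_second + 1):
            rows.append({
                "_time": start + timedelta(seconds=second),
                "dataset": "$vt_dummy",
                "event": event,
                "second": second,
            })
    return rows


start = datetime(2024, 8, 19, tzinfo=timezone.utc)  # arbitrary start time
rows = dummy_events(3, 5, start)
print(len(rows))  # 15
```

Each pass of the outer loop is one second of the timeframe, and the inner loop is the burst within that second, which is why the total is the product of the two parameters.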

Example 5: Generating Events with Custom Fields

Within the Cribl Search query pipeline, the extend operator empowers you to manipulate and enrich data generated by $vt_dummy, allowing you to simulate events with specific fields relevant to your testing scenario.

Here’s an example of how to use extend to add a custom field named foo to events generated by $vt_dummy:

dataset="$vt_dummy" event<2 | extend foo=42

This query will generate two Cribl Search events, each containing the following fields:

_time: Timestamp associated with the event.
dataset: $vt_dummy in this context.
event: Sequential number assigned to the event (1-2 in this case).
foo: A field containing a numeric value of 42.

By using the extend operator, you can create custom fields with various data types to simulate more complex log events for your Cribl Search query testing purposes.

Example 6: Creating Random Values Using Operators & Functions

In Cribl Search, the extend operator empowers you to craft diverse test data. It goes beyond fixed values, allowing you to create events with random and conditional elements using functions like rand and iif. Here’s a Cribl Search query demonstrating how to use extend with functions for random and conditional data generation:

dataset="$vt_dummy" event<3 second<4 | extend foo=rand(42),bar=iif(event%2>0, "Odd", "Even")

This query generates three events per second over four seconds for a total of twelve events, each containing two additional fields:

foo=rand(42): Generates a random integer between 0 (inclusive) and 42 (exclusive).
bar=iif(event%2>0, "Odd", "Even"): This function uses conditional logic. It checks if the event number divided by 2 has a remainder greater than 0. If true, it assigns Odd to the field, otherwise Even.

By combining rand and iif functions with the extend operator, you can create custom fields with various data types. Cribl Search offers a rich library of functions that can be used with the extend operator to generate even more intricate test scenarios, catering to diverse testing needs.
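If it helps to reason about the two functions outside of Cribl Search, the following Python sketch mimics them under the semantics described above. The iif helper here is a hypothetical stand-in written for this example, and random.randrange(42) is assumed to be a fair analogue of rand(42) because it also returns an integer from 0 up to, but not including, 42.

```python
import random


def iif(condition, if_true, if_false):
    """Stand-in for iif(): pick one of two values based on a condition."""
    return if_true if condition else if_false


rows = []
for event in range(1, 4):  # event numbers as described in this post
    rows.append({
        "event": event,
        "foo": random.randrange(42),  # integer in [0, 42)
        "bar": iif(event % 2 > 0, "Odd", "Even"),
    })

for row in rows:
    print(row)
```

The foo values change on every run, while bar depends only on the event number, so it is the stable field to filter or group on when checking a query.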

Example 7: Randomizing and Sorting by Time in Descending Order

In Cribl Search, the extend and sort operators work together to manipulate data for testing purposes. Here’s a Cribl Search query demonstrating how:

dataset="$vt_dummy" event<5 | extend _time = _time - rand(600), bar = iif(event%2>0, "Odd", "Even")
| sort by _time desc

This example showcases extend and sort working together. It shifts each timestamp back by a random number of seconds (0 to 599) and assigns a value of Odd or Even to a new field bar based on the event number. Finally, it sorts by _time in descending order. This demonstrates control over test data chronology for diverse scenarios.
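The same shift-then-sort idea can be sketched in Python to see why sorting matters once timestamps are randomized. This is an illustrative model only: the base time is arbitrary, and random.randrange(600) is assumed to play the role of rand(600).

```python
import random
from datetime import datetime, timedelta, timezone

now = datetime(2024, 8, 19, 12, 0, 0, tzinfo=timezone.utc)  # arbitrary base time

# Five events that all start at the same time, then get shifted back
# by a random 0-599 seconds, like `_time - rand(600)`.
events = [
    {"event": n, "_time": now - timedelta(seconds=random.randrange(600))}
    for n in range(1, 6)
]

# Equivalent of `sort by _time desc`: newest first.
events.sort(key=lambda e: e["_time"], reverse=True)

for e in events:
    print(e["_time"].isoformat(), e["event"])
```

After the shift, the event numbers usually no longer line up with time order, which is the situation the sort step is meant to handle.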

Example 8: Visualizing Random Time-Shifted Events

This Cribl Search query injects randomness into _time and visualizes the distribution of events, using the timestats operator to plot the count of each bar value (Odd or Even) over _time with one-minute spans:

dataset="$vt_dummy" event<5 | extend _time = _time - rand(600), bar = iif(event%2>0, "Odd", "Even")
| timestats span=1m count() by bar

By using timestats with span=1m and count(), you can visualize the distribution of events over one-minute time intervals, effectively analyzing the impact of randomized timestamps on event distribution.
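As a rough mental model of the bucketing, the Python sketch below groups a handful of hand-picked events into one-minute buckets and counts them per bar value. It is not how timestats is implemented; the sample timestamps are made up for illustration, and truncating to the start of the minute is an assumption about how a one-minute span aligns.

```python
from collections import Counter
from datetime import datetime, timezone

# Hand-picked (timestamp, bar) pairs standing in for time-shifted events.
events = [
    (datetime(2024, 8, 19, 12, 0, 5, tzinfo=timezone.utc), "Odd"),
    (datetime(2024, 8, 19, 12, 0, 40, tzinfo=timezone.utc), "Even"),
    (datetime(2024, 8, 19, 12, 0, 59, tzinfo=timezone.utc), "Odd"),
    (datetime(2024, 8, 19, 12, 3, 10, tzinfo=timezone.utc), "Even"),
    (datetime(2024, 8, 19, 12, 3, 20, tzinfo=timezone.utc), "Odd"),
]

# Truncate each timestamp to the start of its minute (span=1m),
# then count events per (minute, bar) pair (count() by bar).
counts = Counter(
    (ts.replace(second=0, microsecond=0), bar) for ts, bar in events
)

for (minute, bar), n in sorted(counts.items()):
    print(minute.strftime("%H:%M"), bar, n)
```

Minutes with no events simply have no entry in this model, so gaps in the output correspond to empty stretches of the randomized timeline.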

Example 9: Simulating Access Logs with Randomized Breeds

Let’s take Cribl Search queries to the next level! This example showcases manipulating the _raw field to craft realistic scenarios with randomized data. We’ll focus on randomizing the breed and timestamp in a sample access log using regular expressions and conditional logic.

Run the following search and review the breakdown below to understand how replace_regex and strftime work in this query:

dataset="$vt_dummy" event<2 second<5
| extend _raw = '82.34.111.190 - - [25/Jun/2024:15:42:13 -0500] "GET /products/goats/breeds?breed=Pygmy&sort=price_asc&page=2&limit=10 HTTP/1.1" 200 4230 "https://www.happybleats.com/breeds" "Mozilla/5.0 (iPhone; CPU OS 17_5 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Mobile/XXXXX Safari/604.1"'
| extend tmp_timestamp = strftime(_time,'%d/%b/%Y:%H:%M:%S %Z')
| extend _raw=replace_regex(_raw,@'(\d{2}\/\w{3}\/\d{4}:\d{2}:\d{2}:\d{2} [^\]].+?)\]',tmp_timestamp+']')
| extend tmp_breed=rand(6)+1, breed=case(tmp_breed==1, "Alpine", tmp_breed==2,"La Mancha",tmp_breed==3,"Nubian",tmp_breed==4,"Saanen",tmp_breed==5,"Boer","Kiko")
| extend _raw=replace_regex(_raw,@'(breed=)([^&]+?)(&)',@'\1'+breed+@'\3')
| summarize by breed

Breakdown of the Cribl Search:

Generate events: dataset="$vt_dummy" event<2 second<5
Set _raw field: | extend _raw = '82.34.111.190…
Set tmp_timestamp to match the format in _raw: | extend tmp_timestamp = strftime(_time,'%d…
Replace timestamp in _raw: | extend _raw=replace_regex(_raw,@'(\d{2}\/\w…
Randomize breed: | extend tmp_breed=rand(6)+1, breed=case(tmp_breed==1…
Replace breed in _raw: | extend _raw=replace_regex(_raw,@'(breed=)…
Summarize by breed: | summarize by breed

By manipulating the _raw field and injecting randomized breeds, you can create access logs for in-depth analysis of query behavior when dealing with product variations or filtering criteria.
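For readers who want to test the two regular expressions outside of Cribl Search, here is a hedged Python equivalent using the standard re module. The shortened log line, the fixed replacement timestamp, and the use of random.choice in place of rand(6)+1 with case are all simplifications made for this sketch.

```python
import random
import re

raw = (
    '82.34.111.190 - - [25/Jun/2024:15:42:13 -0500] '
    '"GET /products/goats/breeds?breed=Pygmy&sort=price_asc&page=2&limit=10 HTTP/1.1" '
    '200 4230'
)

breeds = ["Alpine", "La Mancha", "Nubian", "Saanen", "Boer", "Kiko"]
breed = random.choice(breeds)

# Keep "breed=" (group 1) and "&" (group 3); swap only the value between them.
# A function replacement avoids any escaping surprises in the breed text.
raw = re.sub(
    r"(breed=)([^&]+?)(&)",
    lambda m: m.group(1) + breed + m.group(3),
    raw,
)

# The timestamp pattern matches through the closing "]", so this sketch
# writes the bracket back after the new timestamp.
new_timestamp = "19/Aug/2024:09:30:00 +0000"  # illustrative value
raw = re.sub(
    r"\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [^\]].+?\]",
    new_timestamp + "]",
    raw,
)

print(raw)
```

Trying the patterns against a copy of your own log line this way is a quick check that the capture groups land where you expect before you run the full search.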

Key Takeaways

Cribl Search queries are essential for log data analysis, but finding sample data for testing can be a pain. This guide introduced you to $vt_dummy, a built-in virtual table that generates sample events on demand. With $vt_dummy, you can craft realistic test scenarios directly within your queries, ensuring they function flawlessly across various situations. From controlling the number of events to simulating bursts and manipulating timestamps, $vt_dummy empowers you to create comprehensive test data. This translates to time saved, improved query performance, and ultimately, a smoother Cribl Search experience.

Here’s a recap of what we covered:

A Virtual Data Faucet: $vt_dummy eliminates the need for external log data by generating events on-demand, allowing you to construct test scenarios directly within your queries.
Generating Sample Events: You can control the number of events, simulate events over time, and even create bursts of events within a specific time frame using parameters like event<NumberOfEvents and second<SearchRuntime.
Customizing Test Data: The extend operator empowers you to manipulate and enrich $vt_dummy’s events with custom fields and data types, making your test scenarios more realistic.
Advanced Data Manipulation: Cribl Search functions like rand and iif can be used with the extend operator to create events with random values and conditional logic, catering to diverse testing needs.
Sorting and Time Manipulation: The sort and extend operators work together to control the order and timestamps of events, allowing you to test queries involving time-based scenarios.
Visualizing Random Data Distribution: The timestats operator helps visualize the distribution of events over time intervals, making it easier to analyze the impact of randomized elements.
Simulating Complex Scenarios: By manipulating the _raw field with regular expressions and conditional logic, you can craft realistic scenarios with randomized data, like access logs with varying product details.

Wrap up

Overall, $vt_dummy and its functionalities empower you to create comprehensive test scenarios for your Cribl Search queries, ensuring they function flawlessly across diverse data sets and situations. But wait, there’s more to come! While this blog post focused on $vt_dummy, Cribl Search offers a powerhouse of additional functionality to help you create sample events. Stay tuned for upcoming dives into the let statement, and the print, lookup, and externaldata operators! With these tools in your arsenal, you’ll be a Cribl Search master in no time.



Written by

David Maislin
