Cribl Search empowers users to explore and analyze data directly at its source. However, finding sample data for testing queries can be time-consuming. To overcome this, Cribl Search provides $vt_dummy, a built-in virtual table designed to generate dummy events on demand.
Think of $vt_dummy as a virtual data faucet for Cribl Search. It eliminates the need to gather and prepare external log data, allowing you to construct test scenarios directly within your queries and check that they behave as expected in a range of situations. For a deeper dive into $vt_dummy functionality and syntax, refer to the Cribl documentation for $vt_dummy.
The $vt_dummy table acts as a virtual data source within Cribl Search. This means it doesn’t store real data; instead, it generates events when you request them. To access it, simply use the following syntax in your search query:
// Create a single event using the "$vt_dummy" virtual table
dataset="$vt_dummy"
Adding comments (//) to searches for readability is a great practice. This base query will return a single event containing two predefined fields:
_time: The timestamp associated with the event.
dataset: $vt_dummy in this context.
The image above displays a single event with two predefined fields. While this provides a starting point, $vt_dummy’s true power lies in its ability to generate controlled sample data sets. In the upcoming sections, we’ll explore how to leverage parameters to customize the number of events, simulate long queries, and even create bursts of events within a specific timeframe.
While the basic dataset="$vt_dummy" query returns a single event, testing often requires controlled sets. This is where parameters come into play. Additional parameters can be added to the dataset="$vt_dummy" search to emulate real-world scenarios with your data.
In this example, we’ll use the event<NumberOfEvents parameter to specify the number of events to generate. Let’s look at the following query:
// Create five events using "$vt_dummy"
dataset="$vt_dummy" event<5
Here, event<5 is the parameter instructing $vt_dummy to generate five events. Each event will also contain the two predefined fields (_time and dataset). However, when using the event parameter, $vt_dummy automatically includes a third field named event. This event field provides a basic numbering system within your generated events, assigning a sequential number starting from 1 for each event. This numbering can help differentiate between events in your test scenarios.
This query will return five dummy events, each containing the following fields:
_time: Timestamp associated with the event.
dataset: $vt_dummy in this context.
event: Sequential number of the event.
By using the event<NumberOfEvents parameter, you can easily generate controlled sample data sets with automatic numbering for your Cribl Search query testing needs. In the upcoming sections, we’ll explore even more advanced functionalities of $vt_dummy to create useful test scenarios.
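To make the shape of these events concrete, here is a small Python sketch that models the rows described above. It is not how Cribl Search generates them; the function name dummy_events is hypothetical, and the numbering from 1 simply follows the description in this post.

```python
from datetime import datetime, timezone


def dummy_events(count):
    """Model the rows described for `dataset="$vt_dummy" event<count`."""
    now = datetime.now(timezone.utc)
    # Numbering from 1 follows the description in this post (assumption);
    # confirm against real query output before relying on it.
    return [
        {"_time": now, "dataset": "$vt_dummy", "event": n}
        for n in range(1, count + 1)
    ]


events = dummy_events(5)
for row in events:
    print(row["event"], row["dataset"])
```

Each row carries the same three fields, which is handy when you want to compare a real result set against what you expected to get back.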
So far, we’ve seen how to generate a single event and control the number of events with automatic numbering. But what if you need to test queries that handle data arriving over a specific time interval? This is where the second<SearchRuntime parameter comes into play.
The second<SearchRuntime parameter instructs $vt_dummy to generate events with timestamps spread over the specified number of seconds, as shown in the following query:
// Create one event per second using "$vt_dummy"
dataset="$vt_dummy" second<5
In this example, we added a comment (//). Comments can be used throughout your searches to help others understand what the search is doing. The second<5 parameter tells $vt_dummy to generate events with a one-second gap between their timestamps, effectively simulating events arriving over a five-second timeframe. Each event will also contain the two predefined fields (_time and dataset). In addition, $vt_dummy automatically generates a field named second when you use the second<SearchRuntime parameter. It assigns a sequential number starting from 0, indicating the order of the event within the simulated timeframe. This can help track the event sequence during testing.
This query will return five dummy events, each spaced one second apart, containing the following fields:
_time: Timestamp associated with the event, reflecting the one-second intervals.
dataset: $vt_dummy in this context.
second: Sequential number assigned based on the event order within the five-second timeframe (0-4 in this case).
By using the second<SearchRuntime parameter, you can simulate real-world scenarios where events are generated over a specific time period. This is valuable for testing how Cribl Search queries handle data streams. In the next section, we’ll explore another way to create bursts of events within a timeframe using both event and second in the search.
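The one-second spacing can be modeled in a few lines of Python. This is only a sketch of the behavior described above: the function name dummy_seconds, the fixed start time, and the choice to count timestamps forward from that start are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def dummy_seconds(seconds, start):
    """Model `dataset="$vt_dummy" second<seconds`: one row per second."""
    return [
        {"_time": start + timedelta(seconds=s), "dataset": "$vt_dummy", "second": s}
        for s in range(seconds)  # the second field counts up from 0
    ]


# Arbitrary fixed start time so the output is repeatable (assumption).
start = datetime(2024, 6, 25, 15, 42, 13, tzinfo=timezone.utc)
rows = dummy_seconds(5, start)

# Gap in seconds between each pair of neighboring rows.
gaps = [(b["_time"] - a["_time"]).total_seconds() for a, b in zip(rows, rows[1:])]
print([r["second"] for r in rows], gaps)
```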
Building on the concepts from the previous examples, let’s explore how to create bursts of events within a specific timeframe, while also capturing the order of events within that time frame. This can be helpful for simulating scenarios where you receive a surge of logs in a short period, and the order of those logs matters. In this example, we’ll combine the event and second parameters to achieve this. Let’s look at the following query:
dataset="$vt_dummy" event<3 second<5
Here, event instructs $vt_dummy to generate three events for each second, while second specifies a five-second timeframe. This combination creates a scenario with bursts of events, resulting in a total of approximately fifteen events.
Cribl Search will generate the event and second fields. The second field records which second of the timeframe each event belongs to (0-4 in the case of a five-second timeframe). The automatic event field will still be present, numbering the events within each second (1-3 in this case).
The exact timestamps and distribution of events may vary slightly due to Cribl Search’s internal processing. However, you can expect to see three events per second, each containing the following fields:
_time: Timestamp associated with the event.
dataset: $vt_dummy in this context.
event: Sequential number assigned to the event (1-3 in this case).
second: Sequential number assigned based on the event’s position within the five-second timeframe (0-4 in this case).
By combining the event and second parameters, you can create realistic scenarios with bursts of events distributed over a timeframe.
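One way to picture the burst is as a nested loop: for every second, emit a fixed number of events. The Python sketch below models that idea using the per-second numbering from the field list above (event 1-3, second 0-4). The function name dummy_burst is hypothetical, and the sketch says nothing about how Cribl Search produces the rows internally.

```python
def dummy_burst(events_per_second, seconds):
    """Model `event<events_per_second second<seconds` as a nested loop."""
    return [
        {"second": s, "event": e}
        for s in range(seconds)                   # second counts 0..seconds-1
        for e in range(1, events_per_second + 1)  # event restarts each second
    ]


burst = dummy_burst(3, 5)
print(len(burst))  # 3 events x 5 seconds
for row in burst[:4]:
    print(row)
```

Multiplying the two parameters gives a quick estimate of the result size before you run a larger test.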
Within the Cribl Search query pipeline, the extend operator lets you manipulate and enrich data generated by $vt_dummy, so you can simulate events with specific fields relevant to your testing scenario.
Here’s an example of how to use extend to add a custom field named foo to events generated by $vt_dummy:
dataset="$vt_dummy" event<2 | extend foo=42
This query will generate two Cribl Search events, each containing the following fields:
_time: Timestamp associated with the event.
dataset: $vt_dummy in this context.
event: Sequential number assigned to the event (1-2 in this case).
foo: A field containing a numeric value of 42.
By using the extend operator, you can create custom fields with various data types to simulate more complex log events for your Cribl Search query testing purposes.
In Cribl Search, the extend operator lets you craft diverse test data. It goes beyond fixed values, allowing you to create events with random and conditional elements using functions like rand and iif. Here’s a Cribl Search query demonstrating how to use extend with functions for random and conditional data generation:
dataset="$vt_dummy" event<3 second<4 | extend foo=rand(42),bar=iif(event%2>0, "Odd", "Even")
This query generates three events per second over four seconds for a total of twelve events, each containing two additional fields:
foo=rand(42): Generates a random integer between 0 (inclusive) and 42 (exclusive).
bar=iif(event%2>0, "Odd", "Even"): Applies conditional logic. It checks whether the event number divided by 2 has a remainder greater than 0. If true, it assigns Odd to the field; otherwise, Even.
By combining the rand and iif functions with the extend operator, you can create custom fields with various data types. Cribl Search offers a rich library of functions that can be used with the extend operator to generate even more intricate test scenarios, catering to diverse testing needs.
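If the two expressions are easier to read as ordinary code, the following Python sketch mirrors them. It assumes rand(42) behaves as described above (an integer from 0 up to but not including 42) and uses a seeded generator so runs are repeatable; extend_fields is a hypothetical helper name.

```python
import random


def extend_fields(event_number, rng):
    """Mirror `extend foo=rand(42), bar=iif(event%2>0, "Odd", "Even")`."""
    foo = rng.randrange(42)  # integer in [0, 42), as rand(42) is described
    bar = "Odd" if event_number % 2 > 0 else "Even"  # same test as the iif call
    return {"event": event_number, "foo": foo, "bar": bar}


rng = random.Random(0)  # seeded only to make the sketch repeatable
rows = [extend_fields(n, rng) for n in (1, 2, 3)]
for row in rows:
    print(row)
```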
In Cribl Search, the extend and sort operators work together to manipulate data for testing purposes. Here’s a Cribl Search query demonstrating how:
dataset="$vt_dummy" event<5 | extend _time = _time - rand(600), bar = iif(event%2>0, "Odd", "Even")
| sort by _time desc
This example showcases extend and sort working together. It shifts each timestamp back by a random number of seconds (from 0 up to, but not including, 600) and assigns Odd or Even to a new field, bar, based on the original event order. Finally, it sorts by _time in descending order. This demonstrates control over test data chronology for diverse scenarios.
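The same shift-then-sort idea looks like this in Python. It is a sketch under a few assumptions: _time is treated as epoch seconds, base_time is an arbitrary value chosen for illustration, and the generator is seeded so the result is repeatable.

```python
import random

rng = random.Random(1)       # seeded only to make the sketch repeatable
base_time = 1_700_000_000    # arbitrary epoch seconds (assumption)

# Shift each timestamp back by 0-599 seconds, like `_time - rand(600)`.
shifted = [
    {
        "event": n,
        "_time": base_time - rng.randrange(600),
        "bar": "Odd" if n % 2 > 0 else "Even",
    }
    for n in range(1, 6)
]

# Newest first, like `sort by _time desc`.
shifted.sort(key=lambda r: r["_time"], reverse=True)
for row in shifted:
    print(row["_time"], row["event"], row["bar"])
```

Because the shift is random, the event numbers usually come out in a different order than they were generated, which is the point of the exercise.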
This Cribl Search query injects randomness into _time and visualizes the distribution of events using the timestats operator to plot bar values (Odd or Even) over _time with one-minute spans:
dataset="$vt_dummy" event<5 | extend _time = _time - rand(600), bar = iif(event%2>0, "Odd", "Even")
| timestats span=1m count() by bar
By using timestats with span=1m and count(), you can visualize the distribution of events over one-minute time intervals, effectively analyzing the impact of randomized timestamps on event distribution.
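Conceptually, counting by one-minute span means rounding each timestamp down to the start of its minute and counting rows per (minute, bar) pair. The Python sketch below illustrates that bucketing with hand-picked timestamps; minute_counts is a hypothetical helper, and the sketch is a model of the idea rather than of how timestats is implemented.

```python
from collections import Counter


def minute_counts(rows):
    """Count rows per (minute bucket, bar), like a 1m span grouped by bar."""
    counts = Counter()
    for row in rows:
        bucket = row["_time"] - row["_time"] % 60  # floor to the start of the minute
        counts[(bucket, row["bar"])] += 1
    return counts


# Hand-picked epoch seconds so the buckets are easy to check by eye.
sample = [
    {"_time": 120, "bar": "Odd"},
    {"_time": 125, "bar": "Even"},
    {"_time": 179, "bar": "Odd"},
    {"_time": 180, "bar": "Odd"},
]
counts = minute_counts(sample)
for key in sorted(counts):
    print(key, counts[key])
```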
Let’s take Cribl Search queries to the next level! This example showcases manipulating the _raw field to craft realistic scenarios with randomized data. We’ll focus on randomizing the breed and timestamp in a sample access log using regular expressions and conditional logic.
Run the following search and review the breakdown below to understand how replace_regex and strftime work in this query:
dataset="$vt_dummy" event<2 second<5
| extend _raw = '82.34.111.190 - - [25/Jun/2024:15:42:13 -0500] "GET /products/goats/breeds?breed=Pygmy&sort=price_asc&page=2&limit=10 HTTP/1.1" 200 4230 "https://www.happybleats.com/breeds" "Mozilla/5.0 (iPhone; CPU OS 17_5 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Mobile/XXXXX Safari/604.1"'
| extend tmp_timestamp = strftime(_time,'%d/%b/%Y:%H:%M:%S %Z')
| extend _raw=replace_regex(_raw,@'(\d{2}\/\w{3}\/\d{4}:\d{2}:\d{2}:\d{2} [^\]].+?)\]',tmp_timestamp+']')
| extend tmp_breed=rand(6)+1, breed=case(tmp_breed==1, "Alpine", tmp_breed==2,"La Mancha",tmp_breed==3,"Nubian",tmp_breed==4,"Saanen",tmp_breed==5,"Boer","Kiko")
| extend _raw=replace_regex(_raw,@'(breed=)([^&]+?)(&)',@'\1'+breed+@'\3')
| summarize by breed
Generate events: dataset="$vt_dummy" event<2 second<5
Set _raw field: | extend _raw = '82.34.111.190…
Set tmp_timestamp to match the format in _raw: | extend tmp_timestamp = strftime(_time,'%d…
Replace timestamp in _raw: | extend _raw=replace_regex(_raw,@'(\d{2}\/\w…
Randomize breed: | extend tmp_breed=rand(6)+1, breed=case(tmp_breed==1…
Replace breed in _raw: | extend _raw=replace_regex(_raw,@'(breed=)…
Summarize by breed: | summarize by breed
By manipulating the _raw field and injecting randomized breeds, you can create access logs for in-depth analysis of query behavior when dealing with product variations or filtering criteria.
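If you want to try the two replacements outside of Cribl Search, the Python sketch below applies the same idea to a shortened copy of the sample log line. It is an approximation, not a port: the helper name randomize is hypothetical, Python’s %z stands in for the timezone offset, and the timestamp pattern matches only the text between the brackets so that the closing bracket is kept.

```python
import random
import re
from datetime import datetime, timedelta, timezone

# Shortened copy of the sample access log line used in the query above.
RAW = (
    '82.34.111.190 - - [25/Jun/2024:15:42:13 -0500] '
    '"GET /products/goats/breeds?breed=Pygmy&sort=price_asc&page=2&limit=10 HTTP/1.1" 200 4230'
)
BREEDS = ["Alpine", "La Mancha", "Nubian", "Saanen", "Boer", "Kiko"]


def randomize(raw, when, rng):
    """Swap in a new timestamp and a randomly chosen breed."""
    stamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    # Replace only the text inside the brackets; "[" and "]" stay in place.
    raw = re.sub(r"\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [^\]]+", lambda m: stamp, raw)
    breed = rng.choice(BREEDS)
    # Keep "breed=" and the trailing "&", replace only the value between them.
    raw = re.sub(r"(breed=)([^&]+?)(&)", lambda m: m.group(1) + breed + m.group(3), raw)
    return raw, breed


when = datetime(2024, 7, 1, 8, 0, 0, tzinfo=timezone(timedelta(hours=-5)))
line, breed = randomize(RAW, when, random.Random(0))
print(line)
```

Checking the rewritten line by eye (brackets intact, query string still well formed) is a quick way to validate the patterns before using them in a search.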
Cribl Search queries are essential for log data analysis, but finding sample data for testing can be a pain. This guide introduced you to $vt_dummy, a built-in virtual table that generates sample events on demand. With $vt_dummy, you can craft realistic test scenarios directly within your queries, ensuring they function flawlessly across various situations. From controlling the number of events to simulating bursts and manipulating timestamps, $vt_dummy empowers you to create comprehensive test data. This translates to time saved, improved query performance, and ultimately, a smoother Cribl Search experience.
Here’s a recap of what we covered:
$vt_dummy eliminates the need for external log data by generating events on demand, allowing you to construct test scenarios directly within your queries.
Parameters such as event<NumberOfEvents and second<SearchRuntime control how many events are generated and over what timeframe.
The extend operator lets you manipulate and enrich $vt_dummy’s events with custom fields and data types, making your test scenarios more realistic.
Functions like rand and iif can be used with the extend operator to create events with random values and conditional logic, catering to diverse testing needs.
The sort and extend operators work together to control the order and timestamps of events, allowing you to test queries involving time-based scenarios.
The timestats operator helps visualize the distribution of events over time intervals, making it easier to analyze the impact of randomized elements.
By manipulating the _raw field with regular expressions and conditional logic, you can craft realistic scenarios with randomized data, like access logs with varying product details.
Overall, $vt_dummy and its functionalities empower you to create comprehensive test scenarios for your Cribl Search queries, ensuring they function flawlessly across diverse data sets and situations. But wait, there’s more to come! While this blog post focused on $vt_dummy, Cribl Search offers a powerhouse of additional functionality to help you create sample events. Stay tuned for upcoming dives into the let statement, and the print, lookup, and externaldata operators! With these tools in your arsenal, you’ll be a Cribl Search master in no time.