
Using Webhooks in LogStream to Trigger Incidents in the PagerDuty API

Written by Jon Rust

July 20, 2021

Webhook destinations have been available in LogStream since 2020 (LogStream version 2.4.4), and Packs since July of 2021. In this blog post, we’ll cover using Webhooks to trigger incidents in the PagerDuty API, and the Cribl Webhook PagerDuty Pack we created to demonstrate how Packs make deployment easier.

Sending Notifications via Webhooks

LogStream’s core competency is providing an observability pipeline: Streaming events from sources to destinations with a library of functions to route, reduce, transform, enrich, and replay that data. Normally this entails taking some subset of events from a given log producer and delivering them to one or more destinations.

The Webhook destination adds a new wrinkle to LogStream’s arsenal. It provides a way to call out to external services, using industry-standard HTTPS and JSON, to trigger events elsewhere. If a service you use accepts a JSON payload via POST, PUT, or PATCH, LogStream can use it to bridge the gap between your machine data and third-party functionality.
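PagerDuty’s Events API v2 is a good example of such a service: it accepts a JSON body like the following via POST to its enqueue endpoint. All values shown here are placeholders; the routing_key is your own integration key from PagerDuty.

{
  "routing_key": "YOUR_INTEGRATION_KEY",
  "event_action": "trigger",
  "dedup_key": "worker-1:splunk-hec",
  "payload": {
    "summary": "Persistent queue growing on worker-1",
    "source": "worker-1",
    "severity": "warning"
  }
}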

In this case, we’re going to use the Webhook feature to fire based on data in the Cribl Internal Metrics data source. If any outputs on workers begin queuing up, an event will be sent to PagerDuty, where the usual acknowledgement-and-resolution workflow takes over.

Let’s get started.

Persistent Queues

Destinations in LogStream have three options when the destination service can’t be reached: Drop the event, block the pipeline, or queue the data. In the case of queuing the data, LogStream will locally store the events until the service is available again, at which point the queue will drain. But… turns out, disk resources aren’t unlimited! Who knew?!

I absolutely recommend having a proper disk-space monitor on all your servers, regardless of persistent queuing (PQ) settings. Nagios and hundreds of other monitoring tools are purpose-built for this. But by watching LogStream’s internal metrics logs, we can add another canary in the coal mine.

Internal Metrics

If you’ve pulled up the Monitoring panel in LogStream, you’ve seen internal metrics at work. But did you know you can treat those metrics like any other log source? Once you enable the source, you can route, transform, and aggregate any way you see fit before delivery to any of our supported destinations. (Hint: We have a Splunk app for LogStream monitoring.) So that will be our first stop.

Navigate to the Sources screen, and click the Cribl Internal icon. Then enable the Cribl Internal metrics log source. The default values are fine.

PagerDuty API Webhook Delivery Setup

Next, navigate to the Destinations configuration screen. Select Webhook.

Click Add New, and use the following URL:

https://events.pagerduty.com/v2/enqueue

Your config screen should look something like this:

[Screenshot: PagerDuty API integration with LogStream Webhooks]

Additionally, under Post-Processing, clear the System fields input.

The Pipeline

As usual, all the fancy work is done in the pipeline. I’m going to walk through the functions used in the pipeline here, but you can get the same functionality with the Pack, discussed in the last section of this post.

Navigate to Pipelines and create a new one. I’ve named mine trigger_pipe.

Next, let’s add a drop function to only allow the events we want to examine in the rest of the pipeline. For queue watching, we’ll use this filter expression, and enable the Final flag:

!((_metric == 'cribl.logstream.pq.queue_size') && (_value > 0) )

In other words, if the event is not a pq.queue_size event with _value > 0, it will be dropped.
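For context, here’s a hypothetical internal metric event that would survive this filter. The _metric and _value fields are what the filter tests; the host and output fields (illustrative names here) are what we’ll lean on later for deduplication:

{
  "_metric": "cribl.logstream.pq.queue_size",
  "_value": 52428800,
  "host": "worker-1",
  "output": "splunk-hec"
}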

Next, we’ll add the Eval function. This is the core of the whole pipeline. We need to define the fields the PagerDuty API needs while integrating relevant info from the event into the payload. The screenshot below shows the basics. You’ll need to enter your own Integration token from PagerDuty in place of the routing_key shown below.

[Screenshot: PagerDuty API integration with LogStream Webhooks]

In Keep Fields, enter: routing_key event_action dedup_key payload*

In Drop Fields, enter: *
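For reference, the Eval amounts to adding fields along these lines (Name on the left, Value Expression on the right). The dedup_key and payload expressions are sketches, not the Pack’s exact contents; adjust the field references to match your events. PagerDuty’s v2 payload requires summary, source, and severity:

routing_key    'YOUR_INTEGRATION_KEY'
event_action   'trigger'
dedup_key      `${host}:${output}`
payload        { summary: `Persistent queue growing on ${host} (output: ${output})`, source: host, severity: 'warning' }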

Finally, add a Suppress function to limit how many events we send to PagerDuty. We’re going to use dedup_key (defined in the Eval above) as the Key Expression, and a 5-minute window (300 seconds) as the Suppression Period. In other words, for every host-output combination, we’ll allow one event through every 5 minutes.

The Route

Finally, create a new route with a Filter of __inputId=='cribl:CriblMetrics', point it at your new pipeline, and select your Webhook destination.

Testing PagerDuty API integration

You can force a queuing action by blocking access to a destination, or stopping a service. In my lab, I used a Splunk HEC endpoint for delivery of Datagen-created events. To test failure, I simply stopped the Splunk service. Within 30 seconds, a new incident should show up in your PagerDuty control panel.
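If no incident appears, you can sanity-check your integration key outside of LogStream with a direct call to the Events API. Something like this (with your own key substituted, and placeholder payload values) should raise a test incident:

curl -X POST https://events.pagerduty.com/v2/enqueue \
  -H 'Content-Type: application/json' \
  -d '{"routing_key": "YOUR_INTEGRATION_KEY", "event_action": "trigger", "payload": {"summary": "Test incident from my LogStream lab", "source": "worker-1", "severity": "warning"}}'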

Conclusion on Webhooks in LogStream and the PagerDuty API

Using an aggressive filtering option and a Suppress function, we can pare down our stream to just a few key events to trigger Webhook events, and in turn, PagerDuty incidents. While we used internal metrics for this example, any event in your pipeline could be used as a trigger just as easily. The Webhook destination adds more flexibility to an already robust set of options in LogStream.

But Wait… Packs?

LogStream Packs simplify deployment. There are two primary cases that Packs play into:

  1. The overworked admin, trying to maintain pipelines and knowledge artifacts in worker groups spread across their estate.
  2. The Cribl Community, sharing innovative and possibly complex recipes that others could make use of.

In both cases, it can be cumbersome to ensure pipelines, routes, lookups, and other artifacts are all accounted for in each install.

The Pack concept solves this by letting you bundle these things into a portable file format, easily installed into any LogStream 3.0+ installation. The PagerDuty API example above isn’t terribly complicated, but distributing it using a Pack makes it all the easier. No chance of typos, no missed steps in the Pipeline. Just drop it in, fill in your integration token, and you’re off to the races. When creating more complex Pipelines that involve lookups and other elements, the Pack advantage will be even more clear.

In fact, I’ve added an extra tweak to this Pack as a demonstration. Instead of hard-coding your integration token in an Eval, I’ve included a lookup file. Based on the host that sent the internal message, you can use different tokens; maybe your staging team has a different PagerDuty setup than your production team. Since it’s a Pack, there’s no need for you to bother with creating the lookup. The plumbing’s all in place; just put your values in.
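As a hypothetical sketch of the idea (see the Pack’s Readme for the actual file and column names), the lookup maps hosts to routing keys, something like:

host,routing_key
stage-worker-1.example.com,STAGING_INTEGRATION_KEY
prod-worker-1.example.com,PRODUCTION_INTEGRATION_KEY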

Once you’ve signed up for LogStream, you can find the Pack for Cribl Internal Metrics-triggered PagerDuty alerts in Cribl’s GitHub. Included with the Pack is a Readme with notes on how to configure it. Hopefully this post was interesting and showed you something new, but with Packs, you don’t need explanations at this level of detail. Grab the Pack, install it, and follow the directions.

The fastest way to get started with Cribl LogStream is to sign up at Cribl.Cloud. You can process up to 1 TB of throughput per day at no cost. Sign up and start using LogStream within a few minutes.

Questions about our technology? We’d love to chat with you.