Tips & Tricks for Real-Time DNS Log Analysis

Written by Dritan Bitincka

January 28, 2019

In a previous post we showed how to detect data exfiltration with LogStream in real time. The analysis focused on checking DNS labels from DNS logs for the presence of base64-encoded data.

In this post we will look at several other techniques that let security engineers add dimensions to the data and improve the fidelity and accuracy of their analysis.

1. Use Lookups to filter out events from known good domains

DNS logs for external requests are extremely noisy. Most of the traffic tends to go to the ~1K most popular domains (e.g. google.com, facebook.com, wikipedia.org … yes, aol.com, too), and the likelihood that any of those is fully compromised is pretty low. If you can live with that assumption, it makes sense to filter or sample their logs before the rest of the analysis is completed. This makes querying at the end system far more efficient. Full-fidelity data can still be sent to long-term storage for future analysis or to fulfill audit or compliance requirements.

First, get the top domains list from Alexa, Umbrella, Quantcast, or some other reputable source. These lists are usually pretty long, so feel free to truncate them to the top 1K-10K entries or so. Almost all will have a format similar to this:

rank,domain
1,google.com
2,youtube.com
3,facebook.com
...
N,amazon.com

Use the Regex Extract function to extract the subdomain and domain fields from the log.

Use the Lookup function to get the rank field for a given domain name.

Drop events that have a rank field (i.e. domain present in the list).

Notice how sample logs showing queries to Wikipedia, Facebook and Google have been dropped.
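
The steps above can be sketched outside of Cribl in a few lines of Python. This is only an illustration of the extract, lookup, and drop logic, not how the Regex Extract or Lookup functions are implemented. The sample list, the regex, and the names `load_ranks` and `filter_events` are assumptions made for this example, and the regex naively treats the last two labels as the domain, so it will mishandle suffixes such as co.uk.

```python
import csv
import io
import re

# Hypothetical top-domains list in the rank,domain format shown above.
TOP_DOMAINS_CSV = """rank,domain
1,google.com
2,youtube.com
3,facebook.com
"""

# Naive split (assumption): the last two labels are the domain,
# everything before them is the subdomain.
QUERY_RE = re.compile(r"^(?:(?P<subdomain>.+)\.)?(?P<domain>[^.]+\.[^.]+)$")


def load_ranks(csv_text):
    """Return a dict mapping domain -> rank from rank,domain CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["domain"].lower(): int(row["rank"]) for row in reader}


def filter_events(queries, ranks):
    """Yield events only for queries whose domain has no rank in the list."""
    for query in queries:
        name = query.rstrip(".")
        match = QUERY_RE.match(name)
        # Keep the subdomain's original case; it matters for base64 checks.
        subdomain = match.group("subdomain") if match else None
        domain = (match.group("domain") if match else name).lower()
        if domain in ranks:
            continue  # known popular domain: drop the event
        yield {"query": query, "subdomain": subdomain, "domain": domain}


ranks = load_ranks(TOP_DOMAINS_CSV)
sample = ["www.google.com", "aGVsbG8.evil.net", "facebook.com"]
for event in filter_events(sample, ranks):
    print(event)
```

With this sample input, only the evil.net query survives; the google.com and facebook.com queries are dropped because they have a rank.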

2. Use NOD or NRD lists

Newly Observed Domains (NODs) or Newly Registered Domains (NRDs) tend to be used by adversaries for additional malicious activities such as botnet coordination, spam or malware distribution. Companies such as Farsight Security track, curate and publish these lists daily. If you have access to such lists you can bring them into Cribl as lookups and do either of the following:

Use these lists instead of the above to base your filtering on, i.e., instead of dropping events that are from top 1k domains, you can drop those that are not in a NOD/NRD list.
Use these lists to add a field to each event if there is a match on the subdomain or the full domain.
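
As a rough sketch of the second option, the snippet below adds a flag to an event when either its domain or its full queried name appears in a NOD/NRD list. The list contents, the `is_new_domain` field name, and the `tag_new_domain` helper are hypothetical; a real feed would be loaded from the provider's file and brought into Cribl as a lookup.

```python
# Hypothetical NOD/NRD entries, used only for illustration.
NEW_DOMAINS = {"evil-new-site.net", "cdn.fresh-reg.io"}


def tag_new_domain(event, new_domains):
    """Return a copy of event with a boolean is_new_domain field.

    The field is True when either the domain or the full queried
    name (subdomain + domain) appears in the list.
    """
    domain = event["domain"].lower()
    subdomain = event.get("subdomain")
    full_name = f"{subdomain}.{domain}".lower() if subdomain else domain
    tagged = dict(event)
    tagged["is_new_domain"] = domain in new_domains or full_name in new_domains
    return tagged


print(tag_new_domain({"subdomain": "a1b2", "domain": "evil-new-site.net"}, NEW_DOMAINS))
```

Tagging rather than dropping keeps every event available downstream, so the flag can be combined with the other fields described below.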

3. Enrich events with enhanced DNS label string calculations

Decorating each of the remaining logs with additional fields helps with our analysis.

3.1. Improved base64 detection

Use the new C.Decode.base64() function and validate that its output is actually UTF-8 by passing the 'utf8-valid' parameter: C.Decode.base64(subdomain, 'utf8-valid'). This is necessary because not all subdomains are base64 encoded. For example, live in live.foo.com consists only of base64 characters, but it does not decode to valid UTF-8 text.
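
To make the idea concrete, here is a minimal Python sketch of "decode, then check that the result is valid UTF-8". It mirrors the intent of the 'utf8-valid' check but is not Cribl's implementation. The `decode_base64_utf8` name is hypothetical, and the sketch assumes the standard base64 alphabet with padding possibly stripped from the label.

```python
import base64


def decode_base64_utf8(label):
    """Return the decoded text if label is base64 for valid UTF-8, else None."""
    # DNS labels often omit '=' padding; restore it before decoding.
    padded = label + "=" * (-len(label) % 4)
    try:
        raw = base64.b64decode(padded, validate=True)
        return raw.decode("utf-8")
    except ValueError:
        # Covers both malformed base64 (binascii.Error) and bytes that are
        # not valid UTF-8 (UnicodeDecodeError); both subclass ValueError.
        return None


print(decode_base64_utf8("aGVsbG8"))  # hello
print(decode_base64_utf8("live"))     # None
```

Here "live" decodes without error at the base64 level, but the resulting bytes are not valid UTF-8, so it is rejected.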

3.2. Add subdomain’s (Shannon) `entropy` to each event

Another interesting dimension that we can add to each event is the Shannon entropy of the subdomain string. Linguistically speaking, the entropy of a word/string can be thought of as a measure of its characters’ randomness. Since base64-encoded exfil data and other malicious domains (e.g. those used by malware) typically have subdomains that look more random than “normal” ones, computing a randomness score will help us improve the fidelity of our analysis. Shannon entropy can be calculated with C.Text.entropy(subdomain).
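
For readers who want to see the calculation itself, the following is a minimal Python sketch of per-character Shannon entropy, H = -Σ p(c) · log2 p(c), over the characters of a string. The `shannon_entropy` name and the use of base-2 logarithms (bits per character) are assumptions here; the post does not state which base C.Text.entropy uses.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of text in bits per character (0.0 for empty input)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # Sum p * log2(p) over each distinct character's relative frequency.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


for label in ["aaaa", "abcd", "www", "aGVsbG8gd29ybGQ"]:
    print(label, round(shannon_entropy(label), 3))
```

A string of one repeated character scores 0, while four distinct characters score 2 bits, so longer and more varied labels drift toward higher values.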

3.3. Add DNS label directed divergence or `relativeEntropy` to each event

While Shannon entropy works well for random data, a potentially more accurate way to measure the gibberish-ness of domains/subdomains is one that compares their character distribution against a reference, non-random distribution. Per Wikipedia, “In mathematical statistics, the Kullback–Leibler divergence (also called relative entropy) is a measure of how one probability distribution is different from a second, reference probability distribution.” An example of a reference probability distribution would be the English language. An even better example is the distribution of characters that appear in the list of most popular domains and subdomains as tracked by Alexa, Umbrella, Quantcast, etc. And that’s exactly what the relativeEntropy function uses as a baseline model.

Notice that the relativeEntropy, as expected, results in a lower value than Shannon’s. If you’d like to read more about DNS entropy, here’s an excellent reference (in PDF) from DomainTools.
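
The sketch below shows one way Kullback–Leibler divergence can be computed for a DNS label against a reference character distribution. It is not the model behind the relativeEntropy function: the tiny reference corpus, the alphabet, the add-one smoothing, and the function names are all assumptions chosen to keep the example self-contained, so the absolute values will differ from what Cribl reports.

```python
import math
from collections import Counter

# Toy reference corpus standing in for a popular-domains list (assumption).
REFERENCE_LABELS = ["google", "youtube", "facebook", "wikipedia", "amazon", "www", "mail"]
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"


def char_distribution(labels, alphabet=ALPHABET):
    """Character probabilities with add-one smoothing so none is zero."""
    counts = Counter(ch for label in labels for ch in label.lower() if ch in alphabet)
    total = sum(counts.values()) + len(alphabet)
    return {ch: (counts[ch] + 1) / total for ch in alphabet}


def relative_entropy(text, reference):
    """KL divergence D(P || Q) in bits: P = characters of text, Q = reference."""
    chars = [ch for ch in text.lower() if ch in reference]
    if not chars:
        return 0.0
    total = len(chars)
    return sum(
        (n / total) * math.log2((n / total) / reference[ch])
        for ch, n in Counter(chars).items()
    )


reference = char_distribution(REFERENCE_LABELS)
for label in ["google", "q7x9zk"]:
    print(label, round(relative_entropy(label, reference), 3))
```

With this toy reference, a label built from characters that are common in the corpus diverges less than one built from rare characters. A realistic baseline would be derived from a full top-domains list.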

To use this data in Splunk, you can search or alert by referencing our new fields. E.g.:

index=myIndex sourcetype=infoblox:dns entropy>3.14 OR relative_entropy>2.4

In your system you may want to track length and entropies over time and adjust your searches accordingly.

Good luck!

Please check us out at Cribl.io and get started with your deployment. If you’d like more details on installation or configuration, see our documentation or join us in Slack #cribl, tweet at us @cribl_io, or contact us via hello@cribl.io. We’d love to help you!

Enjoy it! — The Cribl Team
