Using LogStream to Detect Data Exfil Over DNS Logs in Real-Time

Written by Dritan Bitincka

December 3, 2018

Update: Part 2 is now here

The massive data breach of Marriott’s Starwood reservation system got me thinking about various data exfiltration techniques, including exfiltration over DNS. DNS was probably not involved in this breach, but that random thought made me realize that Cribl LogStream can help security practitioners and threat hunters here.

As you may know, data exfiltration is a well-known adversary tactic. MITRE has a whole lot of content in its MITRE ATT&CK™ knowledge base dedicated to exfiltration techniques. In an exfiltration scenario, data from machines infected with malware or spyware is sent to a remote destination that acts as a command and control (C2/C&C) server. To keep the communication alive for as long as possible, “covert” or alternate communication channels and protocols are typically employed. One of the most common is DNS. The fundamental idea is to perform the exfiltration between the two parties over DNS requests. The (malware-infected) client sends out requests, for example for bXlwYXNzd29yZA==.foobar.com, and by virtue of its distributed nature the DNS infrastructure will propagate them to that domain’s authoritative name servers, which are owned by the adversary. The servers will reply, but by that point the adversary has already received the exfiltrated bXlwYXNzd29yZA==. The client can make more than one request, and the remote server can easily stitch the exfil’d fragments into a full dataset. In most cases, base64 encoding is used for exfiltration because it can carry a wide variety of formats, including binary.
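To make the mechanics concrete, here is a minimal Python sketch of the encode-and-reassemble idea described above. The domain foobar.com comes from the example; the function names, the chunk size, and the assumption that queries arrive in order are all illustrative and not taken from any real tool.

```python
import base64
from typing import List

def to_queries(data: bytes, domain: str = "foobar.com", chunk_bytes: int = 30) -> List[str]:
    """Split data into chunks and encode each one as a base64 label under `domain`.

    30 raw bytes become 40 base64 characters, which stays below the
    63-character limit for a single DNS label.
    """
    queries = []
    for i in range(0, len(data), chunk_bytes):
        label = base64.b64encode(data[i:i + chunk_bytes]).decode("ascii")
        queries.append(f"{label}.{domain}")
    return queries

def reassemble(queries: List[str]) -> bytes:
    """Roughly what the adversary's name server would do: decode the first
    label of each query and join the fragments (assumes in-order arrival)."""
    return b"".join(base64.b64decode(q.split(".", 1)[0]) for q in queries)
```

Calling `to_queries(b"mypassword")` yields the single query name used in the example, `bXlwYXNzd29yZA==.foobar.com`.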

There are several ways to minimize damage from this, such as limiting internal machines to only talk to internal DNS servers that have been hardened, but DNS is such a critical and pervasive service that complete protection via hardening or lockdown cannot be guaranteed.

The least that an organization can do is collect enough data to see whether there are signs of exfiltration activity. That basically means one thing: Log ALL your DNS queries!

Detecting Base64 encoded fields with Cribl in real-time


In Cribl, detecting whether part of a string is base64 encoded can be done using a regex and our native base64 decoder function. Let’s take a look:

1. Ensure that DNS data passes through Cribl. This may include, but is not limited to, sources such as Windows DNS, Bro DNS activity, Infoblox, Cisco Umbrella, Amazon Route 53, etc. If this data is coming into Splunk, you’re already covered and LogStream can install as an app. If you have this data elsewhere, you can either send it directly to Cribl using one of the available methods or, if it’s in AWS S3 or Kinesis Streams, you can use our AWS Lambda function. In this example we’re using DNS logs from Infoblox. Notice the base64-encoded part in red.

base64-encoded

2. Extract the base64-encoded part of the query from the data. The exact extraction will depend on your data, and in some cases you may simply need to target a query field if it’s already extracted. Here’s how it plays out for sourcetype=='infoblox:dns', where extraction is based on raw:

query:\s(?<encoded_part>(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=))

base64-extract

Note that the encoded_part field will become an index-time field in Splunk. If that is not desirable, you can prepend two underscores to it: __encoded_part.

3. Evaluate a new field whose value is the base64 decoded payload using Cribl’s native C.decode.base64(). You can also add a field that is simple or flag-like (e.g., potential_exfil='yes').

base64-eval

Another interesting DNS query characteristic that can be extracted in real time is the length of the domain name or of each label. The protocol imposes length limits (63 octets per label and 255 octets per full name), and the closer a query gets to the maximum, the more suspicious it is. In our case we’re extracting the length of that label as query_label_len.
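For readers who want to experiment with the logic of steps 2 and 3 outside of Cribl, the following Python sketch mirrors it. It is an approximation under stated assumptions, not LogStream's implementation: the event is assumed to be a dict with the raw log line under a hypothetical `_raw` key, the sample log lines in the usage note are made up, and Python's `base64.b64decode` stands in for C.decode.base64(). The field names match the ones used in this post.

```python
import base64
import binascii
import re

# Same pattern as in step 2, written with Python's (?P<name>...) group syntax.
ENCODED_RE = re.compile(
    r"query:\s(?P<encoded_part>(?:[A-Za-z0-9+/]{4})*"
    r"(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=))"
)

def enrich(event: dict) -> dict:
    """Add exfil-related fields only when a base64 label is found and decodes."""
    match = ENCODED_RE.search(event.get("_raw", ""))  # "_raw" key is an assumption
    if not match:
        return event
    encoded = match.group("encoded_part")
    try:
        decoded = base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError):
        return event  # looked like base64 but did not decode; leave event as-is
    event["encoded_part"] = encoded
    event["decoded_payload"] = decoded.decode("utf-8", errors="replace")
    event["potential_exfil"] = "yes"
    # Length of the matched label; assumes the encoded part is the whole label.
    event["query_label_len"] = len(encoded)
    return event
```

With a made-up line such as `client 10.0.0.5#53: query: bXlwYXNzd29yZA==.foobar.com IN A`, the event gains `decoded_payload`, `potential_exfil` and `query_label_len`, while a query for `www.example.com` passes through unchanged.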

Decoded Results


Notice how the query_label_len, potential_exfil and decoded_payload fields are only added to the events that Cribl was able to base64 decode:

base64-decoded

To use this data in Splunk, you can search or alert by referencing our new fields. E.g.:

index=myIndex sourcetype=infoblox:dns potential_exfil::yes OR decoded_payload::*

In your system you can also track the length of the label (or even of the full domain name) over time and adapt your searches accordingly.
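One possible way to make length tracking comparable over time is to express each length as a fraction of the protocol limit. The sketch below is only an illustration: the function name and output fields are hypothetical, and it uses 253 characters as the limit for a name in dotted text form (the 255-octet wire limit minus length-byte overhead).

```python
MAX_LABEL_LEN = 63   # per-label limit in DNS
MAX_NAME_LEN = 253   # full name in dotted text form, without the trailing dot

def length_profile(qname: str) -> dict:
    """Report the longest label and total name length, plus how close each is
    to its protocol maximum (0.0 = empty, 1.0 = at the limit)."""
    name = qname.rstrip(".")
    labels = name.split(".")
    longest = max(len(label) for label in labels)
    return {
        "longest_label_len": longest,
        "name_len": len(name),
        "label_fill": round(longest / MAX_LABEL_LEN, 2),
        "name_fill": round(len(name) / MAX_NAME_LEN, 2),
    }
```

Values of `label_fill` or `name_fill` that drift toward 1.0 for a given client or domain could then feed a search or alert threshold that you tune against your own baseline.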

The fastest way to get started with Cribl LogStream is to sign up at Cribl.Cloud. You can process up to 1 TB of throughput per day at no cost. Sign up and start using LogStream within a few minutes.

Questions about our technology? We’d love to chat with you.