
How AppScope helped resolve a DNS problem

Last edited: April 1, 2021

This is a short blog post about how we used AppScope to identify and resolve a DNS-related problem reported by one of our customers. (As the saying goes, it’s always a DNS problem, except when it isn’t :).) The problem description went something like this: the customer has a LogStream deployment that ingests data from S3. Everything works, but the deployment is making about 1,000 DNS queries per second. Their DNS server admin shared that Cribl Stream was trying to resolve just one domain: sqs.<region>.amazonaws.com. (The quick workaround for this customer was to use nscd to cache the DNS responses at the OS level.)

With the DNS logs in hand, we were ready to tackle the problem. The solution seemed simple enough: just cache the DNS responses for some period of time at the application level and move on to sexier problems, right? However, caching DNS responses at the application layer felt wrong, and asking customers to install and maintain nscd felt even more wrong.

We started by reproducing the problem internally and hit an immediate challenge: how do we get visibility into the DNS requests that LogStream is making? Here’s where AppScope came in super handy. We simply scoped the LogStream process and saw not only that it was making a ton of DNS requests to resolve the SQS endpoints but, more importantly, that it was establishing new TCP connections and making HTTP requests just as frequently.
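The reasoning behind that observation can be sketched in a few lines. The snippet below is only an illustration, not AppScope's actual output format: the `ScopedEvent` type, its `kind` values, the threshold, and the sample data are all hypothetical. It tallies DNS lookups, new connections, and HTTP requests, then flags the case where nearly every request pays for a fresh connection.

```typescript
// Hypothetical, simplified event shape; real AppScope events carry more fields.
type EventKind = "dns" | "connect" | "http";

interface ScopedEvent {
  kind: EventKind;
  host: string;
}

// Tally events by kind.
function countByKind(events: ScopedEvent[]): Record<EventKind, number> {
  const counts: Record<EventKind, number> = { dns: 0, connect: 0, http: 0 };
  for (const e of events) {
    counts[e.kind] += 1;
  }
  return counts;
}

// With connection reuse, many HTTP requests share one connection, so the
// number of new connections per request stays well below 1. A ratio close
// to 1 suggests that connections are not being reused.
function looksLikeNoConnectionReuse(
  events: ScopedEvent[],
  threshold = 0.9, // illustrative cutoff, not a measured value
): boolean {
  const counts = countByKind(events);
  if (counts.http === 0) return false;
  return counts.connect / counts.http >= threshold;
}

// Synthetic sample: three requests, each preceded by a lookup and a new connection.
const host = "sqs.<region>.amazonaws.com";
const sample: ScopedEvent[] = [];
for (let i = 0; i < 3; i++) {
  sample.push(
    { kind: "dns", host },
    { kind: "connect", host },
    { kind: "http", host },
  );
}

console.log(countByKind(sample));
console.log(looksLikeNoConnectionReuse(sample));
```

A healthy client would show one lookup and one connection followed by many requests, so the same check would come back false.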


Armed with this information, we looked one step further and found that the AWS SDK does not enable HTTP connection keep-alive by default. As a result, every API request established a new connection, and every new connection triggered another DNS lookup. WTF?!? The fix for this problem was even simpler than trying to implement DNS caching at the application layer: enable keep-alive so that connections are reused.
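For readers wondering what such a change can look like, here is a minimal sketch, assuming a Node.js application and the AWS SDK for JavaScript v2. It is not the exact change we shipped. It builds an `https.Agent` with `keepAlive` enabled using only Node's standard library. The commented lines show how that agent is typically handed to the SDK through `httpOptions.agent`. The variable names and the `maxSockets` value are illustrative.

```typescript
import * as https from "https";

// Reuse TCP/TLS connections across requests instead of opening a new one
// (and doing a new DNS lookup) for every API call.
const keepAliveAgent = new https.Agent({
  keepAlive: true,
  maxSockets: 50, // illustrative cap on concurrent sockets per host
});

// Assumption: the AWS SDK for JavaScript v2 is installed as "aws-sdk".
// import * as AWS from "aws-sdk";
// const sqs = new AWS.SQS({ httpOptions: { agent: keepAliveAgent } });

console.log(keepAliveAgent.maxSockets);
```

Recent releases of the v2 SDK can also turn on connection reuse through the `AWS_NODEJS_CONNECTION_REUSE_ENABLED=1` environment variable, which avoids a code change if that fits your deployment better.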

The ASCII chart below shows the number of DNS requests over time.

[Figure: dns-scope, ASCII chart of DNS requests over time]

With the level of data granularity that AppScope provides, we got better visibility than we would have had with DNS logs alone. We were able to identify and fix the actual root cause of the problem, rather than just addressing a symptom.

If you want to read more about DNS resolution please go here or here. To test drive AppScope, explore our online sandbox, AppScope Fundamentals.

