
A Supercharger for Log Data

Written by Bryan Turiff

August 25, 2020

I have always been fascinated by new technology, and none more so than electric cars. The idea of never going to a gas station again seemed like a dream. I thought it would be great to own a car that was nearly maintenance free. The environmental advantages of an electric car also intrigued me. I'm going to be talking about cars for a while, but trust me, if you rely on log data for Security or IT Operations, it's going to make a lot of sense.

Tesla, for me at least, represents the pinnacle of electric vehicles, and about five years ago I started researching them. The Model S was Tesla's premium model at the time and boasted a range of about 200–250 miles on a full charge, depending on model and options.

Range Anxiety

200 miles covers most of what I need to do in my daily life. I work from home, and even if I am running errands all weekend, 200 miles would typically cover where I need to drive. On occasion, however, 200 miles may not be enough: taking a road trip, attending an out-of-town activity for one of my kids, or anything else that pushes that limit. This gave me anxiety. What if I ran out of charge in the middle of nowhere? It's not like I can get a ride to a gas station and buy a gallon or two of gas to revive the car. Suddenly my excitement about owning an electric car was significantly dampened. It no longer seemed like a practical car to own.

The Dawn of a Supercharger Network

Tesla knew that people were worried about the ability to drive longer distances in their cars. In 2013, Tesla began building a network of charging stations across the United States to make it easier for its customers to travel the country without worrying about stalling out on a remote roadside. By 2015, there were several hundred charging stations, and Tesla boasted coverage of most of the continental US. The key word here is most.

Even if I didn't live in the conspicuous gap in the middle of the country, the Supercharger network didn't offer a lot of flexibility. Unless I was traveling from one major city to another, I might have to drive far out of my way to stay within its coverage. Add to this the cost of a Tesla Model S (starting around $80K), and my dream of owning an electric car was on hold.

Longer Range + More Stations = Less Anxiety

Flash forward to last year (can we all just remember how great 2019 was?). Tesla has invested a great deal in its Supercharger network and now has more than 1,900 stations worldwide. Its Model 3 has a top range of 310 miles and is considerably more affordable (starting at about $38K). Charging times are significantly shorter, and stations have a lot more stalls. Suddenly, the dream came back into focus. There are enough charging stations that you can head out on the road without carefully planning your route ahead of time. You can decide later where you want to go and how you will charge your car. Today, Supercharger coverage spans nearly all of North America.

When you factor in the longer battery ranges, owning a Tesla and driving it anywhere you want becomes much more practical. 

For a lot of enterprises, adding new log data sources for IT Ops or Security teams to analyze brings just as much anxiety as contemplating an electric vehicle did back in 2013. Couple this challenge with the organic growth of machine data (IDC estimates 25–30% data growth year over year), and organizations aren't sure whether they have room in their existing license agreements to handle more.

Perhaps less obvious, but more impactful, the cost of the infrastructure to store log data is growing out of control.  For more on this, read our blog article, “Why Log Systems Require So Much Infrastructure.” The demands of the business require analyzing more and new types of log data.  At the same time, budgets are flat or, given the current economic climate, shrinking.  How can you analyze more data without ballooning your costs?

Cribl LogStream to the Rescue

Just like Tesla did by expanding its supercharger network and improving battery range, new tools like LogStream seek to reduce the anxiety caused by trying to onboard more and new types of log data.  These tools, along with some best practices, can help you analyze more data without getting into trouble with your finance department.  You can add flexibility to your logging efforts by changing how you address retention, and where you store log data long-term. Here are some approaches that will allow you to improve the efficiency and analytical power of your log system environments.

Eliminate the Noise

Log data comes in all shapes and sizes, and different log schemas were designed to serve different purposes. What might be right for an SRE to analyze, with a goal of making their infrastructure more efficient, may be a complete mismatch for the requirements of a security engineer trying to protect their organization.

LogStream helps you strip out the noise: anything that doesn't provide analytical value today. You can cut up to 50% of total data volume (often 25% within a couple of weeks) by trimming duplicate fields, null values, and anything else that isn't interesting to you today, without losing the ability to access it later if needed.
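To make the idea concrete, here is a minimal sketch of that kind of noise reduction applied to a single event. In LogStream itself you would do this with pipeline functions rather than code; the function and field names below are made up for illustration.

```python
def strip_noise(event, drop_fields=frozenset()):
    """Remove null values, empty strings, and explicitly listed
    low-value fields from a log event (represented as a dict)."""
    return {
        k: v
        for k, v in event.items()
        if v not in (None, "") and k not in drop_fields
    }

raw = {
    "timestamp": "2020-08-25T12:00:00Z",
    "src_ip": "10.1.2.3",
    "user": None,          # null value: dropped
    "legacy_id": "",       # empty string: dropped
    "debug_blob": "xxxx",  # known low-value field: dropped explicitly
    "action": "login",
}

slim = strip_noise(raw, drop_fields={"debug_blob"})
# slim keeps only timestamp, src_ip, and action
```

The same trimming rules, applied across billions of events a day, are where the volume reductions quoted above come from.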

Volume Reduction screen shot

Send High Value Data to a Logging Tool and the Rest to Cheap Storage

One of the greatest sources of logging anxiety is whether or not to keep data. Will you ever need to analyze a certain type or aspect of data? How long can you afford to store it? LogStream helps you put that worry to rest by allowing you to park full-fidelity data in a low-cost storage destination like S3 or a file system, so you can decide later whether you need it. Watch our webinar on Data Collection to see why it's 99% cheaper to use a non-indexed storage destination.

Our Data Collection feature allows you to process and replay data to any supported analytics tool at a later date, as needed. After sending a copy of everything to low-cost storage, you can route portions of that data to the tools of your choice. This is especially useful for analyzing data after the fact, such as when conducting a breach investigation.
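The routing pattern itself is simple. Here is a hedged sketch: every event is archived at full fidelity, and only high-value events continue on to the expensive, indexed tool. The lists here stand in for an S3 bucket and a logging tool; none of these names are LogStream APIs.

```python
def route(events, archive, analytics, is_high_value):
    """Archive everything cheaply; forward only the valuable subset."""
    for event in events:
        archive.append(event)         # full-fidelity copy, cheap to keep
        if is_high_value(event):
            analytics.append(event)   # only this subset gets indexed

archive, analytics = [], []
events = [
    {"severity": "debug", "msg": "heartbeat"},
    {"severity": "error", "msg": "auth failure"},
    {"severity": "debug", "msg": "heartbeat"},
]
route(events, archive, analytics, lambda e: e["severity"] == "error")
# archive holds all 3 events; analytics holds only the error
```

Because the archive always has everything, the decision about what to index stops being irreversible: replay lets you change your mind later.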

If you are like most organizations, you have seen a tremendous increase in data over the past several years. It may have made sense in the past to ingest everything into an indexed logging tool and use that for data retention, but that is likely too expensive now. Using a tool like LogStream can help you decide the best place to park your data. If you aren't sure whether you may need to analyze something later, don't worry: you've already stored a copy of everything in cheap storage. Send what is most interesting to the best tool; LogStream can format logs and get them into any tool you need.

Once you have finished analyzing logs in your logging system, get rid of them. It can cost as much as 100X more to store them there compared to S3 Infrequent Access, and again, you have a copy stored if you end up needing it later.

Sample, Trim, and Aggregate Logs to Onboard New Data Types

If you want to analyze new data types, but you're worried about exceeding your daily ingestion limits, filter down the data to make it fit. If you are interested in tracking how a handful of metrics impact your organization, consider using Aggregation, which turns logs into metrics and can trim data volume by a factor of 1,000 to 1. LogStream can also downsample incoming log data: using redundancy filters, time and frequency limits, and other flexible criteria, it can dramatically reduce outgoing volume.
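To show why aggregation shrinks volume so dramatically, here is a minimal sketch of logs-to-metrics rollup: many raw events collapse into one metric event per time window and status. The field names (ts, status) are illustrative, not a LogStream schema.

```python
from collections import Counter

def aggregate(events, window_secs=60):
    """Collapse raw events into per-window, per-status counts."""
    counts = Counter()
    for e in events:
        window = e["ts"] - (e["ts"] % window_secs)  # bucket start time
        counts[(window, e["status"])] += 1
    return [
        {"window_start": w, "status": s, "count": n}
        for (w, s), n in sorted(counts.items())
    ]

events = [{"ts": t, "status": 200} for t in (0, 10, 50)] + \
         [{"ts": 70, "status": 500}]
metrics = aggregate(events)
# 4 raw events collapse into 2 metric events:
#   window 0  / status 200 / count 3
#   window 60 / status 500 / count 1
```

At production rates, a window that received a million requests still emits a handful of counters, which is where the 1,000-to-1 reductions come from.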

A global financial services company wanted to analyze DNS logs, but the 1 TB of data per day was more than they could manage under their existing license. LogStream enabled them to enrich and then filter logs using a top-domains list. They dropped logs from trusted domains, which they deemed uninteresting from an analytical perspective. This reduced total ingest from 1 TB to about 50 GB of log data a day. Capacity anxiety solved, they were back on the road to a more secure enterprise.
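The filtering logic in that story boils down to a set lookup. The sketch below is a hypothetical reconstruction: the domain list, field names, and naive base-domain extraction are made up for illustration (a real pipeline would use a public-suffix-aware lookup).

```python
# Stand-in for the customer's top-domains list of trusted destinations.
TRUSTED = {"google.com", "microsoft.com", "amazonaws.com"}

def base_domain(fqdn):
    """Naive base-domain extraction: last two labels of the name."""
    parts = fqdn.rstrip(".").split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else fqdn

queries = [
    {"query": "mail.google.com"},
    {"query": "c2.suspicious-host.net"},
    {"query": "s3.amazonaws.com"},
]
# Drop queries to trusted domains; keep everything else for analysis.
interesting = [q for q in queries if base_domain(q["query"]) not in TRUSTED]
# only the suspicious query survives
```

Since the bulk of enterprise DNS traffic goes to a small set of well-known domains, dropping the trusted tail is exactly the kind of filter that turns 1 TB into 50 GB.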

Wrapping Up

It is reasonable to feel anxious about getting more data into your logging tools for analysis. The growth of data is accelerating, and you want the flexibility to add new types without breaking the bank. LogStream eases that anxiety by implementing the best practices I mentioned above. Don't take my word for it: try LogStream for yourself. Download a copy and get started today; you can process up to 5 TB of data a day at no cost.

You might want to first take the product for a test drive in our interactive sandbox, which uses the actual product and guides you through a tour of its most popular uses. The sandbox accelerates the value you get from LogStream.

After receiving the latest in a string of expensive maintenance bills for my previous car, I finally did buy a Tesla. I never worry about making it to the next charging station. We hope LogStream can help you cover more ground with less anxiety.

Questions about our technology? We’d love to chat with you.