Data Overload: Why Companies Collect Too Much Data and Pay the Price

Last edited: December 11, 2023

In the US, a recurring news topic is the state of the federal budget, and whether we’ll get one signed. Government budgets have hundreds of thousands of line items, each bickered over to gain or lose political capital with one group or another. However, most of the federal budget isn’t up for debate. Only about 30% of the US federal budget is discretionary or flexible. Nearly two-thirds, or 63%, is mandatory spending required due to prior commitments.

Data budgets often work the same way. Only a small share of what you collect and store is discretionary. Most of that data, and I’m willing to guess it’s a number higher than 63%, is stuff you’re collecting because you don’t have a choice. And you’re not alone. According to a recent survey from IDC, 45.2% of surveyed companies say they’re collecting too much observability data.

There are a few reasons companies are collecting too much data. First, you have to: not out of some compulsion, but because regulatory or compliance rules legally require it. You may also need the data to investigate something in the future, like a security breach.

Another reason you’re collecting too much data is that your current crop of vendor tools forces you to. Some of this comes from vendor ignorance or indifference; the rest is opportunism. Vendors whose tools generate a lot of data, and who don’t charge based on ingest or volume, aren’t innocent. They’re creating negative externalities, meaning they’re screwing something up as a side effect. In this case, they’re driving up your storage and retention costs, often with little to show for it. Think of the APM tool spewing out traces that no one will look at, yet you’re still compelled to store them just in case.

On the other hand, opportunistic vendors charge by some ingest-based metric, like average daily ingest, events per second, or even by the workload generated from accessing the volumes of data you’ve stored in their platform. Returning to the budget example, these vendors tax you for using a product you’re already paying for.
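To see how that tax adds up, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function names, the flat per-GB rate, and the daily volume are hypothetical and don’t reflect any particular vendor’s pricing.

```python
def annual_ingest_cost(daily_ingest_gb: float, price_per_gb: float, days: int = 365) -> float:
    """Yearly bill when a vendor charges purely on ingested volume.

    Assumes a flat, hypothetical per-GB rate; real contracts often use tiers.
    """
    return daily_ingest_gb * price_per_gb * days


def cost_after_reduction(daily_ingest_gb: float, price_per_gb: float, drop_fraction: float) -> float:
    """Yearly bill if a given fraction of the data is never sent to the vendor."""
    if not 0.0 <= drop_fraction <= 1.0:
        raise ValueError("drop_fraction must be between 0 and 1")
    kept_gb = daily_ingest_gb * (1.0 - drop_fraction)
    return annual_ingest_cost(kept_gb, price_per_gb)


if __name__ == "__main__":
    # Hypothetical numbers: 100 GB/day at $0.50 per GB ingested.
    before = annual_ingest_cost(100, 0.50)
    after = cost_after_reduction(100, 0.50, drop_fraction=0.5)
    print(f"Before: ${before:,.2f}  After: ${after:,.2f}  Saved: ${before - after:,.2f}")
```

With a flat rate, the bill scales linearly with volume, so every trace or log line nobody reads is paid for at the same price as the data you actually use.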

Whether opportunistic or indifferent, vendors that dictate how much data you collect and how you store it erode your options when exploring new tools and use cases. The resulting data silos limit how data can be used and shared in the future.

Managing a budget, data or otherwise, requires making choices. A federal budget with two-thirds already allocated doesn’t leave much room for choice. Likewise, security and observability tooling that doesn’t allow you to choose your data – where it is stored, what format it is in, and how it can be accessed – doesn’t give you the flexibility you need.

Cribl, the Data Engine for IT and Security, empowers organizations to transform their data strategy. Customers use Cribl’s suite of products to collect, process, route, and analyze all IT and security data, delivering the flexibility, choice, and control required to adapt to their ever-changing needs.

We offer free training, certifications, and a free tier across our products. Our community Slack features Cribl engineers, partners, and customers who can answer your questions as you get started and continue to build and evolve. We also offer a variety of hands-on Sandboxes for those interested in how companies globally leverage our products for their data challenges.
