Products
Product Portfolio

Cribl puts your IT and Security data at the center of your data management strategy and provides a one-stop shop for analyzing, collecting, processing, and routing it all at any scale. Try the Cribl suite of products and start building your data engine today!
Learn more ›

Evolving demands placed on IT and Security teams are driving a new architecture for how observability data is captured, curated, and queried. This new architecture provides flexibility and control while managing the costs of increasing data volumes.
Read white paper ›

Cribl Stream

Cribl Stream is a vendor-agnostic observability pipeline that gives you the flexibility to collect, reduce, enrich, normalize, and route data from any source to any destination within your existing data infrastructure.
Learn more ›

Vodafone Case Study

Vodafone Dials up Business Insights with Cribl Stream
Read Case Study ›

Cribl Edge

Cribl Edge provides an intelligent, highly scalable edge-based data collection system for logs, metrics, and application data.
Learn more ›

SpyCloud Edge Story

Listen to how SpyCloud uses Cribl Edge at scale.
Watch Video ›

Cribl Search

Cribl Search turns the traditional search process on its head, allowing users to search data in place without having to collect/store first.
Learn more ›

How Cribl Search Can Save You From Drowning in a Deluge of Data
Read Blog ›

Cribl Lake

Cribl Lake is a turnkey data lake solution that takes just minutes to get up and running — no data expertise needed. Leverage open formats, unified security with rich access controls, and central access to all IT and security data.
Learn more ›

Navigating the future of IT and Security Data management white paper
Read white paper ›

Cribl.Cloud

The Cribl.Cloud platform gets you up and running fast without the hassle of running infrastructure.
Learn more ›

Cribl.Cloud Solution Brief

The fastest and easiest way to realize the value of an observability ecosystem.
Read Solution Brief ›

Cribl Copilot

Cribl Copilot gets your deployments up and running in minutes, not weeks or months.
Learn more ›

Cribl Copilot

Your Trusted AI Advisor for Deploying, Configuring & Troubleshooting.
Read blog ›

AppScope

AppScope gives operators the visibility they need into application behavior, metrics and events with no configuration and no agent required.
Learn more ›

Sandbox

Launch an AppScope Sandbox today!
Launch Now ›
Solutions
Use Cases

Explore Cribl’s Solutions by Use Cases:

Supercharge Security Insights ›

Accelerate Cloud Migration ›

Avoid Vendor Lock-in ›

Agent Consolidation ›

Slash Storage Costs ›

Free Up Space for High-Value Data ›

Route From Any Source To Any Destination ›

Immediate Access to Archived Data ›

Replay Data from Low-Cost Storage ›

Reduce Log Volume & Pay Less for Infrastructure ›
Integration

Explore Cribl’s Solutions by Integrations:

Amazon ›

CrowdStrike ›

Elastic ›

Exabeam ›

Google ›

Microsoft ›

Splunk ›

Wiz ›

View All Integrations ›

Seamless Integrations for Your Observability Data
Learn More ›
Industries

Explore Cribl’s Solutions by Industry:

AIOps ›

Financial Services ›

Healthcare ›

Managed Security Services ›

Manufacturing and Logistics ›

Media and Entertainment ›

Public Sector ›

Retail ›
Resources
Resources

Resource Library ›

Documentation ›

Guides ›

AppScope Docs ›

Blog ›

Glossary ›

Podcasts ›

Telemetry 101

Understanding the Basics of Telemetry and Its Benefits
Learn More ›
Events & Webinars

Events ›

Webinars ›

CriblCon24
Watch On-Demand ›

July 31 | 10am PT / 1pm ET

Navigating the Data Current Report: Transforming IT & Security Operations in 2024
Register ›
Learning

Try the Sandboxes ›

Self Guided Trials ›

Cribl University ›

Cribl Community ›

Cribl Curious Forum ›

What is Observability? ›

Try Your Own Cribl Sandbox

Experience a full version of Cribl Stream and Cribl Edge in the cloud.
Launch Now ›
Tools & Pricing

Download Library ›

Past Releases ›

Pricing Plans ›

Stream ROI Calculator ›

Download Library

Download Cribl’s suite of products for free to get started.
Download ›
Customers
Customer Stories

Get inspired by how our customers are innovating IT, security and observability. They inspire us daily!
Read Customer Stories ›

Sally Beauty Holdings

Sally Beauty Swaps LogStash and Syslog-ng with Cribl.Cloud for a Resilient Security and Observability Pipeline
Read Case Study ›
Customer Experience

Support & Success ›

Professional Services ›

Service Delivery Partners ›

Documentation ›

AppScope Docs ›

Professional Services

Check out our new Professional Services offering.
Learn More ›
Learning

Try the Sandboxes ›

Self Guided Trials ›

Cribl University ›

Cribl Community ›

Cribl Curious Forum ›

Try Your Own Cribl Sandbox

Experience a full version of Cribl Stream and Cribl Edge in the cloud.
Launch Now ›
Company
About Cribl

Transform data management with Cribl, the Data Engine for IT and Security
Learn More ›

Cribl Corporate Overview

Cribl makes open observability a reality, giving you the freedom and flexibility to make choices instead of compromises.
Get the Guide ›

Cribl Newsroom

Stay up to date on all things Cribl and observability.
Visit the Newsroom ›

Press Releases

Read our most recent press releases.
Recent Press Releases ›

Leadership

Cribl’s leadership team has built and launched category-defining products for some of the most innovative companies in the technology sector, and is supported by the world’s most elite investors.
Meet our Leaders ›

Careers

Join the Cribl herd! The smartest, funniest, most passionate goats you’ll ever meet.
Learn More ›

Cribl Named to the Inc. 5000 List of Fastest Growing Private Companies
Learn More ›

Cribl for Startups

Whether you’re just getting started or scaling up, the Cribl for Startups program gives you the tools and resources your company needs to be successful at every stage.
Learn More ›

Contact Us

Want to learn more about Cribl from our sales experts? Send us your contact information and we’ll be in touch.
Talk to an Expert ›

Try Cribl Talk to an expert

When All I’ve Got is a Hammer

March 30, 2020

Categories: Learn

Back To Blogs

My story is not unique. Years ago, I saw the benefit of capturing system logs across all of my systems and being able to search through them. It started out pretty inexpensive: a few thousand dollars for a reasonable Splunk license, and a box to run it on. I was using the data to troubleshoot problems and maybe see some trends, so I had a fairly easy time balancing performance and retention – I didn’t need to retain most of my data for very long. The system got popular, and I needed to ingest more and more data. Finally, the security and compliance teams caught on, and suddenly I had to retain all of this data.

My logging infrastructure costs started exploding, and my performance got steadily worse – I went from retaining a month of data to retaining 13 months. Moreover, I was now ingesting a lot of data that was never used in the analysis, simply to get it retained. That felt like a waste of my available capacity. But why was I retaining all of that data in my analysis system? Because I really didn’t have another choice at the time. As the old saying goes, when all you have is a hammer, everything looks like a nail. Log Analytics was my hammer, and every requirement looked like a nail.

Over the last 10 years, with the move to cloud services, such as AWS, the cost of archival storage has been racing to the bottom, and now with services like Glacier Deep Archive, the cost of storing a Terabyte of data is less than $1/month. It’s time to separate retention from analysis and serve both needs better. Analysis and Troubleshooting of system problems benefit greatly from the fast response you get from not having “extra” data in the system. Retention benefits from having a well organized archival capability, but the data does not have to be at your fingertips. In all of my years in IT, I have *never* seen a request for year old log data that had to be available *right now*…

A Different Way

By inserting some sort of routing mechanism into the environment, as seen on the right, instead of feeding all of the data that has to be retained through the log analytics system, the entire “firehose” worth of logs can be retained by sending it to low-cost archival storage, like Amazon S3, Glacier/DeepArchive or Azure Block Blobs. This enables filtering the data for the log analytics system to only the data needed for analysis. If that routing system has the capability to transform, enrich and/or aggregate the data, now simple metrics about the data can be fed to the https://cribl.io/blog/the-observability-pipeline/data lake for use in normal business reporting. This separates out the data retention requirement from all of our analysis needs.

Cribl White Paper: 6 Techniques to Control Log Volume

Learn how to cut costs with Cribl LogStream.

Download Now

Benefits

Cost

Let’s take an example – say we’ve got an environment that is ingesting 2TB a day, and getting a roughly 2:1 compression ratio in storing that data, leading to about 1TB/day of storage consumption. Add to that a compliance requirement of 18 months of retention. Averaging 30 days in a month, that comes out to about 540TB worth of storage. In the table below, there are 4 scenarios for retention management, and you can see the drastic difference this can make in cost of infrastructure: An aggressive move from managing all of the data in General Purpose (GP) Elastic Block Store (EBS) volumes to one that keeps a minimum (30 days) in EBS, while moving the rest to Archival storage (with automated lifecycle management dealing with the migration between the “tiers”) reduces storage costs from $54K/month to $4.3K/month – that’s a net savings of almost $600K/year, and this is only looking at the storage costs, not taking into account compute resources, software licensing, etc. While you could go even more aggressive and go directly to Deep Archive for all retention, this scenario balances retention cost with the likelihood that more recent data is more likely to need to be retrieved – Deep Archive’s retrieval time is considerably longer than S3…

	EBS		Archival Storage
	General Purpose	Hard Drive	S3	Glacier	Deep Archive
List Cost per GB/Month	$0.10	$0.05	$0.022	$0.004	$0.00099	Monthly Cost
Retain all in EBS GP Volumes	540	0	0	0	0	$54,000.00
Retain 90 days in EBS GP Volumes, 13 months in EBS HDD volumes.	90	450	0	0	0	$29,250.00
Retain 30 days in EBS GP Volumes, Remainder in S3	30	0	540	0	0	$14,880.00
Retain 30 days in EBS GP Volumes, 30 days in S3, 60 days in Glacier and the remainder in Deep Archive	30	0	30	90	420	$4,435.80

The example is based on cloud services, but a similar approach can be followed in an on-premises data center environment. The economics are a bit harder to model, due to differences in hardware and operating costs, but it is possible using standard storage options.

Performance

The simple act of eliminating a massive amount of the data in a given analysis system will likely yield immediate, measurable improvement in the performance of queries, especially any queries that are not properly constrained by time ranges (we all have users who do this and then don’t understand why their queries take forever). With a much smaller footprint to work with, one can optimize the storage for its purpose. For example, moving to provisioned IOPS EBS volumes (the cost addition is far less prohibitive on 30 TB than it is on 540 TB).

Data Quality

Developers are incentivized to put every field in the log entry – it’s far more expensive to have to go back and add logging statements than to include it initially. As a result, a lot of the log entries that applications generate have empty fields or extraneous data. Without the need to retain the data in its original form, we can easily remove fields we don’t need, drop empty fields, and even aggregate repeated log entries (great examples are port status lines on switches or Windows Event Log login notifications). Cleaner data leads to quicker resolution and cleaner metrics.

How To Do It

At the heart of this approach is by building what we at Cribl call an Observability Pipeline. This can be built from open source components (see Clint Sharp’s post on this topic for details), but we believe that our product, Cribl LogStream, is the best way to do this – it can be a drop-in replacement for components in your logging environment like Splunk Heavy Forwarders or LogStash instances, and configuring it to do exactly this just takes a few clicks of the mouse.

The fastest way to get started with Cribl LogStream is to sign-up at Cribl.Cloud. You can process up to 1 TB of throughput per day at no cost. Sign-up and start using LogStream within a few minutes.

Blog

Preventing Friction With an Impactful Security Champions Program

Blog

From Necessity to Opportunity: The Customer Push for SIEM Options

Blog

Securing the Foundation of Cribl Copilot

Try Your Own Cribl Sandbox

Experience a full version of Cribl Stream and Cribl Edge in the cloud with pre-made sources and destinations.

Launch Now

Product Portfolio

Cribl Stream

Cribl Edge

Cribl Search

Cribl Lake

Cribl.Cloud

Cribl Copilot

AppScope

Use Cases

Integration

Industries

Resources

Events & Webinars

Learning

Tools & Pricing

Download Library

Customer Stories

Customer Experience

Learning

Try Your Own Cribl Sandbox

About Cribl

Cribl Newsroom

Leadership

Careers

Cribl for Startups

Contact Us

When All I’ve Got is a Hammer

A Different Way

Cribl White Paper: 6 Techniques to Control Log Volume

Benefits

Cost

Performance

Data Quality

How To Do It

Blog

Preventing Friction With an Impactful Security Champions Program

Blog

From Necessity to Opportunity: The Customer Push for SIEM Options

Blog

Securing the Foundation of Cribl Copilot

Try Your Own Cribl Sandbox

So you're rockin' Internet Explorer!