Cribl Guard’s latest model: Cribl-Privacy 1.5 with higher recall at the same footprint

When we introduced cribl-privacy-1.0, our custom model for Cribl Guard’s background detection capability, we made the case that telemetry needs a different kind of privacy model: one built for the semi-structured, high-throughput shape of real machine data rather than polished natural-language text. Today we're releasing cribl-privacy-1.5, the next version of the in-house transformer powering Cribl Guard. It catches significantly more sensitive data, all within the same memory footprint as before.

This post walks through what changed, why we prioritized recall, how we held the footprint steady, and how to start taking advantage of Cribl-Privacy 1.5 today.

Figure 1: Cribl-Privacy 1.5 vs. Cribl-Privacy 1.0 model comparison across F1 score, precision, and recall.

What changed

We measured v1.5 against our internal benchmark grounded in the real world: a diverse mix of production logs, vendor formats, and machine-generated records. Our upgraded privacy model posts meaningful gains across the board in ways that reflect how customers move data in production:

F1 score: 78.0% → 85.5% (+7.5pp)
Recall: 72.6% → 84.1% (+11.5pp)
Precision: 84.2% → 87.0% (+2.8pp)
False positives: down 7.6%
False negatives: down 41.9%
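
As a quick sanity check on how these figures relate, here is a minimal Python sketch. It assumes F1 is the standard harmonic mean of precision and recall, and that both versions were scored on the same set of sensitive values, so the miss rate is simply 1 minus recall. The function names are illustrative and not part of any Cribl API.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_negative_reduction(recall_old: float, recall_new: float) -> float:
    """Relative drop in misses on a fixed set of positives (miss rate = 1 - recall)."""
    return 1 - (1 - recall_new) / (1 - recall_old)


# Published precision and recall, expressed as fractions.
f1_v10 = f1_score(precision=0.842, recall=0.726)
f1_v15 = f1_score(precision=0.870, recall=0.841)
fn_drop = false_negative_reduction(recall_old=0.726, recall_new=0.841)

print(f"F1 v1.0: {f1_v10:.1%}")
print(f"F1 v1.5: {f1_v15:.1%}")
print(f"Implied false-negative reduction: {fn_drop:.1%}")
```

The F1 values recomputed this way agree with the published ones. The implied false-negative reduction comes out within about a tenth of a point of the published 41.9%, a gap that is consistent with the inputs having been rounded to one decimal place.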

Prioritizing recall

One obvious weakness of the v1.0 model was its recall, which measures the share of truly sensitive values that the model actually detects. For our customers, missing a sensitive value, such as a token in a serialized payload, an account number in a delimited field, or PII hiding in a vendor-specific format, is more costly than flagging a field that turns out to be benign. False negatives leak data downstream; false positives are recoverable inside a redaction pipeline. And the recall gain didn't come at a noise cost: false positives dropped 7.6%. cribl-privacy-1.5 catches more sensitive data and produces less noise in the operational fields analysts rely on for investigations.
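
To make the trade-off concrete, the sketch below computes recall and precision from raw detection counts. The `DetectionCounts` class and the counts themselves are hypothetical, chosen only for illustration; they are not drawn from Cribl's benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    """Outcome counts for a detector scored against labeled data (hypothetical helper)."""
    true_positives: int   # sensitive values that were flagged
    false_positives: int  # benign values that were flagged
    false_negatives: int  # sensitive values that were missed

    @property
    def recall(self) -> float:
        # Of everything that really was sensitive, how much did we catch?
        actual_positives = self.true_positives + self.false_negatives
        return self.true_positives / actual_positives if actual_positives else 0.0

    @property
    def precision(self) -> float:
        # Of everything we flagged, how much really was sensitive?
        flagged = self.true_positives + self.false_positives
        return self.true_positives / flagged if flagged else 0.0


# Made-up example: 100 sensitive values in the data, 80 caught, 20 missed, 10 benign fields flagged.
counts = DetectionCounts(true_positives=80, false_positives=10, false_negatives=20)
print(f"recall={counts.recall:.1%} precision={counts.precision:.1%}")
```

In this toy case the 20 missed values are the ones that would flow downstream unredacted, which is why raising recall directly reduces leakage.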

Same footprint, same throughput profile

Our goal wasn't to grow the model; it was to make the existing footprint smarter. cribl-privacy-1.5 is sized to run in-stream, under the throughput requirements of production telemetry pipelines, with the same CPU-conscious inference characteristics as v1.0. For our customers, that means a seamless drop-in upgrade: better accuracy, with no change to compute budget, throughput, or memory footprint.

Available now

cribl-privacy-1.5 ships with Cribl 4.17.1 and is rolling out to Cribl Guard customers now. If you're already running Guard on 4.17.1 or later, the upgrade is automatic. If you're not, this is a good moment to take a look: the gap between general-purpose privacy filters and a model trained specifically on telemetry just got wider.

Conclusion and next steps

cribl-privacy-1.5 delivers higher recall, stronger overall accuracy, and fewer false positives and false negatives, all while preserving the same lean footprint and in-stream performance profile customers rely on today. It’s purpose-built for the messy, high-volume reality of telemetry, not the polished world of natural-language text.

If you’re a Cribl Guard customer, confirm you’re on 4.17.1 or later so you’re already benefiting from cribl-privacy-1.5, then compare your detection and false-positive patterns before and after the upgrade in your own data. If you’re still evaluating Cribl Guard, now is an ideal time to give it a try and see how a telemetry-native privacy model changes what you can safely route downstream.

Connor Swanson

Staff Software Engineer - AI Research

Connor Swanson is a Staff Software Engineer - AI Research at Cribl.

Nikhil Mungel

Head of AI R&D

Nikhil is based in San Francisco and has been building distributed systems and AI teams at SaaS companies for over 15 years. His background spans AI, observability platforms, developer ecosystems, and high-scale consumer social products, with leadership roles at Substack, Splunk, ThoughtWorks, and most recently Cribl, where he currently serves as the Head of AI R&D.
