“ULTIMATELY, WE ALSO ENDED UP AS A SELF-SERVICE MODEL FOR SOME OF OUR DEV TEAMS. THEY CAN DO A LOT ON THEIR OWN, SO NOW I DON’T GET A JIRA TICKET TO SET UP ANY NEW SOURCES OR DESTINATIONS.”
“OUTAGES ARE COSTLY FOR US AS AN E-COMMERCE ORGANIZATION — CRIBL STREAM ALLOWS OUR ENGINEERS TO USE THEIR TIME AND EXPERTISE ON MINIMIZING DOWNTIME AND OTHER IMPORTANT TASKS.”
“CRIBL STREAM PAID FOR ITSELF, CRIBL EDGE PAYS FOR ITSELF.”
A few years ago, iHerb set out to build a real-time stream processing system for their logging data. However, developing an in-house observability pipeline consumed significant engineering resources and left the team with substantial technical debt, making the solution costly and unmanageable over the long term.
As an online retailer, iHerb was processing 2–3TB of weblogs daily, so configuring sources and keeping systems up to date was eating up valuable engineering time. Bob Chen, the organization’s Senior Director of Infrastructure Engineering, mentioned the other factors that led to the switch from building their own tool to using Cribl Stream:
“We wanted an easy-to-use tool without having to tap into a UX team. A good API interface was critical, as was support for multiple logging sources and destinations. It turned out Cribl Stream could provide all that, and it was easy to implement and deploy, so we made the choice to put our build on hold.”
The decision to switch from open source to Cribl Stream came at just the right time, as the amount of data iHerb processed daily doubled. Their data now flows seamlessly from sources like Kafka and Fluentd to destinations like S3, Loki, Elastic Stack (Elasticsearch, Logstash, Kibana), and Splunk.
All of that data goes to S3 for long-term storage, with most logs also going to Elastic for short-term storage (less than 3 months). Selected logs are sent to Loki for retention periods of 3–6 months. iHerb’s Security department provides guidelines to Bob and his team regarding which data gets sent to Splunk for security use cases.
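The routing described above can be pictured as a small rule function. This is only an illustrative sketch in Python, not Cribl Stream configuration or iHerb's actual rules: the event flags `skip_short_term`, `extended_retention`, and `security_relevant` are hypothetical names invented for the example.

```python
def destinations_for(event):
    """Return the list of destinations for one log event.

    The flag names read from the event are hypothetical; in practice the
    criteria would come from the team's own routing and security guidelines.
    """
    # Every event is archived to S3 for long-term storage.
    dests = ["s3"]

    # Most logs also go to Elastic for short-term storage (under 3 months).
    if not event.get("skip_short_term", False):
        dests.append("elastic")

    # Selected logs are kept in Loki for 3-6 months.
    if event.get("extended_retention", False):
        dests.append("loki")

    # Security guidelines decide what is forwarded to Splunk.
    if event.get("security_relevant", False):
        dests.append("splunk")

    return dests


if __name__ == "__main__":
    print(destinations_for({"security_relevant": True}))
```

Keeping S3 as the unconditional first destination mirrors the idea that the archive is complete, while the other destinations receive filtered subsets.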
“We process a lot of data each day, and we can’t afford to skip even a few KB of it — we need every log entry to troubleshoot incidents and identify other issues. Using Cribl Stream helps us avoid losing any of the critical data we need.”
With the increase in cybersecurity incidents in recent years, securing sensitive data is more important than ever. iHerb leverages Cribl Stream to mask sensitive patterns using redaction, hashing, or randomization. These functions allow Bob and his team to mask PII for the security team.
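To make the three masking approaches concrete, here is a minimal Python sketch that applies redaction, hashing, or randomization to email addresses in a log line. It is an illustration of the general technique, not Cribl Stream's masking function; the regular expression, the function names, and the choice of email as the sensitive pattern are all assumptions made for the example.

```python
import hashlib
import re
import secrets

# Simplified email pattern, assumed for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(match):
    """Redaction: replace the value with a fixed placeholder."""
    return "[REDACTED]"


def hash_value(match):
    """Hashing: the same input always yields the same token,
    so events can still be correlated without exposing the value."""
    digest = hashlib.sha256(match.group(0).encode("utf-8")).hexdigest()
    return digest[:16]


def randomize(match):
    """Randomization: replace the value with an unrelated random token."""
    return secrets.token_hex(8)


def mask(line, strategy):
    """Apply one masking strategy to every email address in a log line."""
    return EMAIL_RE.sub(strategy, line)


if __name__ == "__main__":
    line = "login ok user=jane.doe@example.com ip=10.0.0.5"
    for strategy in (redact, hash_value, randomize):
        print(strategy.__name__, "->", mask(line, strategy))
```

The trade-off between the three is what survives masking: redaction keeps nothing, hashing keeps the ability to tell whether two events refer to the same value, and randomization keeps only the shape of the field.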
If a security incident does occur, Cribl Stream’s Replay feature allows them to selectively re-ingest data from S3 back into their systems of analysis. Going forward, they’ll also be able to use Cribl Search, which searches data in place (i.e., before it is ingested into analytics tools), to find investigation-related context across various S3 buckets.
Many teams leverage Elastic for log analysis, but it’s also a popular choice for handling metrics. iHerb uses Cribl Stream to query and aggregate log counts and other statistics based on parameters like cluster, namespace, and source. The results are then routed from Elastic into an intuitive, user-friendly Grafana dashboard, enabling them to gain valuable insights into system performance, identify trends, and troubleshoot issues effectively.
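The kind of aggregation described here, counting log events per cluster, namespace, and source, can be sketched in a few lines of Python. This is a generic illustration rather than Cribl Stream's aggregation function; the event field names and the `"unknown"` fallback are assumptions.

```python
from collections import Counter


def count_by_dimensions(events, dims=("cluster", "namespace", "source")):
    """Count events per unique combination of the given dimensions.

    Events missing a dimension are grouped under "unknown" (an assumed
    convention for this sketch).
    """
    counts = Counter()
    for event in events:
        key = tuple(event.get(dim, "unknown") for dim in dims)
        counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"cluster": "prod-1", "namespace": "checkout", "source": "nginx"},
        {"cluster": "prod-1", "namespace": "checkout", "source": "nginx"},
        {"cluster": "prod-1", "namespace": "search", "source": "app"},
    ]
    for key, total in count_by_dimensions(sample).items():
        print(key, total)
```

Reducing raw logs to per-dimension counts like these is what makes the data small enough to chart on a dashboard.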
Since successfully implementing Cribl Stream, Bob and his team have also used Cribl Edge, a centrally managed, edge-based data collection system, to deploy a couple of thousand edge nodes.
Kubernetes, an integral part of iHerb’s infrastructure, is notoriously difficult to monitor, and teams are often limited by how much of the system they can observe. iHerb deploys Kubernetes with Edge already bootstrapped to collect application logs and system metrics, giving them visibility into Kubernetes microservices.
“The combination of Cribl Stream and Edge is a lifesaver. The speed, accuracy, and ability to manipulate logs is unparalleled.”
“We got our Cribl Stream POC up and running within a week. We tested as many scenarios as we could, pushed a bunch of our logs through a test environment, then made the purchase and got our production environment going remarkably quickly.”
Cribl makes open observability a reality for today’s tech professionals. The Cribl product suite defies data gravity with radical levels of choice and control. Wherever the data comes from, wherever it needs to go, Cribl delivers the freedom and flexibility to make choices, not compromises. It’s enterprise software that doesn’t suck, enables tech professionals to do what they need to do, and gives them the ability to say “Yes.” With Cribl, companies have the power to control their data, get more out of existing investments, and shape the observability future. Founded in 2017, Cribl is a remote-first company with an office in San Francisco, CA. For more information, visit www.cribl.io or our LinkedIn, Twitter, or Slack community.