The IT team uses one tool for log analysis and another for metrics. The security team uses yet another tool for Security Information and Event Management (SIEM), and the development teams have additional tooling for product logs, errors, and metrics. Unfortunately, each of these tools has its own mechanism for ingesting data, and they're isolated from each other. The result is multiple "agents" installed on each system just to feed the tools, each agent adding its own overhead.
Imagine reducing that agent count down to one or two and being able to feed *all* of the tools from a single pipeline, transforming the data as appropriate for each tool. That pipeline can now feed network device data to the developers' tooling, letting them correlate application errors with their server's switch port flapping and resolve the issue quickly. And where two tools previously reported different values for the same metric, they now show the same value, simply because each tool is receiving the same data.
Context is King. A lot of the data we get, in the form of logs, is barely useful without context. Take the port flap mentioned above: a flapping port doesn't matter unless it's connected to something important, and the log entry for that port flap won't tell you what it's connected to. What if you could add data from your CMDB to that line, like the server the port is connected to, the application that server runs, and the business process it supports? Now you've got the context you need to understand the impact and respond accordingly.
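A minimal sketch of that kind of lookup-based enrichment (the `CMDB` dict, the field names, and the port key format here are all hypothetical stand-ins for a real CMDB export or API):

```python
# Hypothetical CMDB export keyed by switch:port; in practice this would
# come from a lookup file or a CMDB API, not a hard-coded dict.
CMDB = {
    "switch01:Gi1/0/24": {
        "server": "app-db-01",
        "application": "order-service",
        "business_process": "checkout",
    },
}

def enrich_port_event(event: dict) -> dict:
    """Attach server/application/business context to a port-flap event."""
    key = f"{event['switch']}:{event['port']}"
    event.update(CMDB.get(key, {}))
    return event

print(enrich_port_event({"switch": "switch01", "port": "Gi1/0/24",
                         "msg": "link flap detected"}))
```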
Or say you have a huge amount of one kind of log data, you only care about a subset of it based on external information, and ingesting all of it into your analysis system is prohibitively expensive. This is exactly the situation one of our customers found themselves in: they had too much DNS log data to ingest, but they really only cared about the records that didn't match "trusted" domains. So they enriched the data with a list of trusted domains and filtered out records from those domains, ingesting only the log data they needed for analysis. This reduced their ingestion requirement by orders of magnitude, making the approach affordable for them.
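A rough sketch of that filter, assuming the trusted-domain list has been loaded into a set (the field names are illustrative, not the customer's actual schema):

```python
# Hypothetical trusted-domain list; in the real scenario this came from
# an external enrichment source, not a hard-coded set.
TRUSTED = {"example.com", "corp.internal"}

def is_trusted(domain: str) -> bool:
    """True if the queried domain is, or is under, a trusted domain."""
    parts = domain.lower().rstrip(".").split(".")
    return any(".".join(parts[i:]) in TRUSTED for i in range(len(parts)))

def keep(dns_record: dict) -> bool:
    # Only forward queries that do NOT match the trusted list.
    return not is_trusted(dns_record["query"])

records = [{"query": "www.example.com"}, {"query": "suspicious.xyz"}]
print([r["query"] for r in records if keep(r)])  # ['suspicious.xyz']
```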
Another great use case is adding GeoIP information to the data as it comes in. Sure, you can do that at search time in Splunk, but if you have multiple tools, you have to figure out how to do it in every one of them. If you do the lookup once, before sending the data to the downstream systems, every downstream system benefits from it: less maintenance and consistent results across the board.
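A sketch of doing that lookup once, up front, using the `geoip2` Python package against a MaxMind database file (the database path and event field names are assumptions):

```python
import geoip2.database
import geoip2.errors

reader = geoip2.database.Reader("GeoLite2-Country.mmdb")

def add_geo(event: dict) -> dict:
    """Attach the requestor's country code once, before fan-out,
    so every downstream tool sees the same enriched field."""
    try:
        event["country"] = reader.country(event["client_ip"]).country.iso_code
    except geoip2.errors.AddressNotFoundError:
        event["country"] = None
    return event
```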
Often, log files contain incredibly valuable information, but it needs to be extracted from the log entries and aggregated to be useful. Weblog entries, for example, are rarely valuable individually. While what someone is looking for might vary, it's usually the metrics about access that matter, not the individual accesses. For example, let's say you have 1,000 lines of weblog data similar to this:
```
128.241.220.82 - - [03/Apr/2020:20:30:05 +0000] "GET /static/jquery.js?&JSESSIONID=SD2581716739$SL2122330098FF8932042391ADFF3720110694 HTTP/1.1" 200 2484 "/cart.do?action=view&itemId=EST-16&product_id=MC-SANDISK-MICROSD16GB" "Mozilla/5.0 (iPad; CPU OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.3"
64.66.0.20 - - [03/Apr/2020:20:30:05 +0000] "GET /product.screen?product_id=HS-MONST-NERGY&JSESSIONID=SD1837132548$SL4493168124FF7003251314ADFF2222394401 HTTP/1.1" 404 3818 "/product.screen?product_id=BT-HS-JAWB-ICONTHD" "BlackBerry9300/5.0.0.955 Profile/MIDP-2.1 Configuration/CLDC-1.1 VendorID/102"
12.130.60.4 - - [03/Apr/2020:20:30:03 +0000] "POST /category.screen?category_id=ACCESSORIES&JSESSIONID=SD8687719920$SL6155682857FF6085796020ADFF1246778254 HTTP/1.1" 400 2967 "/product.screen?product_id=CC-T11-ZAGG-FOLIO" "Mozilla/5.0 (iPad; U; CPU OS 4_3_5 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8L1 Safari/6533.18.5"
130.253.37.97 - - [03/Apr/2020:20:30:05 +0000] "POST /product.screen?product_id=BT-SP-JAWB-JAMBOXBIG&JSESSIONID=SD8401052943$SL2691867954FF9065133477ADFF6965824981 HTTP/1.1" 404 722 "/cart.do?action=addtocart&itemId=EST-12&product_id=BA-HTC-REZOUND" "Mozilla/5.0 (iPad; U; CPU OS 4_3_3 like Mac OS X; de-de) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8J2 Safari/6533.18.5"
194.8.74.23 - - [03/Apr/2020:20:30:05 +0000] "GET /cart.do?action=changequantity&itemId=EST-18&product_id=BA-MOPHIE-JUICEPACKPLUS&JSESSIONID=SD8190965089$SL7522258463FF7229085117ADFF6846367911 HTTP/1.1" 200 3758 "/category.screen?category_id=MEMORYCARDS" "Mozilla/5.0 (iPad; U; CPU OS 4_3_5 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8L1 Safari/6533.18.5"
125.17.14.100 - - [03/Apr/2020:20:30:05 +0000] "GET /static/6051.jpg?&JSESSIONID=SD1073290485$SL6642531837FF5469045339ADFF7796274172 HTTP/1.1" 200 846 "/category.screen?category_id=CHARGERS" "Mozilla/5.0 (Linux; U; Android 2.3.4; en-us; T-Mobile G2 Build/GRJ22) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"
130.253.37.97 - - [03/Apr/2020:20:30:04 +0000] "GET /product.screen?product_id=AC-MOTO-HOTSPOT4G&JSESSIONID=SD8576365728$SL9394190596FF4303629878ADFF2344394698 HTTP/1.1" 200 2410 "/category.screen?category_id=BLUETOOTH" "Mozilla/5.0 (Linux; U; Android 2.3.4; en-us; DROID3 Build/5.5.1_84_D3G-55) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"
27.175.11.11 - - [03/Apr/2020:20:30:02 +0000] "GET /static/9403.jpg?&JSESSIONID=SD7103245756$SL6669302782FF9250881909ADFF9216942956 HTTP/1.1" 200 3346 "/category.screen?category_id=BLUETOOTH" "Mozilla/5.0 (iPad; CPU OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.3"
195.69.160.22 - - [03/Apr/2020:20:30:03 +0000] "GET /category.screen?category_id=CASES&JSESSIONID=SD5212008800$SL9669846961FF8508958439ADFF3355227402 HTTP/1.1" 200 3399 "/category.screen?category_id=BATTERIES" "Mozilla/5.0 (Linux; U; Android 2.3.4; en-us; DROID3 Build/5.5.1_84_D3G-55) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"
86.9.190.90 - - [03/Apr/2020:20:30:05 +0000] "GET /category.screen?category_id=BATTERIES&JSESSIONID=SD5330660580$SL5721426140FF1739646253ADFF1837268871 HTTP/1.1" 503 3561 "/category.screen?category_id=CASES" "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_5 like Mac OS X; en-gb) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8L1 Safari/6533.18.5"
```
If you're really just interested in how many times each page was accessed, you can summarize a count of the hits, grouped by page URI (minus the query string), filtering out the images embedded in the pages:
| Page URI | Hits |
|---|---|
| /category.screen | 220 |
| /product.screen | 246 |
| /static/jquery.js | 38 |
| /cart.do | 30 |
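As a rough illustration (not Cribl's implementation), here's how that aggregation might look in Python over lines like the ones above:

```python
# Pull the request path out of each line, drop the query string,
# skip embedded images, and count hits per page.
import re
from collections import Counter

REQUEST = re.compile(r'"(?:GET|POST) ([^ ?"]+)')
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif")

def page_counts(lines):
    hits = Counter()
    for line in lines:
        m = REQUEST.search(line)
        if m and not m.group(1).lower().endswith(IMAGE_EXTS):
            hits[m.group(1)] += 1
    return hits

# Usage: page_counts(open("access.log"))
# -> Counter({'/product.screen': 246, '/category.screen': 220, ...})
```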
Say you also wanted a breakdown of the same accesses by the requestor's location. You can use a GeoIP tool, like the MaxMind GeoIP database, to look up each requestor's IP address, enrich the data with the result, and then summarize a count of hits grouped by the requestor's country:
| Country | Hits |
|---|---|
| US | 403 |
| UK | 234 |
| Korea (South) | 112 |
| India | 213 |
| Spain | 18 |
| Bahamas | 20 |
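Putting enrichment and aggregation together, a sketch of the country breakdown that reuses the `add_geo` helper from earlier (`events` is assumed to be an iterable of parsed log events, each with a `client_ip` field):

```python
from collections import Counter

# Tag each event with the requestor's country, then count hits per country.
by_country = Counter(add_geo(event)["country"] for event in events)
# e.g. Counter({'US': 403, 'UK': 234, 'IN': 213, ...})
```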
By extracting the values and creating the aggregates "in the stream," you have the needed metrics ready before the data lands anywhere, so you can send just the aggregated metrics to the analysis/reporting system instead of the full logs. Need the metrics data in multiple tools? The data can be delivered to each one in the format it expects, such as Splunk metrics or the statsd format.
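For instance, the same aggregate can be serialized once per destination; here's a sketch emitting a statsd counter line and a generic JSON metric event (the metric name and the JSON field names are illustrative):

```python
import json

def to_statsd(name: str, value: int) -> str:
    return f"{name}:{value}|c"          # statsd counter line protocol

def to_json_metric(name: str, value: int) -> str:
    return json.dumps({"metric": name, "value": value})

print(to_statsd("web.hits.product_screen", 246))   # web.hits.product_screen:246|c
print(to_json_metric("web.hits.product_screen", 246))
```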
No two ways about it: logs are noisy. In the application development world, it's far more expensive to go back into the code to add new elements to logging than to simply log everything up front. Unfortunately, that means you end up with a lot of information in the logs you don't want. For example, look at the following excerpt from an AWS API Gateway log entry:
{ "resource": "/done", "path": "/done", "httpMethod": "POST", "queryStringParameters": null, "multiValueQueryStringParameters": null, "pathParameters": null, "stageVariables": null, "requestContext": { "resourcePath": "/done", "httpMethod": "POST", "identity": { "user": null }, } }
Several of these fields have null values, which provide marginal value, if any, in analysis. Removing those null-valued fields reduces the data ingested into the analysis system(s). It may not seem like much, but as you scale up, it adds up very quickly. If retention has been separated from analysis, you'll also have the freedom to cut out any fields you don't think are valuable to your analysis, or even whole records. Of course, you'll want to be careful, since you may find a use for the removed fields later. Since you retain the raw logs, though, you can always re-ingest the data.
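A minimal sketch of that transformation (not Cribl's implementation), stripping null-valued fields from an event like the one above:

```python
import json

def drop_nulls(obj):
    """Remove null-valued fields, recursing into nested objects/arrays."""
    if isinstance(obj, dict):
        return {k: drop_nulls(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        return [drop_nulls(v) for v in obj]
    return obj

event = {
    "resource": "/done",
    "queryStringParameters": None,
    "requestContext": {"httpMethod": "POST", "identity": {"user": None}},
}
print(json.dumps(drop_nulls(event), indent=2))
# Keeps "resource" and "requestContext"; drops the null-valued fields.
```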
Though there are some great use cases here, this is just scratching the surface. I'm sure each of you reading this has a unique need that these kinds of capabilities can help solve. Our product, Cribl LogStream, provides these capabilities, and I encourage you to take a drive through our interactive sandbox environment to see how LogStream could help with your needs.