Promise chaining memory leak

Last edited: August 11, 2019

In this post we’ll walk through a memory leak pattern we recently encountered when using Javascript Promises. For those unfamiliar, a Promise is a handle to a value that is generally computed asynchronously. One of the most useful features of Promises is that they can be chained to express a series of asynchronous operations to be performed in some order, for example

Code example
asyncFetch(url) .then(content => findLinks(content)) .then(links => insertIntoDb(links)) ...

The problem

Cribl LogStream is a stream processing engine for machine data, as such it is responsible for receiving, processing and sending data to destination systems. Communication failures with destination systems is something that need to be handled gracefully, a few reasons why downstream systems can be unavailable include: network issues, system upgrades or outages etc. There are a few ways we can deal with downstream systems not being available: stop processing incoming data (ie backpressure), drop data being sent to the unavailable system or queue it for later delivery. By default Cribl LogStream chooses to backpressure, while constantly retrying to send data to the downstream system(s) – and this is where the fun begins:

The code for retrying sends looked something like this (greatly simplified for readability):

Code example
function sendWithRetry(data, dest) { return dest.send(data) .catch(error => delayMs(100).then(() => sendWithRetry(data, dest))); // retry send after 100ms }

If the destination is available  dest.send(data) would resolve, otherwise it would reject and the code above would retry the send 100ms later, ad infinitum. We want the Promise returned by sendWithRetry()to resolve only once the data has been successfully sent to the destination (so that we can stop reading and thus apply backpressure). In the pattern above we’re chaining Promises in the case of error, which would eventually resolve once destination becomes available.  However, if dest.send() rejects for a prolonged time, the above code can result in infinitely long Promise chains and since none of the Promises in the chain can be garbage collected until the chain resolves this situation ultimately results in RAM exhaustion. In our case we had 1000s of data streams attempting to be delivered to an unavailable system – ie 1000s of such Promise chains extending infinitely resulting in a process crash due to GC not being able to free up enough heap space.

Solution

The solution is to break the Promise chain and it looks like this:

Code example
function sendWithRetry(data, dest) { return dest.send(data) .catch(error => new Promise(resolve => { function retryIt() { delayMs(100).then(() => dest.send(data)).then(() => resolve()) .catch(err => retryIt()); // DO NOT return a Promise here! } retryIt(); })); }

Since retryIt()does not return a Promise the chain is never extended. In the reject condition a new Promise chain is created, which would either resolve or be thrown away and retried. The above solution also satisfies the requirement for sendWithRetry()to resolve only once the data has been successfully sent to the destination.

How/why does it work?

Let’s assume that the destination is down for a prolonged period of time, in which case the first try to send will fail and we’ll enter the retry logic. Let’s see how that works by breaking it down into two parts:

1. delayMs(100).then(() => dest.send(data)).then(() => resolve())

this code tries to send again after 100ms of delay, and on success it will resolve the Promise

2. catch(err => retryIt());

this code will call the function itself again, remember that we’ve delayed the execution by at least 100ms

Now, since we’re not returning anything from the retryIt() and not chaining with previous calls, any previous invocations of the function can be cleaned up – ie we are not creating a chain and thus no real recursion for that matter. Another way to think of it would be imagine every time send fails we schedule a timeout to retry, while on success we don’t

Another way to express the above using async/awaitwould be (thx helloimsomeone):

Code example
async function sendWithRetry(data, dest) { for (;;) { try { await dest.send(data); break; } catch { await delayMs(100); } } }

The lesson here is to be on the look out for infinitely chaining of Promises and paying special attention to recursive functions that return Promises.

If you’ve enjoyed reading this and are looking to join a kick ass engineering team drop us a line at hello@cribl.io – we’re hiring!

Cribl, the Data Engine for IT and Security, empowers organizations to transform their data strategy. Customers use Cribl’s suite of products to collect, process, route, and analyze all IT and security data, delivering the flexibility, choice, and control required to adapt to their ever-changing needs.

We offer free training, certifications, and a free tier across our products. Our community Slack features Cribl engineers, partners, and customers who can answer your questions as you get started and continue to build and evolve. We also offer a variety of hands-on Sandboxes for those interested in how companies globally leverage our products for their data challenges.

More from the blog

get started

Choose how to get started

See

Cribl

See demos by use case, by yourself or with one of our team.

Try

Cribl

Get hands-on with a Sandbox or guided Cloud Trial.

Free

Cribl

Process up to 1TB/day, no license required.