Better Asynchronous JavaScript

JavaScript is single-threaded. To avoid blocking the application when performing an Ajax request or reading a file, these operations are usually executed asynchronously. Some mechanism is then required to keep track of what is happening.

The Problem

Most asynchronous JavaScript APIs like XMLHttpRequest expose a way to assign a function, or callback, that will be invoked when the task is completed. When multiple requests need to be executed sequentially or in parallel, our code can quickly become hard to maintain and follow.

Fortunately, there are several ways to make things more manageable. Let's take a look at the various patterns available to developers today and what we can do to make async code easier to deal with.

The Past: Callbacks

The most basic pattern for handling asynchronous operations is the callback, and it is at the heart of all the methods this article will cover.

This pattern far predates JavaScript and is commonly known as continuation passing style, or CPS. An asynchronous function will take one additional argument, the continuation, and will be responsible for invoking it when the work is done. The caller will continue its own work in that callback.

To illustrate how we use callbacks, let's examine a restaurant where the staff takes orders:

var restaurant = function() {
  takeOrder(['fish', 'sandwich', 'pizza']);
};

The takeOrder function happens to be asynchronous; however, we can't cook the food until the order has been taken. We need to notify the kitchen when it can start cooking, which will require a callback:

var restaurant = function() {
  takeOrder(['fish', 'sandwich', 'pizza'], function(err, order) {
    cookFood(order);
  });
};

This callback uses the convention popularized by Node.js. It takes two arguments: the first is an error, which is null when execution succeeds and an error object on failure; the second is the function's result. Callbacks don't require this convention to work, but a standard is necessary to keep APIs consistent (read The Node Way for more). Sticking to it will also help us convert our code with the more advanced techniques later in this article.
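
To make the convention concrete, here is a hypothetical sketch of how a function like takeOrder might invoke its callback (the setTimeout merely stands in for real asynchronous work, and the error condition is invented for illustration):

var takeOrder = function(items, callback) {
  // Simulate asynchronous work
  setTimeout(function() {
    if (!items || items.length === 0) {
      callback(new Error('nothing was ordered')); // failure: pass an error first
    } else {
      callback(null, items); // success: null error, then the result
    }
  }, 100);
};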

The cookFood function is also asynchronous, but customers can't start eating their food until it's finished cooking. We'll need another callback for the kitchen to notify when cookFood is done. Since the order contained an array of food items, we'll return the meals as an array too:

var restaurant = function() {
  takeOrder(['fish', 'sandwich', 'pizza'], function(err, order) {
    cookFood(order, function(err, meals) {
      eat(meals[0]);
      eat(meals[1]);
      eat(meals[2]);
    });
  });
};

Here's where things become more complicated. The eat function is also asynchronous. All the customers start eating at the same time, and there is no way to know when each of them will finish. We want to bring the check, but we have to wait until all three customers are done. We need some way to keep track of this.

To solve this simply, we'll use a variable to count down as customers finish eating (note that we have to handle the results individually each time the callback is invoked):

var restaurant = function(complete) {
  takeOrder(['fish', 'sandwich', 'pizza'], function(err, order) {
    cookFood(order, function(err, meals) {
      var counter = 3;
      var payment = 0;
      var waitForCustomer = function(err, money) {
        counter--;
        payment += money;
        if(counter < 1) {
          complete(null, payment);
        }
      };

      eat(meals[0], waitForCustomer);
      eat(meals[1], waitForCustomer);
      eat(meals[2], waitForCustomer);
    });
  });
};

To further complicate things, our restaurant's kitchen isn't very reliable. While any part of our code could result in an error, we specifically want to report problems with getting food to the customers. In cookFood's callback, we'll check whether err contains a value, and if it does, we'll stop there:

var restaurant = function(complete) {
  takeOrder(['fish', 'sandwich', 'pizza'], function(err, order) {
    cookFood(order, function(err, meals) {
      if(err) {
        console.log(err.message);
        complete(err);
      } else {
        var counter = 3;
        var payment = 0;
        var waitForCustomer = function(err, money) {
          counter--;
          payment += money;
          if(counter < 1) {
            complete(null, payment);
          }
        };

        eat(meals[0], waitForCustomer);
        eat(meals[1], waitForCustomer);
        eat(meals[2], waitForCustomer);
      }
    });
  });
};

There are a few problems with the callback approach that lead us to look for better alternatives: every step adds another level of nesting, errors have to be checked and forwarded by hand at every level, and coordinating parallel work means writing our own bookkeeping code like the counter above.

These problems have led the community to dub the situation where an application accumulates too many nested callbacks "callback hell".

Even with all these problems, it is important to understand how callbacks work, as we will keep building on top of them to make our lives easier.

There are libraries that help ease the pain of working with callbacks. A notable example is the fantastic async.js, which helps eliminate nesting and handles the parallel scenario for us. It also has features for more advanced use cases such as throttling (what if only two customers could eat at a time?).
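
As a rough sketch of what that buys us, here is the same restaurant flow written with async.js, assuming takeOrder, cookFood and eat keep the node-style callbacks used above: async.waterfall flattens the sequential steps, and async.map runs the three meals in parallel and collects the payments.

var async = require('async');

var restaurant = function(complete) {
  async.waterfall([
    function(next) { takeOrder(['fish', 'sandwich', 'pizza'], next); },
    function(order, next) { cookFood(order, next); },
    function(meals, next) { async.map(meals, eat, next); } // eat all meals in parallel
  ], function(err, monies) {
    if (err) { return complete(err); }
    var payment = 0;
    monies.forEach(function(money) { payment += money; });
    complete(null, payment);
  });
};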

The Present: Promises

If you target only modern browsers and newer versions of Node.js/io.js, or are willing to use a third-party library, a better option becomes available: promises. Promises attempt to solve the shortcomings of callbacks. They serve as a wrapper around asynchronous operations, making them chainable and composable and allowing consistent error handling.

You may have heard promises referred to as tasks or futures, though depending on the language these can have different semantics than JavaScript promises. Promises are also related to the Continuation monad; see this discussion for some background on monads and promises.

The core idea behind this pattern is that a function responsible for asynchronous work returns a promise. The promise object has a standard interface, the most important part being the .then method. .then takes a callback as its first argument. If the asynchronous work has already completed, the callback is invoked right away, even if it is attached long after. If the work has not completed yet, the callback will be invoked once it does.

Objects with a .then method are known as thenables. Not all thenables are promises, but they are often still compatible. See the Promises/A+ specification for more details.
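
To illustrate that last point, here's a minimal sketch using the built-in Promise.resolve to create an already-fulfilled promise; the callback still runs even though .then is attached a full second later:

var done = Promise.resolve(42); // already fulfilled

setTimeout(function() {
  done.then(function(value) {
    console.log(value); // logs 42, even though .then was attached after the fact
  });
}, 1000);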

There are many benefits that come with promises: asynchronous operations become chainable and composable, errors can be handled consistently in one place, and results can be consumed even after the work has already completed.

The promise API can be seen as a standard, generic and more scalable way to work with callbacks. The callbacks are still there, but promises force us to keep them clean.

Promises are part of EcmaScript 6. They're available out of the box in newer browsers and versions of Node.js. Libraries such as Q, Bluebird and Angular allow us to use promises in older environments, or with additional features.

Let's revisit our restaurant, now with promises. First, we take the order. Notice that this time, takeOrder has a return value (the promise), so we'll return it to the caller:

var restaurant = function() {
  return takeOrder(['fish', 'sandwich', 'pizza'])
};

To pass the order to the kitchen, we chain .then with a callback that takes the order as an argument. We return the result of cookFood (also a promise) to allow further chaining.

var restaurant = function() {
  return takeOrder(['fish', 'sandwich', 'pizza'])
    .then(function(order) {
      return cookFood(order);
    });
};

If we stopped here, the application calling restaurant() would be able to keep the chain going with its own .then, giving it access to the result of cookFood. For example, it would be possible to do:

restaurant().then(function(meals){
  console.log(meals); 
});

We'd rather have our customers enjoy their meal though, so we'll add them to the chain instead.

var restaurant = function() {
  return takeOrder(['fish', 'sandwich', 'pizza'])
    .then(function(order) {
      return cookFood(order);
    })
    .then(function(meals) {
      eat(meals[0]);
      eat(meals[1]);
      eat(meals[2]);
    });
};

If eat also returns a promise, we run into a problem: to keep the chain going, we need to return the result of eat, but we can only return one value. The Promise API provides Promise.all, which combines multiple promises into a single one when given an array. The next step of the chain receives an array of results, allowing us to handle all the values at the same time.

var restaurant = function() {
  return takeOrder(['fish', 'sandwich', 'pizza'])
    .then(function(order) {
      return cookFood(order);
    })
    .then(function(meals) {
      return Promise.all([eat(meals[0]), eat(meals[1]), eat(meals[2])]);
    })
    .then(function(monies) {
      var total = 0;
      monies.forEach(function(r){ total += r; });
      return total;
    });
};

To take care of our kitchen's high failure rate, we can pass a second argument to .then, which will only be invoked if there is an error. Returning a rejected promise via Promise.reject will break the chain. One thing to note is that errors with no rejection handler attached anywhere in the chain will be lost. Promise libraries each have their own way of dealing with unhandled rejections, such as logging them to the console or allowing developers to specify a catch-all function.

var restaurant = function() {
  return takeOrder(['fish', 'sandwich', 'pizza'])
    .then(function(order) {
      return cookFood(order);
    })
    .then(function(meals) {
      return Promise.all([eat(meals[0]), eat(meals[1]), eat(meals[2])]);
    }, function(err) {
      console.log(err.message);
      return Promise.reject(err);
    })
    .then(function(monies) {
      var total = 0;
      monies.forEach(function(r){ total += r; });
      return total;
    });
};

Let's reiterate how our restaurant was improved: the deeply nested callbacks are gone, the steps now read as a flat chain, parallel work is coordinated by Promise.all instead of a hand-rolled counter, and errors flow to a single handler.

Libraries like Q and Bluebird let us convert node-style asynchronous functions into functions that return promises. Given a function that takes a callback of the form function(err, result) {...} as its last argument, these utilities create a new function that returns a promise. This is one of the big benefits of sticking to the node standard when using the callback style.
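
With Bluebird, for instance, this is a one-liner via Promise.promisify (Q offers Q.denodeify for the same job). The sketch below assumes takeOrder follows the node convention shown earlier:

var Promise = require('bluebird');

// Creates a new function that returns a promise instead of taking a callback
var takeOrderAsync = Promise.promisify(takeOrder);

takeOrderAsync(['fish', 'sandwich', 'pizza']).then(function(order) {
  console.log(order);
});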

Toward the Future with Generators

Promises make things a lot easier, but the code is still significantly harder to read than synchronous code, and error handling still involves callbacks and workarounds rather than native language constructs like try/catch.

To do better, we need enhancements in the language itself. We'll need generators. A generator function is a special kind of function that enables control flow with yield expressions. Invoking it doesn't run its body; it returns an iterator object that is used to communicate with it, and the function stays paused until the iterator tells it to execute. It then runs until it encounters a yield, at which point execution stops and control returns to the caller.

Generators require EcmaScript 6 (sometimes called EcmaScript 2015) support and are only available in recent environments such as io.js. To use ES6 across a variety of browsers, or in Node.js without feature flags, you can use Babeljs, a fantastic ES6-to-ES5 compiler that includes Regenerator. Bonus points: it supports JSX/React too! Another alternative is Traceur from Google.

Take the following generator function.

function* foo() {
  console.log('execution starts');
  yield 1;
  yield 2;
  yield 3;
  console.log('execution completes');
}

The star * is used to define a function as a generator. You can only use the yield keyword in such a "star" function.

Let's break down what happens when we use the function foo. Calling foo() doesn't execute any of its body; it simply returns an iterator. The first call to the iterator's .next() runs the function until the first yield and returns { value: 1, done: false }. Each subsequent .next() resumes execution up to the next yield, and once the function body runs to the end, .next() returns { value: undefined, done: true }.
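
Stepping through foo from the console makes this concrete:

var it = foo();  // nothing runs yet; we only get an iterator back

it.next();  // logs "execution starts", returns { value: 1, done: false }
it.next();  // returns { value: 2, done: false }
it.next();  // returns { value: 3, done: false }
it.next();  // logs "execution completes", returns { value: undefined, done: true }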

A generator function could also call yield in a loop, for example while reading rows from a database, as convenient sugar for creating an iterator over the data. It can of course contain regular code (such as the console.log statements in the example above) that will be executed normally until the next yield or the end of the function.

The concept of a function that stops its execution and yields to its caller, until it is told to resume is also referred to as a coroutine.

Combined with promises, the ability to stop execution and come back as necessary is exactly the language construct we need to abstract away our callbacks altogether: yield a promise wherever we want our code to wait, just as it would if we were writing synchronous code, and in the promise's .then callback, call .next on the iterator to resume.

var myFoo;

function* foo() {
  console.log("before the asynchronous task");
  // Pause here; the .then callback resumes the generator once the work is done
  yield workThatReturnsAPromise().then(function() {
    myFoo.next();
  });
  console.log("after the asynchronous task has completed");
}

myFoo = foo();  // get the iterator
myFoo.next();   // start execution; runs until the yield

The code above isn't very generic: we need to keep track of the iterator ourselves and add a .then to every promise just to resume execution. This boilerplate adds up.

A generic wrapper can take care of this boilerplate for us: it creates the iterator, starts it with .next(), and each time a promise is yielded it waits for it, resuming the iterator with the resolved value (or throwing the rejection back into it), until the generator completes and the wrapper can resolve a promise with the final result.
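
A minimal sketch of such a wrapper might look like this (the name spawn is just illustrative, and the sketch assumes the native Promise API and that every yielded value is a promise):

function spawn(generatorFn) {
  return function() {
    var iterator = generatorFn.apply(this, arguments);

    function step(result) {
      if (result.done) {
        // the generator has finished; resolve with its return value
        return Promise.resolve(result.value);
      }
      // wait for the yielded promise, then resume the iterator with its value,
      // or throw the rejection back into the generator
      return Promise.resolve(result.value).then(
        function(value) { return step(iterator.next(value)); },
        function(err) { return step(iterator.throw(err)); }
      );
    }

    try {
      return step(iterator.next());
    } catch (err) {
      return Promise.reject(err);
    }
  };
}

With this in place, restaurant could be written as spawn(function* () { ... }) and called like any other promise-returning function.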

Q and Bluebird have already done this for us with Q.async and Promise.coroutine, respectively.

Generator functions also help with error handling: we can use fooIterator.throw('boom!') to throw the error inside the generator at the point where it is paused, just like a synchronous exception. This saves us from needing error callbacks and gives us the ability to use JavaScript's try/catch blocks to handle exceptions.
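
Here's a quick standalone sketch of .throw in action, using a plain yielded value rather than a promise to keep it short:

function* bar() {
  try {
    yield 1;
  } catch (err) {
    console.log('caught: ' + err.message); // "caught: boom!"
  }
}

var it = bar();
it.next();                     // run until the first yield
it.throw(new Error('boom!'));  // the error is thrown at the paused yield and caught above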

Moving Forward: Yielding Promises

Let's see how we could leverage Bluebird's .coroutine method to simplify our restaurant.

For ease of use, we'll start by assigning the coroutine function to an alias, async. We'll then make our restaurant function a generator with the * and start taking orders.

var async = Bluebird.coroutine;
var restaurant = async(function* () {
  takeOrder(['fish', 'sandwich', 'pizza']);
});

To retrieve the value of the asynchronous task, all we need to do is yield the promise. The async helper will take care of correctly resuming execution and returning the result of the task.

var async = Bluebird.coroutine;
var restaurant = async(function* () {
  var order = yield takeOrder(['fish', 'sandwich', 'pizza']);
  cookFood(order);
});

Chaining promises (remember: under the hood, these are still the same methods as in the promise example) now looks like synchronous code, except that we yield whenever we need the result of an asynchronous task.

var async = Bluebird.coroutine;
var restaurant = async(function* () {
  var order = yield takeOrder(['fish', 'sandwich', 'pizza']);
  var meals = yield cookFood(order);
  eat(meals[0]);
  eat(meals[1]);
  eat(meals[2]);
});

Again, it's all just promises. To wait on multiple parallel tasks, use Promise.all. One thing to keep in mind is that the Bluebird wrapper only accepts promises from yield by default, so a plain value would need to be wrapped with Promise.resolve before being yielded. The total itself doesn't need to be yielded at all: we simply return it, and the wrapper resolves its promise with that return value.

var async = Bluebird.coroutine;
var restaurant = async(function* () {
  var order = yield takeOrder(['fish', 'sandwich', 'pizza']);
  var meals = yield cookFood(order);
  var total = 0;
  var monies = yield Promise.all([eat(meals[0]), eat(meals[1]), eat(meals[2])]);
  monies.forEach(function(r){ total += r; });
  return total;
});

Bluebird's .coroutine only works with promises by default, so everything we yield has to be a promise. If you want things to be even more seamless, you can use yield handlers to teach the wrapper about other values. The code is still quite readable without them, though.
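
As a sketch, a commonly used yield handler treats any yielded array as Promise.all of its contents, which would let the parallel step above be written as yield [eat(meals[0]), eat(meals[1]), eat(meals[2])]:

// Registered once, globally: whenever a coroutine yields an array,
// wait on all of its promises with Promise.all
Promise.coroutine.addYieldHandler(function(value) {
  if (Array.isArray(value)) {
    return Promise.all(value);
  }
});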

Error handling is a lot more intuitive: the async wrapper throws rejected promises back into the generator using the iterator's .throw method, giving us the ability to handle them with a regular try/catch.

var async = Bluebird.coroutine;
var restaurant = async(function* () {
  var order = yield takeOrder(['fish', 'sandwich', 'pizza']);
  try {
    var meals = yield cookFood(order);
    var total = 0;
    var monies = yield Promise.all([eat(meals[0]), eat(meals[1]), eat(meals[2])]);
    monies.forEach(function(r){ total += r; });
    return total;
  } catch(err) {
    console.log(err.message);
  }
});

We gained a lot by adding generators to the mix: the asynchronous steps now read from top to bottom like synchronous code, and errors are handled with a plain try/catch instead of dedicated callbacks.

By introducing generators, we cut down our code by roughly 30% and made it far more readable by eliminating all the callbacks and replacing them with native JavaScript constructs. The only thing left is getting rid of the wrapper and making the role of the yield keyword more intuitive.

To infinity, and beyond!: async and await

Using generators to yield promises has so many benefits that many people think the pattern should be part of the language. Similar concepts have already been introduced in other languages such as C#.

For JavaScript, there is the async functions proposal for ES7 (yes, ES7, not ES6; the proposal is only at stage 1 as of this writing). Async functions take the patterns we used with generators and promises and make them part of the language as keywords.

If you use Babeljs and enable experimental features with the --stage 1 switch, you can try them today.

Let's rewrite our restaurant one more time with async functions:

var restaurant = async function() {
  var order = await takeOrder(['fish', 'sandwich', 'pizza']);
  try {
    var meals = await cookFood(order);
    var total = 0;
    var monies = await Promise.all([eat(meals[0]), eat(meals[1]), eat(meals[2])]);
    monies.forEach(function(r){ total += r; });
    return total;
  } catch(err) {
    console.log(err.message);
  }
};

This isn't much different from the generator-based version: await takes the place of yield, and there is no longer any need for a wrapper.

This syntax is just an early proposal and is subject to change. The proposal is fairly popular, and if you sit in the chatrooms of bleeding-edge projects like Babel or Aurelia, you'll hear from people who swear by it. We still don't know what form it will take when it is officially accepted into the language, or even whether it will be accepted at all.

There's also the issue of tooling. JSHint, one of the more popular JavaScript linters, handles a large portion of ES6 but doesn't support async/await. ESLint has it on its list, but at a low priority. Other tools and IDEs that support ES6 may also fail to parse it. There are various forks, plugins, and hacks to get it to work. It's up to you to decide whether a bit of extra syntactic sugar over the Q/Bluebird wrappers is worth the trouble.

A note on CoffeeScript

As of 1.9, CoffeeScript supports the yield keyword and generator functions. In CoffeeScript, the * isn't necessary: any function containing a yield is automatically compiled to an ES6 generator function. Unlike Babel, however, CoffeeScript doesn't have an option to convert generator functions to ES5 with Regenerator, so you'll have to run Regenerator yourself after the CoffeeScript compile step, unless you're targeting a platform that supports generators natively, such as io.js.

With generator functions available, you are free to use Q.async or Bluebird's .coroutine to simplify your asynchronous code. If you want something closer to the real async/await syntax, you can look at more exotic flavors of the language, such as Iced CoffeeScript, or push for support in CoffeeScript itself.

Wrapping it all up

Everything shown here is available today. Which one you can use depends on your target environment and what compromises you are willing to make.

  1. Callbacks are always an option, regardless of browsers, environments and libraries. Libraries like async.js can help keep them manageable without introducing more advanced concepts.
  2. Promises are only native to the newest browser and Node.js versions, but they have robust library implementations, including in frameworks such as AngularJS.
  3. Generators are available in Node.js (behind a feature flag) and io.js, but otherwise require a compiler.
  4. Async functions are still in the early stages of being accepted into the ES7 standard. You'll need a JavaScript compiler to use them at all.

With all these options available now and in the future, there's no reason to fall into callback hell. Keep it clean, and enjoy all the benefits of asynchronous JavaScript!