Before reading this article, make sure you are familiar with how Node.js works. Specifically, it is important to understand how its event loop works and how it is able to run your code on a single thread. There are some great articles that cover all of this in detail, so refer to them to get the most out of this blog post.

About Node.js
Understanding the Node.js Event Loop
How the Single Threaded Non-Blocking IO Model Works in Node.js


While implementing a RESTful API for a MongoDB backend using Node.js, I encountered a subtle issue with the code that retrieves everything from the database. I was originally using the following code to retrieve the data and send it to the response object:

NOTE: I am using the mongojs module

mongo.js

getAll(callback) {  
    this.db.collection(this.collection).find((err, docs) => {
        return callback(err ? err : null, err ? null : docs);
    });
}

route.js

mongoDriver.getAll((err, result) => {  
    if (err) {
        err.status = code.HTTP_INTERNAL_SERVER_ERROR;
        return next(err);
    } else if (!result) {
        const err = new Error('no more people in the database');
        err.status = code.HTTP_NO_CONTENT;
        return next(err);
    } else {
        res.status(code.HTTP_OK).json(result);
    }
});

This code is clean, easy to understand, and works just fine. However, it buffers the entire result set before responding, so it is not truly non-blocking!

In the mongo.js code, an array is returned via callback to the route (the mongojs module calls .toArray() for us behind the scenes) and then that whole array is sent to the response object in one big chunk. If you know anything about Node.js, then you know that this is less than ideal. What if your dataset is extremely large? Your data may not fit in memory in one pass. Even if it does, serializing and sending that one big array keeps the single thread busy, so other requests have to wait until it is done.

How do we fix this? Here’s the code:

mongo.js

getAll(callback) {
    this.db.collection(this.collection).find().forEach((err, doc) => {
        if (err) {
            return callback(err);
        } else if (!doc) {
            return callback(null, null);
        } else {
            return callback(null, doc);
        }
    });
}

route.js

// Number of documents written to the response so far
let count = 0;

mongoDriver.getAll((err, result) => {
  if (err) {
      err.status = code.HTTP_INTERNAL_SERVER_ERROR;
      return next(err);
  } else if (!result && count === 0) {
      const err = new Error('no more people in the database');
      err.status = code.HTTP_NO_CONTENT;
      return next(err);
  } else if (!result && count > 0) {
      res.end(']');
  } else {
      if (count === 0) {
          res.set('Content-Type', 'application/json');
      }
      res.status(code.HTTP_OK);
      res.write((count++ === 0 ? '[' : ',') + JSON.stringify(result));
  }
});

You might be wondering what happened to our clean and easily readable code? Let me explain what’s going on and maybe you’ll be okay with it.

In the mongo.js code, the database gives us a cursor, which we iterate over. Each time we encounter a new mongo document, we hand it to the route via the callback. Still clean and simple.

However, in the route.js code things get a bit more complex. This is because we are now responsible for ensuring that what we send to the response is valid JSON. Since we are writing mongo documents to the response one by one, we need to manually create the JSON array structure around it.
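To make the framing concrete, here is a minimal sketch of just the array-building part, with no database or HTTP involved. The helper name arrayChunk and the sample documents are made up for illustration; the idea is only that the first document is prefixed with an opening bracket, every later one with a comma, and a closing bracket is appended at the end.

```javascript
// Hypothetical helper (not part of mongojs or Express): returns the text to
// write for the document at position `index` in the stream.
function arrayChunk(doc, index) {
  const prefix = index === 0 ? '[' : ',';
  return prefix + JSON.stringify(doc);
}

// Simulate writing three documents one by one, then closing the array.
const sampleDocs = [{ name: 'Ada' }, { name: 'Grace' }, { name: 'Linus' }];
let body = '';
sampleDocs.forEach((doc, i) => {
  body += arrayChunk(doc, i);
});
body += ']';

console.log(body);
// [{"name":"Ada"},{"name":"Grace"},{"name":"Linus"}]
console.log(JSON.parse(body).length);
// 3
```

The concatenated chunks parse as a regular JSON array, which is all the client ever sees.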

Another layer of complexity arises because we want our status codes to be accurate. For example, if there are no documents in the database, we will receive null from the mongo cursor. However, null is also returned after we finish receiving documents from the database. If there was never any data in the database, we want a different status code than if we had data and have just finished reading it.

Here is what this code does:

  1. If there is an error, pass that error on to the error-handling route with return next(err).
  2. If there was never any data in the database, send a 204 status and end the request.
  3. If there was data, manually keep a count so that we can write the array brackets at the beginning and end as well as the commas in between mongo documents (already in JSON). Note that we can only set headers once, so we set the Content-Type once (when we receive the first item).
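The three steps above can be tried out without a database or a server. The following is only a sketch under some assumptions: fakeGetAll and fakeResponse are hypothetical stand-ins for mongoDriver.getAll and the Express response object, streamAll is a made-up wrapper around the route callback, and the status codes are written as plain numbers instead of the code.HTTP_* constants. Unlike the route above, it sets the status only once, together with the Content-Type.

```javascript
// Hypothetical stand-in for mongoDriver.getAll: calls back once per document,
// then once more with null to signal the end of the cursor.
function fakeGetAll(docs, failWith) {
  return (callback) => {
    if (failWith) return callback(failWith);
    docs.forEach((doc) => callback(null, doc));
    callback(null, null);
  };
}

// Hypothetical stand-in for the response object; it only records what was sent.
function fakeResponse() {
  return {
    statusCode: null,
    headers: {},
    body: '',
    ended: false,
    status(code) { this.statusCode = code; return this; },
    set(name, value) { this.headers[name] = value; return this; },
    write(chunk) { this.body += chunk; },
    end(chunk) { this.body += chunk || ''; this.ended = true; },
  };
}

// The route logic from the list above, wrapped in a function for testing.
function streamAll(getAll, res, next) {
  let count = 0; // documents written so far
  getAll((err, doc) => {
    if (err) {                      // step 1: forward the error
      err.status = 500;
      return next(err);
    }
    if (!doc && count === 0) {      // step 2: there was never any data
      const empty = new Error('no more people in the database');
      empty.status = 204;
      return next(empty);
    }
    if (!doc) {                     // end of data: close the array
      return res.end(']');
    }
    if (count === 0) {              // step 3: headers only once
      res.status(200).set('Content-Type', 'application/json');
    }
    res.write((count++ === 0 ? '[' : ',') + JSON.stringify(doc));
  });
}

const res = fakeResponse();
streamAll(fakeGetAll([{ a: 1 }, { a: 2 }]), res, (err) => console.error(err));
console.log(res.statusCode, res.body);
// 200 [{"a":1},{"a":2}]
```

Passing an empty array to fakeGetAll exercises the 204 branch, and passing an Error as the second argument exercises the error branch.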

And we’re done! We now have non-blocking performance with only a few extra lines of code!