JavaScript, Performance

Improving performance through function caching in JavaScript

I was recently profiling a single page application using Chrome’s dev tools looking for areas of slowness. This particular app did lots of work with moment.js as it had some complex custom calendar logic. The profiling revealed that the application was spending a lot of time in moment.js, and that the calls to moment.js were coming from the same few functions.

After debugging the functions that were calling into moment.js, it became apparent that:

  • These functions were getting called a lot
  • They were frequently getting called with the same parameters

So with that in mind, we don’t really shouldn’t be asking moment.js (or any function) to do the same calculations over and over again – instead we should hold onto the results of the function calls, and store them in a cache. We can then hit our cache first before doing the calculation, which will be much cheaper than running the calculation again.

How

So, here is the function that we’re going to optimise by introducing some caching logic into.  All code in this post is written in ES5 style JavaScript and leans on underscore for some utility functions.


function calendarService() {
  function getCalendarSettings(month, year) {

    var calendar = getCurrentCalendar();

    // Just a call to underscore to do some filtering
    var year = _.findWhere(calendar.years, { year: year});

    var month = _.findWhere(year.months, { month: month});

    return month;
  }
}

The above function calls out to another function to get some calendar settings (which was itself fairly expensive) before doing some filtering on the returned object to return something useful.

Creating the cache

Firstly, we need to have a place to store our cache.  In our case, storing the results of the functions in memory was sufficient – so lets initialise an empty, service wide object to store our cached data:


function calendarService() {
  var cache = {};

  function getCalendarSettings() {

     ...

  }
}

 

Pushing items into the cache

When we add an item into the cache, we need a way of uniquely identifying it. This is called a cache key, and in our situation there will be two things that will uniquely identify an item in our cache:

  1. The name of the function that pushed the item into the cache
  2. The parameters that the function was called with

With the above in mind, let’s build a function that will generate some cache keys for us:


function getCacheKey(functionName, params) {
  var cacheKey = functionName;

  _.each(params, function(param) {

    cacheKey = cacheKey + param.toString() + '.';

  });

  return cacheKey;
}

 

The above function loops through each parameter passed in as part of the params array, and converts it to a string separated by a full stop. This will currently only work with parameters that are primitive types, but you could put your own logic into handle objects that are parameters.

So, if we were to call getCacheKey like this:


getCacheKey('getCalendarSettings', [0, 2017]);

It would return:


'getCalendarSettings.0.2017'

Which is a string, and will be used as a cache key as it uniquely identifies the function called and the parameters passed to it.

We now have our in memory cache object, and a function that will create us cache keys – so we next need to glue them together so that we can populate the cache and check the cache before running any functions. Let’s create a single function to have this job:


function getResult(functionName, params, functionToRun) {
  var cacheKey = getCacheKey(functionName, params);

  var result = cache[cacheKey];

  if(!_.isUndefined(cache[cacheKey]) {
    // Successful cache hit! Return what we've got
    return result;
  }

  result = functionToRun.apply(this, params);

  cache[cacheKey] = result;

  return result;
}

Our getResult function does the job of checking the cache, and only actually executing our function if nothing is found in the cache. If it has to execute our function, it stores the result in the cache.

It parameters are:

  • functionName – just a string which is the function name
  • params – an array of parameters that will be used to build the cache key, as well as being passed to the function that may need to be run. The order of these parameters matters and should match the order in which the function that were trying to cache consumes them
  • functionToRun – this is the actual function that needs to be run,

Our getResult function is now in place. So let’s wire up getCalendarSettings with it:


function getCalendarSettings(month, year) {
  return getResult('getCalendarSettings', [month, year], runGetCalendarSettings);

  function runGetCalendarSettings(month, year) {
    var calendar = getCurrentCalendar();

    // Just a call to underscore to do some filtering
    var year = _.findWhere(calendar.years, { year: year});

    var month = _.findWhere(year.months, { month: month});

    return month;
   }
 }

 

We’ve now updated getCalendarSettings to call getResult and instead return the result of that function. We’re also exploiting JavaScript’s variable hoisting to use the runGetCalendarSettings function before it has been declared. Our function is now fully wired up with our in memory cache, and we’ll save unnecessary computation that has already been previously completed.

Further improvements

This code could be improved upon by:

  • Only storing copies of results in the cache. If a function returns an object and that gets stored in the cache, we can risk mutating the object as we we’re storing a reference to it. This can be done using underscore’s clone function.
  • Having the code evaluate what the calling function’s name is. This would get rid of the need for the functionName parameter.
  • Storing the cache elsewhere. As it’s being held in memory, it’ll get lost on the client as soon as the site is unloaded. The only real option for this is to use local storage, but even then I’d only recommend writing and reading from local storage when the application is loaded and unloaded. If this code is being used on the server, there are a lot more options for storing the cache.

Full code listing:

 


function calendarService() {
  var cache = {};

  function getCalendarSettings(month, year) {
    return getResult('getCalendarSettings', [month, year], runGetCalendarSettings);

    function runGetCalendarSettings(month, year) {
      var calendar = getCurrentCalendar();

      // Just a call to underscore to do some filtering
      var year = _.findWhere(calendar.years, { year: year});

      var month = _.findWhere(year.months, { month: month});

      return month;
    }
  }

  function getResult(functionName, params, functionToRun) {
    var cacheKey = getCacheKey(functionName, params);

    var result = cache[cacheKey];

    if(!_.isUndefined(cache[cacheKey]) {
      // Successful cache hit! Return what we've got
      return result;
    }

    result = functionToRun.apply(this, params);

    cache[cacheKey] = result;

    return result;
  }

  function getCacheKey(functionName, params) {
    var cacheKey = functionName;
 
    _.each(params, function(param) {
      cacheKey = cacheKey + param.toString() + '.';
    });
 
    return cacheKey;
  }
}

Node.Js, Performance

Running Node.js in production using all of the cores

Did you know that the JavaScript environment is single threaded? This means that in the Node.js world, the main event loop runs on a single thread. IO operations are pushed out to their own thread, but if you are doing a CPU intensive operation on the main thread (try not to do this), you can get into problems where your server stops responding to requests.

The proper way to get around this is by programming in such a way that is considerate as to how the Node.js / Javascript runtime works.

However, you should also be running your Node.js application in production with this in mind to get the absolute best performance that you can. To get around this limitation, you need to do some form of clustering.

Production Clustering

You can run your Node.js app in production by using the “node” command, but I wouldn’t recommend it. There are several Node.js wrappers out there that will manage the running of your Node.js app in a much nicer, production grade manner.

One such wrapper is PM2.

In it’s most basic form, PM2 (which stands for Process Monitor) will keep an eye on your Node.js app for any crashes, and will attempt to restart the process should it crash. Crucially, it can also be configured to run your Node.js app in a clustered mode, which will enable us to take advantage of all of the cores that are available to us.

PM2 can be installed globally via npm:


npm install pm2 -g

Helpfully, we don’t need to change any of our application code in order to have PM2 cluster it.

How many Workers?

PM2 has an optional argument – i, which is the “number of workers” argument. You can use this argument to instruct PM2 to run your Node.js app in an explicit number of workers:


pm2 start app.js -i 4

Where 4 is the number of workers that we want to launch our app in.

However, I prefer to set the “number of workers” argument to 0, which tells PM2 to run your app in as many workers as there are CPU cores:


pm2 start app.js -i 0

Et voilla!

Links