React

Adding React to an existing web application and build pipeline

In order to brush up on my React knowledge, in this post I’m going to outline how I added it into the admin tools of links.edspencer.me.uk.

This walkthrough places an emphasis on more than just building a hello world application – it also focuses on integrating everything into your existing build pipeline, so you’ll actually be able to run your real application in a real environment that isn’t your localhost.

I’m writing this up because I felt the setup of React was a bit confusing. You don’t just add a reference to a minified file and off you go – the React setup is a lot more sophisticated, and is so broken down and modular that you can swap out just about any tool for another. You need a guide just to get started.

We’ll be going along the path of least resistance by using the toolchain that most React developers seem to be using – Webpack for building, and Babel for transpiling.

The current application

The current application is built on purely server side technology – there is no front end framework currently being used. The server runs using:

  • NodeJs
  • ExpressJs
  • The Pug view engine

The build pipeline consists of the following:

  • A docker image built on top of the node 6.9.4 image, with some additionally installed global packages
  • An npm install
  • A bower install for the front end packages
  • A gulp build task (to compile the SCSS, and minify and combine the JS and CSS)

Step 1 – moving to Yarn

You can get your React dependencies through npm, but I figured that moving to Yarn for my dependency management was a good move. Aside from it being suggested in most tutorials (which isn’t a good enough reason alone to move):

  1. Bower is not deprecated, but the Bower devs recommend moving to Yarn
  2. The current LTS version of Node (8.9.4) ships with npm 5, which has some checksum problems that might cause your builds to fail
  3. Yarn is included in the Node 8.9.4 docker image

Hello node 8.9.4

While we’re here we may as well update the version of node that the app runs under to 8.9.4, which is the current LTS version.

In dockerized applications, this is as easy as changing a single line of code in your docker image:

FROM node:6.9.4

Becomes:

FROM node:8.9.4

Goodbye Bower

Removing Bower was easy enough. I just went through the bower.json file, and ran yarn add for each item in there. This added the dependencies into the package.json file.

The next step was then to update my gulp build tasks to pull the front end dependencies out of the node_modules folder instead of the bower_components folder.
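
For example, a gulp task that previously globbed a file out of bower_components just needs its path switching over – a sketch, with jquery standing in for whatever your task actually pulls in:

// Before: front end dependencies came from bower_components
gulp.src('bower_components/jquery/dist/jquery.js')

// After: the same file now lives in node_modules
gulp.src('node_modules/jquery/dist/jquery.js')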

I then updated my build pipeline to not run bower install or npm install, and to instead run yarn install, and deleted the bower.json file. In my case, this was just some minor tweaks to my dockerfile.

Build pipeline updates

The next thing to do was to remove any calls to npm install from the build, and to instead call yarn install.

Yarn follows npm’s package file name and folder install conventions, so this was a very smooth change.
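
In a dockerized build like this one, that can be as small as swapping a single instruction (a sketch – the real dockerfile has a few more steps):

RUN npm install && bower install

Becomes:

RUN yarn install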

Step 2 – Bringing in Webpack

You have to use a bundler to enable you to write modular React code. Most people tend to use Webpack for this, so that’s the way we’re going to go. You can actually use Webpack to do all of your front end building (bundling, minifying, etc), but I’m not going to do this.

I have a working set of gulp jobs to do my front end building, so we’re going to integrate Webpack and give it one job only – bundling the modules.

Firstly, let’s add webpack as a dependency:

yarn add webpack --dev

Now we need to add an empty configuration file for Webpack:

touch webpack.config.js

We’ll fill out this file later.

Lastly, let’s add a yarn task to actually run webpack, which we’ll need to run when we make any React changes. Add this to the scripts section of your package.json file:

{
  "scripts": {
    "build-react": "webpack"
  }
}

You may be thinking that we could just run the webpack command directly, but that will push you down the path of global installations of packages. Yarn steers you away from doing this, so by having the script in your package.json, you know that your script is running within the context of your packages available in your node_modules folder.

Step 3 – Bringing in Babel

Babel is a JavaScript transpiler that will let us write some ES6 goodness without worrying too much about browser support. Babel will dumb our JavaScript code down into a more browser-ready ES5 flavour.

Here’s the thing – every tutorial I’ve seen involves installing these four Babel packages as a minimum. This must be because Babel has been broken down into many smaller packages, and I do wonder if that’s a bit excessive:

yarn add babel-core babel-loader babel-preset-es2015 babel-preset-react

Once the above Babel packages have been installed, we need to wire them up with webpack, so that webpack knows to run our React-specific JavaScript and JSX files through Babel.

Update your webpack.config.js to include the following:

module.exports = {
  module: {
    loaders: [
      { test: /\.js$/, loader: 'babel-loader', exclude: /node_modules/ },
      { test: /\.jsx$/, loader: 'babel-loader', exclude: /node_modules/ }
    ]
  }
}

Note that the webpack.config.js file is not yet complete – we’ll be adding more to this shortly.
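
One gotcha to be aware of: installing the presets isn’t enough on its own – Babel also needs telling to use them. With the Babel 6 packages installed above, a minimal .babelrc in the project root does the job:

{
  "presets": ["es2015", "react"]
}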

Step 4 – Bringing in React

We’re nearly ready to write some React. We just need to add a few more packages first, and that’ll be it, I promise.

yarn add react react-dom

This will add the core of React, plus react-dom, which will allow us to render our components into the DOM for our first bit of React coding. This does feel a bit daft to me. Webpack is sophisticated enough to run through our modules and output exactly what is needed into a single JS file. Why, therefore, do we need to break the install down into so many packages, if Webpack is going to grab what we need?

Step 5 – Actually writing some React code

We’re now at the point where we can write some React. In my application, I’m going to have a small SPA (small is how SPAs should be – if you need to go big, build a hybrid) that will be the admin tool of my web application.

So, in the root of our web app, let’s add a folder named client, and in this folder, let’s add the following two files:

client/admin.js

import React from 'react';
import ReactDOM from 'react-dom';
import App from './admin-app.jsx';

ReactDOM.render(<App />, document.getElementById('react-app'));

client/admin-app.jsx

import React from 'react';

export default class App extends React.Component {
  render() {
    return (
      <div style={{textAlign: 'center'}}>
        <h1>Hello World - this is rendered by react</h1>
      </div>
    );
  }
}

The above two files aren’t doing a lot. The JSX file declares our “App” component, and the JS file tells ReactDOM to render our app into an HTML element with the id “react-app”.

Updating our webpack config

Now we should complete our webpack config to reflect the location of our React files. Update webpack.config.js so that the entire file looks like this:

const path = require('path');

module.exports = {
  entry: './client/admin.js',
  output: {
    path: path.resolve('dist'),
    filename: 'admin_bundle.js'
  },
  module: {
    loaders: [
      { test: /\.js$/, loader: 'babel-loader', exclude: /node_modules/ },
      { test: /\.jsx$/, loader: 'babel-loader', exclude: /node_modules/ }
    ]
  }
}

We’re telling webpack where the entry point of our React application is, and where to output the bundled code (dist/admin_bundle.js).

Note we’re also leaning on the path module to help us with some directory walking. path is built into Node, so there’s nothing extra to install here.

Now, let’s go ahead and ask webpack to bundle our React app:


yarn run build-react

Now, if everything has worked as expected, webpack will have generated a bundle for us in dist/admin_bundle.js. Go ahead and take a peek at this file – it contains our code, as well as all of the library code from React that is needed to actually run our application.

Step 6 – Plugging the bundled react code into our application

As this SPA is only going to run in the admin section, we need to add two things to the admin page that we want it to run on.

In my case, I’m using the awesome pug view engine to render my server side HTML. We need to add a reference to our bundled JavaScript, and a div with the id “react-app”, which is what we coded our React app to look for.

Here’s a snippet of the updated admin pug view:

div.container
  div.row
    div.col-xs-12
      h1 Welcome to the admin site. This is rendered by node
  #react-app

script(type="text/javascript", src="/dist/admin_bundle.js")

Alternatively, here’s how it looks in plain old html:

<div class="container">
  <div class="row">
    <div class="col-xs-12">
        <h1>Welcome to the admin site. This is rendered by node</h1>
    </div>
  </div>
  <div id="react-app"></div>
</div>

<script type="text/javascript" src="/dist/admin_bundle.js"></script>

Now all we need to do is to run our application and see if it works:

Et voilà!

Step 7 – A little more build cleanup

As previously stated, I’ve got an existing set of gulp tasks for building my application that I know work, and that I’d rather not rewrite.

We’ve kept our webpack build separate, so let’s now tie it in with our existing gulp tasks.

Previously, I would run a gulp task that I had written, which would bundle and minify all of the JS and SCSS:

gulp build

But now we also need to run webpack. So, let’s update our package.json file to include some more custom scripts:

"scripts": {
  "build-react": "webpack",
  "build": "gulp build && webpack"
}

Now, we can simply run:

yarn run build

Now we just need to update our build process to run the above yarn command instead of gulp build. In my case, this was a simple update to a dockerfile.
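
For illustration, the relevant dockerfile instruction (a sketch – your build steps will differ):

RUN gulp build

Becomes:

RUN yarn run build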

Conclusion

The above setup has got us to a place where we can actually start learning and writing some React code.

This setup is a lot more involved than with other frameworks I’ve worked with. This is because React is a lot less opinionated about how you do things, and about the tooling that you use. The disadvantage is that it makes the setup a bit trickier, but the advantage is that you have more freedom elsewhere in the stack.

I do however think that there may be a bit too much freedom, and a little more convention and opinion wouldn’t hurt – some sensible install defaults that could be changed later would cover this off. Whilst there may be some yeoman generators out there, they won’t help with integration into existing web applications.

What is interesting is noting the difference between how you build React applications and how you build AngularJs applications. With AngularJs, you reference one lib and off you go. You may only use a small part of AngularJs, but you’ve referenced the whole thing, whereas with React (and newer versions of Angular), you have a build process which outputs a file containing exactly what you need.

I think this overall is a good thing, but we should still put this into context and be careful to not quibble over kilobytes.

We also saw how we can integrate the React toolchain into our existing toolchain without throwing everything out.

Next up, I’ll be writing a blog post on what you should do in order to run React in production.

DevOps

Lessons learned from a server outage

At just gone midnight this morning, when I was nicely tucked up in bed, Uptime Robot sent me an email telling me that this very website was not reachable.

I was fast asleep, and my phone was on do not disturb, so that situation stayed as it was until I woke up.

After waking up and seeing the message, I assumed that the docker container running the UI bit of this website had stopped running. Easy, I thought – I’ll just start the container up again, or trigger another deployment and let my awesome GitLab CI pipeline do its thing.

Except this wasn’t the case. After realising, in a mild panic, that I could not even SSH onto the server that hosts this site, I got into contact with the hosting company (OneProvider) for some support.

I sent off a support message and sat back in my chair and reflected for a minute. Had I been a fool for rejecting the cloud? If this website was being hosted in a cloud service somewhere, would this have happened? Maybe I was stupid to dare to run my own server.

But as I gathered my thoughts, I calmed down and told myself I was being unreasonable. Over the last few years, I’ve seen cloud based solutions in various formats fail too. One of the worst I experienced was with Amazon Redshift, when Amazon changed an obscure setup requirement, meaning that you essentially needed some other Amazon service in order to be able to use Redshift. I’ve also been using a paid BitBucket service with a client for a while, which has suffered from around 5 outages of varying severity in the last 12 months. In a weird coincidence, one of the worst outages was today. In comparison, my self hosted GitLab instance has never gone down outside of running updates in the year and a half that I’ve had it.

Cloud based or on my own server, if an application went down I would still go through the same support process:

  • Do some investigation and try to fix
  • Send off a support request if I determined the problem was to do with the service or hosting provider

Check your SLA

An SLA, or Service Level Agreement, basically outlines a service provider’s responsibilities to you. OneProvider’s SLA states that they will aim to resolve network issues within an hour, and hardware issues within two hours, for their Paris data centre. Incidentally, other data centres don’t have any agreed times because of their locations – like the Cairo data centre. If they miss these SLAs, they have self imposed compensation penalties.

Two hours had long gone, so whatever the problem was, I was going to get some money back.

Getting back online

I have two main services running off of this box:

  • This website
  • My link archive (links.edspencer.me.uk)

I could live with the link archive going offline for a few hours, but I really didn’t want my website going down. It has content that I’ve been putting on here for years, and believe it or not, it makes me a little beer money each month though some carefully selected and relevant affiliate links.

Here’s where docker helped. I got the site back online pretty quickly, simply by starting the containers up on one of my other servers. Luckily, the nginx instance on that server was already configured to handle any requests for edspencer.me.uk, so once the containers were started, I just needed to update the A record to point at the other server (short TTL FTW).

Within about 20 minutes, I got another email from Uptime Robot telling me that my website was back online, and that it had been down for 9 hours (yikes).

Check your backups

I use a WordPress plugin (updraft) to automate backups of this website. It works great, but the only problem is that I had set this up to take a backup weekly. Frustratingly, the last backup cycle had happened before I had penned two of my most recent posts.

I started to panic again a little. What if the hard drive in my server had died and I had lost that data forever? I was a fool for not reviewing my backup policy more. What if I lost all of the data in my link archive? I was angry at myself.

Back from the dead

At about 2pm, I got an email response from OneProvider. This was about 4 hours after I created the support ticket.

The email stated that a router had died in the Paris data centre that this server lives in, and that they were replacing it – it would be done in 30 minutes.

This raises some questions about OneProvider’s ability to provide hosting.

  • Was the problem highlighted by their monitoring, or did they need me to point it out?
  • Why did it take nearly 14 hours from the problem occurring to it being resolved?

I’ll be keeping my eyes on their service for the next two months.

Sure enough, the server was back and available, so I switched the DNS records back over to point to this server.

Broken backups

Now it was time to sort out all of those backups. I logged into all of my servers and did a quick audit of my backups. It turns out there were numerous problems.

Firstly, my GitLab instance was being backed up, but the backup files were not being shipped off of the server. Casting my memory back, I can blame this on my Raspberry Pi – previously responsible for pulling backups into my home network – which corrupted itself a few months ago. I’ve now set up Rsync to pull the backups onto my RAIDed NAS.

Secondly, as previously mentioned, my WordPress backups were happening too infrequently. This has now been changed to a daily backup, and Updraft is shunting the backup files into Dropbox.

Thirdly – my link archive backups. These weren’t even running! The database is now backed up daily in a cron job, using Postgres’ awesome pg_dump utility. These backups are then also pulled down to my NAS using Rsync.
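
For reference, the cron job looks something like this – a sketch, with an illustrative database name and backup path (note that % has to be escaped in crontabs):

# Dump the links database at 2am every day
0 2 * * * pg_dump -U links links > /backups/links-$(date +\%F).sql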

It didn’t take long to audit and fix the backups – it’s something I should have done a while ago.

Lessons Learned

  • Build some resilience in. Deploy your applications to multiple servers as part of your CI pipeline. You could even have a load balancer do the legwork of directing your traffic to servers that it knows are responding correctly.
  • Those servers should be in different data centres, possibly with different hosting providers.
  • Make sure your backups are working. You should check these regularly or even setup an alerting system for when they fail.
  • Make sure your backups are sensible. Should they be running more frequently?
  • Pull your backups onto different servers so that you can get at them if the server goes offline.

JavaScript, Performance

Improving performance through function caching in JavaScript

I was recently profiling a single page application using Chrome’s dev tools looking for areas of slowness. This particular app did lots of work with moment.js as it had some complex custom calendar logic. The profiling revealed that the application was spending a lot of time in moment.js, and that the calls to moment.js were coming from the same few functions.

After debugging the functions that were calling into moment.js, it became apparent that:

  • These functions were getting called a lot
  • They were frequently getting called with the same parameters

So with that in mind, we really shouldn’t be asking moment.js (or any function) to do the same calculations over and over again – instead, we should hold onto the results of those function calls and store them in a cache. We can then hit our cache before doing the calculation, which will be much cheaper than running the calculation again.

How

So, here is the function that we’re going to optimise by introducing some caching logic. All code in this post is written in ES5-style JavaScript and leans on underscore for some utility functions.


function calendarService() {
  function getCalendarSettings(month, year) {

    var calendar = getCurrentCalendar();

    // Just a call to underscore to do some filtering
    var matchedYear = _.findWhere(calendar.years, { year: year });

    var matchedMonth = _.findWhere(matchedYear.months, { month: month });

    return matchedMonth;
  }
}

The above function calls out to another function to get some calendar settings (which was itself fairly expensive) before doing some filtering on the returned object to return something useful.

Creating the cache

Firstly, we need to have a place to store our cache. In our case, storing the results of the functions in memory was sufficient – so let’s initialise an empty, service-wide object to store our cached data:


function calendarService() {
  var cache = {};

  function getCalendarSettings() {

     ...

  }
}

Pushing items into the cache

When we add an item into the cache, we need a way of uniquely identifying it. This is called a cache key, and in our situation there will be two things that will uniquely identify an item in our cache:

  1. The name of the function that pushed the item into the cache
  2. The parameters that the function was called with

With the above in mind, let’s build a function that will generate some cache keys for us:


function getCacheKey(functionName, params) {
  var cacheKey = functionName;

  _.each(params, function(param) {

    cacheKey = cacheKey + '.' + param.toString();

  });

  return cacheKey;
}

The above function loops through each parameter passed in as part of the params array, converting each one to a string and appending it to the key, separated by full stops. This will currently only work with parameters that are primitive types, but you could add your own logic to handle parameters that are objects.

So, if we were to call getCacheKey like this:


getCacheKey('getCalendarSettings', [0, 2017]);

It would return:


'getCalendarSettings.0.2017'

Which is a string, and will be used as a cache key as it uniquely identifies the function called and the parameters passed to it.

We now have our in-memory cache object, and a function that will create cache keys for us – so we next need to glue them together, so that we can populate the cache and check it before running any functions. Let’s create a single function to do this job:


function getResult(functionName, params, functionToRun) {
  var cacheKey = getCacheKey(functionName, params);

  var result = cache[cacheKey];

  if (!_.isUndefined(result)) {
    // Successful cache hit! Return what we've got
    return result;
  }

  result = functionToRun.apply(this, params);

  cache[cacheKey] = result;

  return result;
}

Our getResult function does the job of checking the cache, and only actually executes our function if nothing is found there. If it has to execute our function, it stores the result in the cache.

Its parameters are:

  • functionName – just a string containing the name of the function
  • params – an array of parameters that will be used to build the cache key, as well as being passed to the function that may need to be run. The order of these parameters matters, and should match the order in which the function we’re trying to cache consumes them
  • functionToRun – the actual function that needs to be run

Our getResult function is now in place. So let’s wire up getCalendarSettings with it:


function getCalendarSettings(month, year) {
  return getResult('getCalendarSettings', [month, year], runGetCalendarSettings);

  function runGetCalendarSettings(month, year) {
    var calendar = getCurrentCalendar();

    // Just a call to underscore to do some filtering
    var matchedYear = _.findWhere(calendar.years, { year: year });

    var matchedMonth = _.findWhere(matchedYear.months, { month: month });

    return matchedMonth;
  }
}

We’ve now updated getCalendarSettings to call getResult and return the result of that function instead. We’re also exploiting JavaScript’s function hoisting to use the runGetCalendarSettings function before it has been declared. Our function is now fully wired up with our in-memory cache, and we’ll save on unnecessary computation that has already been completed previously.
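
As an aside, underscore ships with a _.memoize function that covers simpler cases like this one – it takes an optional hash function for building the cache key from the arguments. Here’s a sketch of getCalendarSettings rewritten to lean on it:

var getCalendarSettings = _.memoize(function(month, year) {
  var calendar = getCurrentCalendar();
  var matchedYear = _.findWhere(calendar.years, { year: year });
  return _.findWhere(matchedYear.months, { month: month });
}, function(month, year) {
  // Mirror our cache key format - one entry per distinct month/year pair
  return 'getCalendarSettings.' + month + '.' + year;
});

One reason to stick with the hand-rolled version is that it gives us a single cache object shared across many functions, rather than one per memoized function.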

Further improvements

This code could be improved upon by:

  • Only storing copies of results in the cache. If a function returns an object and that object gets stored in the cache, we risk mutating it, as we’re storing a reference to it. This can be done using underscore’s clone function (see the sketch after this list).
  • Having the code evaluate what the calling function’s name is. This would get rid of the need for the functionName parameter.
  • Storing the cache elsewhere. As it’s being held in memory, it’ll get lost on the client as soon as the site is unloaded. The only real option for this is to use local storage, but even then I’d only recommend writing and reading from local storage when the application is loaded and unloaded. If this code is being used on the server, there are a lot more options for storing the cache.
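
Here’s a minimal sketch of that first improvement, swapping _.clone into getResult. Note that underscore’s clone is shallow, so nested objects would still be shared:

function getResult(functionName, params, functionToRun) {
  var cacheKey = getCacheKey(functionName, params);

  var result = cache[cacheKey];

  if (!_.isUndefined(result)) {
    // Return a copy, so callers can't mutate what's held in the cache
    return _.clone(result);
  }

  result = functionToRun.apply(this, params);

  // Store a copy too, in case the caller later mutates the result it was given
  cache[cacheKey] = _.clone(result);

  return result;
}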

Full code listing:

function calendarService() {
  var cache = {};

  function getCalendarSettings(month, year) {
    return getResult('getCalendarSettings', [month, year], runGetCalendarSettings);

    function runGetCalendarSettings(month, year) {
      var calendar = getCurrentCalendar();

      // Just a call to underscore to do some filtering
      var matchedYear = _.findWhere(calendar.years, { year: year });

      var matchedMonth = _.findWhere(matchedYear.months, { month: month });

      return matchedMonth;
    }
  }

  function getResult(functionName, params, functionToRun) {
    var cacheKey = getCacheKey(functionName, params);

    var result = cache[cacheKey];

    if (!_.isUndefined(result)) {
      // Successful cache hit! Return what we've got
      return result;
    }

    result = functionToRun.apply(this, params);

    cache[cacheKey] = result;

    return result;
  }

  function getCacheKey(functionName, params) {
    var cacheKey = functionName;

    _.each(params, function(param) {
      cacheKey = cacheKey + '.' + param.toString();
    });

    return cacheKey;
  }
}

Node.Js, Performance

Running Node.js in production using all of the cores

Did you know that the JavaScript environment is single threaded? This means that in the Node.js world, the main event loop runs on a single thread. IO operations are pushed out to their own threads, but if you are doing a CPU-intensive operation on the main thread (try not to do this), you can run into problems where your server stops responding to requests.

The proper way to get around this is to program in a way that is considerate of how the Node.js / JavaScript runtime works.

However, you should also be running your Node.js application in production with this in mind to get the absolute best performance that you can. To get around this limitation, you need to do some form of clustering.

Production Clustering

You can run your Node.js app in production by using the “node” command, but I wouldn’t recommend it. There are several Node.js wrappers out there that will manage the running of your Node.js app in a much nicer, production grade manner.

One such wrapper is PM2.

In its most basic form, PM2 (which stands for Process Manager 2) will keep an eye on your Node.js app, and will attempt to restart the process should it crash. Crucially, it can also be configured to run your Node.js app in a clustered mode, which will enable us to take advantage of all of the cores that are available to us.

PM2 can be installed globally via npm:


npm install pm2 -g

Helpfully, we don’t need to change any of our application code in order to have PM2 cluster it.

How many Workers?

PM2 has an optional argument, -i, which is the “number of workers” argument. You can use it to instruct PM2 to run your Node.js app in an explicit number of workers:


pm2 start app.js -i 4

Where 4 is the number of workers that we want to launch our app in.

However, I prefer to set the “number of workers” argument to 0, which tells PM2 to run your app in as many workers as there are CPU cores:


pm2 start app.js -i 0

Et voilà!
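
If you’d rather keep these settings in source control than remember command line flags, PM2 can also read them from a process file. Here’s a minimal ecosystem.config.js sketch – the app name and script path are illustrative:

module.exports = {
  apps: [{
    name: 'app',
    script: './app.js',
    // 0 workers = one per CPU core; cluster mode does the load balancing
    instances: 0,
    exec_mode: 'cluster'
  }]
};

You’d then start it with pm2 start ecosystem.config.js.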
