Lessons learned from a server outage

At just gone midnight this morning when I was nicely tucked up in bed, Uptime Robot sent me an email telling me that this very website, was not reachable.

I was fast asleep, and my phone was on do not disturb, so that situation stayed as it was until I woke up.

After waking up and seeing the message, I assumed that the docker container running the UI bit of this website had stopped running. Easy, I thought, I’ll just start up the container again, or I’ll just trigger another deployment and let my awesome GitLab CI pipeline do it’s thing.

Except this wasn’t the case. After realising, in a mild panic, that I could not even SSH onto the server that hosts this site, I got into contact with the hosting company (OneProvider) for some support.

I sent off a support message and sat back in my chair and reflected for a minute. Had I been a fool for rejecting the cloud? If this website was being hosted in a cloud service somewhere, would this have happened? Maybe I was stupid to dare to run my own server.

But as I gathered my thoughts, I calmed down and told myself I was being unreasonable. Over the last few years, I’ve seen cloud based solutions in various formats, also fail. One of the worst I experienced was with Amazon Redshift, when Amazon changed an obscure setup requirement meaning that you essentially needed some other Amazon service in order to be able to use Redshift. I’ve also been using a paid BitBucket service for a while with a client, which has suffered from around 5 outages of varying severity in the last 12 months. In a weird co-incidence, one of the worst outages was today. In comparison, my self hosted GitlLab instance has never gone down outside of running updates in the year and a half that I’ve had it.

Cloud based or on my own server, if an application went down I would still go through the same support process:

  • Do some investigation and try to fix
  • Send off a support request if I determined the problem was to do with the service or hosting provider

Check your SLA

An SLA or a Service Level Agreement basically outlines a service provider’s responsibilities for you. OneProvider’s SLA states that they will aim to resolve network issues within an hour, and hardware issues within two hours for their Paris data centre. Incidentally, other data centres don’t have any agreed time because of their location – like the Cairo datacenter. If they miss these SLAs, they have self imposed compensation penalties.

Two hours had long gone, so whatever the problem was, I was going to get some money back.

Getting back online

I have two main services running off of this box:

I could live with the link archive going offline for a few hours, but I really didn’t want my website going down. It has content that I’ve been putting on here for years, and believe it or not, it makes me a little beer money each month though some carefully selected and relevant affiliate links.

Here’s where docker helped. I got the site back online pretty quickly simply by starting the containers up on one of my other servers. Luckily the nginx instance on that webserver was already configured to handle any requests for, so once the containers were started, I just needed to update the A record to point at the other server (short TTL FTW).

Within about 20 minutes, I got another email from Uptime Robot telling me that my website was back online, and that it had been down for 9 hours (yikes).

Check your backups

I use a WordPress plugin (updraft) to automate backups of this website. It works great, but the only problem is that I had set this up to take a backup weekly. Frustratingly, the last backup cycle had happened before I had penned two of my most recent posts.

I started to panic again a little. What if the hard drive in my server had died and I had lost that data forever? I was a fool for not reviewing my backup policy more. What if I lost all of the data in my link archive? I was angry at myself.

Back from the dead

At about 2pm, I got an email response from OneProvider. This was about 4 hours after I created the support ticket.

The email stated that a router in the Paris data centre that this server lives in, had died, and that they were replacing it and it would be done in 30 minutes.

This raises some questions about OneProvider’s ability to provide hosting.

  • Was the problem  highlighted by their monitoring, or did they need me to point it out?
  • Why did it take nearly 14 hours from the problem occurring to it being resolved?

I’ll be keeping my eyes on their service for the next two months.

Sure enough, the server was back and available, so I switched the DNS records back over to to point to this server.

Broken backups

Now it was time to sort out all of those backups. I logged into all of my servers and did a quick audit of my backups. It turns out there were numerous problems.

Firstly, my GitLab instance was being backup up, but the backup files were not being shipped off of the server. Casting my memory back, I can blame this on my Raspberry Pi that corrupted itself a few months ago that was previously responsible for pulling backups into my home network. I’ve now setup Rsync to pull the backups onto my RAIDed NAS.

Secondly, as previously mentioned, my WordPress backups were happening too infrequently. This has now been changed to a daily backup, and Updraft is shunting the backup files into Dropbox.

Thirdly – my link archive backups. These weren’t even running! The database is now backed up daily using Postgres’ awesome pg_dump feature in a cron job. These backups are then also pulled using Rsync down to my NAS.

It didn’t take long to audit and fix the backups – it’s something I should have done a while ago.

Lessons Learned

  • Build some resilience in. Deploy your applications to multiple servers as part of your CI pipeline. You could even have a load balancer do the legwork of directing your traffic to servers that it knows are responding correctly.
  • Those servers should be in different data centres, possibly with different hosting providers.
  • Make sure your backups are working. You should check these regularly or even setup an alerting system for when they fail.
  • Make sure your backups are sensible. Should they be running more frequently?
  • Pull your backups onto different servers so that you can get at them if the server goes offline
JavaScript, Performance

Improving performance through function caching in JavaScript

I was recently profiling a single page application using Chrome’s dev tools looking for areas of slowness. This particular app did lots of work with moment.js as it had some complex custom calendar logic. The profiling revealed that the application was spending a lot of time in moment.js, and that the calls to moment.js were coming from the same few functions.

After debugging the functions that were calling into moment.js, it became apparent that:

  • These functions were getting called a lot
  • They were frequently getting called with the same parameters

So with that in mind, we don’t really shouldn’t be asking moment.js (or any function) to do the same calculations over and over again – instead we should hold onto the results of the function calls, and store them in a cache. We can then hit our cache first before doing the calculation, which will be much cheaper than running the calculation again.


So, here is the function that we’re going to optimise by introducing some caching logic into.  All code in this post is written in ES5 style JavaScript and leans on underscore for some utility functions.

function calendarService() {
  function getCalendarSettings(month, year) {

    var calendar = getCurrentCalendar();

    // Just a call to underscore to do some filtering
    var year = _.findWhere(calendar.years, { year: year});

    var month = _.findWhere(year.months, { month: month});

    return month;

The above function calls out to another function to get some calendar settings (which was itself fairly expensive) before doing some filtering on the returned object to return something useful.

Creating the cache

Firstly, we need to have a place to store our cache.  In our case, storing the results of the functions in memory was sufficient – so lets initialise an empty, service wide object to store our cached data:

function calendarService() {
  var cache = {};

  function getCalendarSettings() {




Pushing items into the cache

When we add an item into the cache, we need a way of uniquely identifying it. This is called a cache key, and in our situation there will be two things that will uniquely identify an item in our cache:

  1. The name of the function that pushed the item into the cache
  2. The parameters that the function was called with

With the above in mind, let’s build a function that will generate some cache keys for us:

function getCacheKey(functionName, params) {
  var cacheKey = functionName;

  _.each(params, function(param) {

    cacheKey = cacheKey + param.toString() + '.';


  return cacheKey;


The above function loops through each parameter passed in as part of the params array, and converts it to a string separated by a full stop. This will currently only work with parameters that are primitive types, but you could put your own logic into handle objects that are parameters.

So, if we were to call getCacheKey like this:

getCacheKey('getCalendarSettings', [0, 2017]);

It would return:


Which is a string, and will be used as a cache key as it uniquely identifies the function called and the parameters passed to it.

We now have our in memory cache object, and a function that will create us cache keys – so we next need to glue them together so that we can populate the cache and check the cache before running any functions. Let’s create a single function to have this job:

function getResult(functionName, params, functionToRun) {
  var cacheKey = getCacheKey(functionName, params);

  var result = cache[cacheKey];

  if(!_.isUndefined(cache[cacheKey]) {
    // Successful cache hit! Return what we've got
    return result;

  result = functionToRun.apply(this, params);

  cache[cacheKey] = result;

  return result;

Our getResult function does the job of checking the cache, and only actually executing our function if nothing is found in the cache. If it has to execute our function, it stores the result in the cache.

It parameters are:

  • functionName – just a string which is the function name
  • params – an array of parameters that will be used to build the cache key, as well as being passed to the function that may need to be run. The order of these parameters matters and should match the order in which the function that were trying to cache consumes them
  • functionToRun – this is the actual function that needs to be run,

Our getResult function is now in place. So let’s wire up getCalendarSettings with it:

function getCalendarSettings(month, year) {
  return getResult('getCalendarSettings', [month, year], runGetCalendarSettings);

  function runGetCalendarSettings(month, year) {
    var calendar = getCurrentCalendar();

    // Just a call to underscore to do some filtering
    var year = _.findWhere(calendar.years, { year: year});

    var month = _.findWhere(year.months, { month: month});

    return month;


We’ve now updated getCalendarSettings to call getResult and instead return the result of that function. We’re also exploiting JavaScript’s variable hoisting to use the runGetCalendarSettings function before it has been declared. Our function is now fully wired up with our in memory cache, and we’ll save unnecessary computation that has already been previously completed.

Further improvements

This code could be improved upon by:

  • Only storing copies of results in the cache. If a function returns an object and that gets stored in the cache, we can risk mutating the object as we we’re storing a reference to it. This can be done using underscore’s clone function.
  • Having the code evaluate what the calling function’s name is. This would get rid of the need for the functionName parameter.
  • Storing the cache elsewhere. As it’s being held in memory, it’ll get lost on the client as soon as the site is unloaded. The only real option for this is to use local storage, but even then I’d only recommend writing and reading from local storage when the application is loaded and unloaded. If this code is being used on the server, there are a lot more options for storing the cache.

Full code listing:


function calendarService() {
  var cache = {};

  function getCalendarSettings(month, year) {
    return getResult('getCalendarSettings', [month, year], runGetCalendarSettings);

    function runGetCalendarSettings(month, year) {
      var calendar = getCurrentCalendar();

      // Just a call to underscore to do some filtering
      var year = _.findWhere(calendar.years, { year: year});

      var month = _.findWhere(year.months, { month: month});

      return month;

  function getResult(functionName, params, functionToRun) {
    var cacheKey = getCacheKey(functionName, params);

    var result = cache[cacheKey];

    if(!_.isUndefined(cache[cacheKey]) {
      // Successful cache hit! Return what we've got
      return result;

    result = functionToRun.apply(this, params);

    cache[cacheKey] = result;

    return result;

  function getCacheKey(functionName, params) {
    var cacheKey = functionName;
    _.each(params, function(param) {
      cacheKey = cacheKey + param.toString() + '.';
    return cacheKey;

Node.Js, Performance

Running Node.js in production using all of the cores

Did you know that the JavaScript environment is single threaded? This means that in the Node.js world, the main event loop runs on a single thread. IO operations are pushed out to their own thread, but if you are doing a CPU intensive operation on the main thread (try not to do this), you can get into problems where your server stops responding to requests.

The proper way to get around this is by programming in such a way that is considerate as to how the Node.js / Javascript runtime works.

However, you should also be running your Node.js application in production with this in mind to get the absolute best performance that you can. To get around this limitation, you need to do some form of clustering.

Production Clustering

You can run your Node.js app in production by using the “node” command, but I wouldn’t recommend it. There are several Node.js wrappers out there that will manage the running of your Node.js app in a much nicer, production grade manner.

One such wrapper is PM2.

In it’s most basic form, PM2 (which stands for Process Monitor) will keep an eye on your Node.js app for any crashes, and will attempt to restart the process should it crash. Crucially, it can also be configured to run your Node.js app in a clustered mode, which will enable us to take advantage of all of the cores that are available to us.

PM2 can be installed globally via npm:

npm install pm2 -g

Helpfully, we don’t need to change any of our application code in order to have PM2 cluster it.

How many Workers?

PM2 has an optional argument – i, which is the “number of workers” argument. You can use this argument to instruct PM2 to run your Node.js app in an explicit number of workers:

pm2 start app.js -i 4

Where 4 is the number of workers that we want to launch our app in.

However, I prefer to set the “number of workers” argument to 0, which tells PM2 to run your app in as many workers as there are CPU cores:

pm2 start app.js -i 0

Et voilla!


Lets Encrypt

Migrating letsencrypt SSL certificates to another server

If you’re seeing this post, you’re viewing my website from it’s new home.

Moving the code and the data needed to run this website was made easy by docker and a WordPress plugin called UpdraftPlus. For additional testing, I just tweaked my local hosts file to simulate the DNS change.

I also needed to move my SSL certs. I could have dropped the site back into plain http and requested the certs again using the certbot, but I decided against this as it would be more of a hassle.

You need to move the contents of two directories and one file in order to keep your site running SSL and so that certbot on the new server is aware of how to renew your website’s SSL certs.

1: Copy the folder;


2: Copy the folder:


3: Copy the file:


4. Copy the folder:


LetsEncrypt community discussion thread:

Moving and merging certs from server A to server B

Technical, Wordpress

Running WordPress in production – Security and Speed

WordPress gets a fair share of bad press.

Most of this bad press is centred around security concerns. Many of these concerns are valid, but need not be a concern of yours if you are intending to run WordPress in production. You just need to a responsible webmaster. In this post I’ll list out some tips that will make your WordPress install robust and fast.

1. Keep your WordPress Install up to date

This is the most important security concern that you need to have. WordPress even makes updating your install super easy. You don’t need to log into any servers, you just need to login to the admin tool and head over to the dedicated updates page. From there you can simply press a button to get all of your pluggins and WordPress itself updated. Once your install is fully updated, you’ll see a nice clean page like this, telling you that there is nothing to update:

WordPress update screen
WordPress update screen telling me I’m fully updated

In the same way that you keep your laptop or PC up to date, you should be keeping your WordPress install up to date.

2. Install WordFence

WordFence is a popular security plugin that will offer you some protection against trending attacks. It’s a plugin that you should install, but it does not absolve you of all of your security responsibilities. You should still be regularly updating your server’s OS and any libraries installed on it.

3. Use Askimet

Askimet is a very popular WordPress plugin that is essential if you allow commenting on your website through WordPress.  Askiment will block shedloads of spam posts to your site, and you won’t even need to look t them. I don’t trust 3rd party services like disqus, so this was an essential plugin for me.

4. Back your shit up

WordPress has a few moving parts. Some of those parts are held in files, others in a MySQL database. You could periodically back these two up manually, or you could take advantage of one of the great WordPress plugins that you can use to automate your backups and make it super easy. An excellent one is Updraft Plus. This plugin can be set to regularly backup your entire WordPress site and can even store the backups in a cloud file service like Dropbox.

5. Install a caching plugin

A cache plugin will improve the load speed of your site. It will save database calls, and will instead pull data directly from memory. A popular plugin is WP Super Cache. And remember, a quick load time can mean that search engines rank you higher, and your visitors will love you.

6. Install an image compressing plugin

Again, this will give you a speed advantage and will save on your bandwidth use. A popular plugin is WP Smush. This plugin can be set to batch compress all images in your site, and can be used to compress images as and when they are added to your site.

7. Minify your JavaScript and CSS

Depending on how you’ve built your WordPress site will affect how you do this. If you have customised your WordPress templates or made your own theme, you should introduce a step in your build process to bundle and minify your JS and CSS.

If you are just using a 3rd party theme that you haven’t customised a lot, you should grab a plugin to bundle and minify your JavaScript and CSS. A plugin that I’ve had some success with is Better WordPress Minify. You may have to tweak it’s settings slightly to make sure it doesn’t break any of your other plugins that are rendered out on the UI (e.g. a source code highlighter plugin).

8. Use the latest version of jQuery

The standard install of WordPress doesn’t use the latest version of jQuery. Depending on the user’s that you’d like support, you may want to update to the latest version of jQuery.  You can do this in your build process, or you can do this with a plugin, like jQuery updater.


Broadband, Consumer

The current state of broadband and mobile data in the UK

I’ve always watched the Broadband and Mobile markets in the UK, largely from a consumer point of view. This has mainly to have been to get the fastest internet access at the lowest price.

Market Competition

Over the last decade or so, we saw a worrying trend in the Broadband market – we lost a lot of competition. This happened as a few bigger corporations entered the broadband market and consolidated their market share by buying up and closing smaller and often very good broadband operators.

Remember Bulldog internet? Well, they got eaten by TalkTalk. Remember BE internet? They got eaten by O2. Who then got eaten by Sky.

Bulldog and BE internet were once, very well regarded and popular internet providers. I’ll let you do your own research on what TalkTalk and Sky’s customers currently think of them.

Over the last year or so, this trend has reversed a bit, and we’ve had a few of the newer entrants trying to push themselves in, for example, EE and Vodafone.

Want a Broadband and Mobile combo? Get stuffed.

EE and Vodafone are mobile network operators and that is where they do the majority of their business. Both offer some fairly competitive broadband packages, but for some odd reason, choose not to bundle anything else in their broadband packages. So two massive companies that offer mobile phone and internet services, don’t offer any packages that link the two. Huh?

I cannot understand why they would not do this. Consumers would benefit from getting better deals, and EE and Vodafone would benefit by getting customers that were more embedded into their services. The broadband and mobile services offered by these businesses are essentially treated as two separate entities. When I couldn’t find any combined broadband and mobile deals online, I reached out to their online sales staff. Both EE’s and Vodafone’s sales responded with “You’re talking to broadband sales, I can’t help you with mobile sales”.

I eventually reached out to both companies on twitter – EE actually will throw 5 gigs of data onto your phone package, but that isn’t great for someone like me – and they don’t actually shout about that offer anywhere.

EE – you are missing a trick. Vodafone – you are missing a trick. Get some packages that link the two and train your staff on all consumer products. Don’t treat your broadband and mobile offer as two totally different things. As a potential customer, don’t bounce me between departments if I want to talk about buying broadband and mobile.

In Europe, many people use the same provider for their TV, broadband, and family mobile packages. There is no reason why this sort of offer wouldn’t be as popular in the UK.

So what about mobile data?

So, we’ve now got a pretty good 4G network up and down the country – however unlimited mobile data packages have become rare and expensive.

I’m currently on an old three unlimited data package. It costs me £23 a month. If I wanted to take out that package now, it would cost me £30. When I took my package out – it was one of the most expensive. It’s now one of the cheapest.

Worryingly, three are now traffic shaping and chipping away at net neutrality by offering up packages that have data limits, but let you access some services in an unlimited fashion. They call it “Go Binge“, and claim that if offers you access to Netflix and some other smaller TV streaming services. They are treated as an option on mobile packages:

I’d rather they were just into the business of offering up data, not offering up *some* data. Also, this is starting to look like some of the mobile phone contracts offered up in countries where there are no net neutrality laws.

Facebook, twitter and whatsapp only unlimited in certain packages

Currently no one offers up unlimited data except for three – and that’ll cost you £33 a month.

To conclude

Data has gotten more expensive on mobiles. We’ve got more big companies offering broadband, but aren’t using their significant market presence in other areas to offer up better deals.


gitlab, Technical

Reducing the amount of memory used by gitlab

Gitlab is a fantastic tool. Rather than going with a saas solution for source control and for continuous integration, I’d thoroughly recommend hosting your own gitlab instance. Don’t be scared!

Anyway, I run my own gitlab instance on a box that only has 4 gigs of ram. Gitlab also has to share these limited resources with a few other webapps.

I noticed that gitlab was one of the biggest consumers of the ram on my box, and did some research into reducing it’s memory footprint.

Open the gitlab config file, which should be located at /etc/gitlab/gitlab.rb.

Reduce the postgres database cache

##! **recommend value is 1/4 of total RAM, up to 14GB.**
postgresql['shared_buffers'] = "256MB"

Reduce the concurrency level in sidekiq

I set this at 15 instead of 25 as I don’t have that many commits going on.

sidekiq['concurrency'] = 15 #25 is the default 

Disable prometheus monitoring

prometheus_monitoring['enable'] = false

Restart gitlab and test it out:


gitlab-ctl reconfigure

You should then run through a few commits and check gitlab is running smoothly.


Self hosted wordpress vs free wordpress

I’ve maintained this blog since 2008. Since 2008, it had been hosted on, and I was paying around £12 a year for the domain mapping. That allowed me to point my domain ( at my hosted site.

I was reasonably happy with the service I got.

  1. It was cheap
  2. I didn’t have to worry about hosting (backups, uptime)
  3. I was quick to get going

However, there are some downsides when you don’t host yourself:

No full administrative control over WordPress

One of the awesome things about WordPress is the amount of themes and plugins that are out there. When using the hosted platform at, you do not have full administrative control over wordpress, so you can’t just install some of the plugins as you wish. And those that use wordpress a lot, know that there are some essential plugins, like WP Smush.

Additional features that are free when you self host, cost money on

If you want to install a non standard theme on a hosted site, you can’t. You can however, pay for the option to install one of their premium themes. So you can’t really style your site in the way you want, without getting your wallet out.

Also – ads. hosted sites “occasionally” show ads to users. Here’s the thing – I really, really distrust ad networks. Aside to opening your site up to becoming a vector for Malvertising attacks and the creepy level of ubiquitous tracking,  I also really dislike just how invasive ads on the web have become. I understand the need to monetise content on the web, but there are better ways of doing it rather than just indiscriminately littering ads around content.

In fact, this site is itself monetised where appropriate. Some articles contain useful and relevant affiliate links – but this may actually have contravened’s terms and conditions. So I was also risking my site randomly getting yanked offline.

Performance on isn’t great

I’m a web developer. It’s what I do, day in, day out. I want everything that I do to follow web best practices – and a site hosted on will not. Opening up the developer tools network tab in Chrome, and hitting a hosted site, will reveal a few things. Aside from A LOT of requests for tracking assets, there are several requests for unminified javascript files. Like this.

The alternatives

Other hosts

There are a few of these about, but I’ve really gone off cloud based solutions and didn’t want to spend hours researching other providers.

Other blogging engines

I looked at a few, but saw that the migration path would be painful, especially if self hosted. isn’t self hosted. Ghost can be self hosted but isn’t anywhere as easy as self hosting wordpress. It’s also funny that the ghost vs wordpress page says “Ghost is simple!”, and the ghost vs medium page says “Ghost is powerful!”.

I do not trust a paid blog site to keep it’s pricing structure as is. I really don’t want to be in the position where I need to suddenly pay up more money to host or to frantically have to migrate because some company decided to change their pricing structure.

So here we are, still running on wordpress, but this time we’re self hosted. The migration was easy, and took me about 2 hours.

But wordpress isn’t secure!

I hear you, along with everyone else that has been sucked up by the technology hype lifecycle. WordPress does indeed get bashed a bit because there is an unfair perception of security problems around it.  There are some things you should be doing if you are running a wordpress site in production to make it more secure. I’ll address these things in a later blog post, but many of them will just be standard web security best practices.

Technical, Wordpress

Running WordPress behind a reverse SSL proxy

Newer versions of WordPress really don’t need much to get working behind an SSL proxy.

I currently have an NGINX webserver running infront of this blog. The job of NGINX here is to handle the SSL traffic, decrypt it, and forward it onto the docker container that runs this blog in plain old http.

If you’re going to do this, you need to make sure your NGINX config is setup to send the right headers through to wordpress, so that wordpress knows about the scheme the traffic came in on. So, in your NGINX config file, you’ll need the following:

 location / {
   proxy_http_version 1.1;
   proxy_set_header X-Forwarded-Host $host;
   proxy_set_header X-Forwarded-Server $host;
   proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
   proxy_set_header X-Forwarded-Proto $scheme;
   proxy_set_header X-Real-IP $remote_addr;
   proxy_set_header Host $host;

That should be all you need. WordPress has been around, and older blog posts seem to indicate that you may need some additional plugins. I didn’t find that this was the case. Hope this helps.

Surface Book

Adjusting screen brightness on the Surface Book from the Keyboard

I’ve purchased a Microsoft Surface Book to replace my Mac Book Pro. I didn’t get on very well with the Mac Book Pro for reasons that I will list out in a future blog post, but so far I am very happy with the Surface Book. It’s build quality feels fantastic and it is a lovely machine to use.

It did however take a me a while to workout how to adjust the screen brightness from the keyboard after noticing that none of the function keys double up as a screen brightness adjustment.

To make your screen brighter:

Fn + Del

To make your screen darker:

Fn + Backspace