Nodejitsu

Versions: The Node.js Content Delivery Network

At Nodejitsu we focus on providing our users with the best developer
experience possible -- something we can't achieve if our sites take too long to load.
This is why we focused on front-end performance when we rebuilt our
front page. The best front-end performance is achieved by loading a minimal quantity of resources as fast as possible, so having a good static server
is vital! Unfortunately, serving static files isn't something Node.js
is designed for. For example, it's missing essential sendfile
bindings, which allow you to transfer files over sockets with minimal context switching. The only way to create a high-performance file server in Node is to cache aggressively.

Today we're pleased to announce the release of my latest open source project, versions.

Versions allows you to create a small scale content delivery network using Node.js. It
gets its performance by caching file lookups in memory and setting
aggressive caching headers. The fastest request is one that you don't have to
make! The server comes with a wide range of features that most
public paid CDNs also support:

  • Origin pull, the ability to fetch resources from remote servers and cache them on the server.
  • Cache Headers, versions will automatically configure cache headers for you.
  • Advanced & automatic GZIP, it gzips your resources automatically for browsers that support it (except IE5/6) and detects obfuscated gzip headers as researched by the Yahoo Performance team.
  • REST API, providing a REST API to manage the server.
  • Clusters, when run in a cluster it can synchronize configuration and other details between servers using Redis.
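The two ideas behind the performance claims, caching file lookups in memory and sending aggressive cache headers, can be sketched roughly as follows. This is a minimal illustration and not the versions source code: the names cacheHeaders and createFileCache are hypothetical, and the injectable readFile argument exists only to make the sketch easy to try out.

```javascript
var fs = require('fs');

// Build far-future caching headers so browsers and proxies keep the asset.
// `now` is an optional timestamp in milliseconds (hypothetical parameter).
function cacheHeaders(maxAgeSeconds, now) {
  if (now === undefined) now = Date.now();

  return {
    'Cache-Control': 'public, max-age=' + maxAgeSeconds,
    'Expires': new Date(now + maxAgeSeconds * 1000).toUTCString()
  };
}

// Memoize file lookups so every file is read from disk only once.
// `readFile` defaults to fs.readFileSync and can be swapped for testing.
function createFileCache(readFile) {
  readFile = readFile || fs.readFileSync;
  var cache = Object.create(null);

  return function lookup(path) {
    if (!(path in cache)) cache[path] = readFile(path);
    return cache[path];
  };
}
```

A real server would also need to evict or refresh entries; this sketch keeps everything in memory forever, which is only reasonable for a small set of assets.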

Creating the server is really simple:

// Require the versions module
var versions = require('versions');

// Setup some origin servers where we want to pull the content from and tag them
// with an id, optionally setup some syncing to be ready to roll.
versions.set('origin servers', [  
    { url: 'https://www.nodejitsu.com', id: 'home' },
    { url: 'https://webops.nodejitsu.com', id: 'webops' },
    { url: 'https://gravatar.com/userimage/', id: 'gravatar' }
  ])
  .set('sync', true)
  .set('redis', { host: 'localhost', port: '6379', auth: 'pass' })
  .listen(8080);

As you can see in the snippet above, we are setting three origin servers where we
will pull our assets from. This allows us, for example, to fetch our base CSS from
https://www.nodejitsu.com and another CSS file from https://webops.nodejitsu.com
and cache our Gravatar lookups:

  • http://example.com/css/base.css would load base.css from our home origin, as it's the only server that contains this file.
  • http://example.com/css/foo.css would be loaded from the WebOps origin, as it doesn't exist on the home server.
  • http://example.com/15003245/847041ab43d45ec001bd9ef611f2184c.jpg?size=40 would load an avatar from Gravatar, as neither WebOps nor home has this file.
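The fallback order described in these examples can be sketched as a simple loop over the configured origins. This is an assumption about the behavior, not the actual versions implementation: the pickOrigin function and the files lists are hypothetical, and a real server would issue HTTP requests to each origin instead of consulting an in-memory list.

```javascript
// Simulated origins: each one lists the paths it is able to serve.
var origins = [
  { id: 'home', url: 'https://www.nodejitsu.com', files: ['/css/base.css'] },
  { id: 'webops', url: 'https://webops.nodejitsu.com', files: ['/css/foo.css'] },
  { id: 'gravatar', url: 'https://gravatar.com/userimage/',
    files: ['/15003245/847041ab43d45ec001bd9ef611f2184c.jpg'] }
];

// Return the first origin, in configuration order, that has the file.
function pickOrigin(path, origins) {
  for (var i = 0; i < origins.length; i++) {
    if (origins[i].files.indexOf(path) !== -1) return origins[i];
  }

  return null; // No origin has the file; a server would answer with a 404.
}
```

Once an origin has answered, the response would be stored in the server's cache so later requests never reach the origin again.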

Cache busting

The problem with aggressive caching is that it's harder to serve your users
fresh content when you make changes to your assets. If left unattended, users
will keep seeing the version cached by their browser. The solution? Implement a cache
busting scheme. As query strings are known to cause cache issues with reverse
proxies, we decided to prefix our files with a custom path instead. This
allows us to "bust" the browser cache by referencing the file from a
different path. In versions you can prefix your assets with a special
/versions:<cache bust string> string. For example, http://example.com/versions:daf3daf87Adf/base.css would be internally rewritten to http://example.com/base.css.
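The internal rewrite can be pictured as stripping the prefix from the request path before the file lookup happens. The sketch below is illustrative only: stripVersion is a hypothetical name, and it assumes the prefix always appears as the first path segment, as in the example URL.

```javascript
// Strip a leading "/versions:<cache bust string>" segment from a request path.
// Paths without the prefix are returned unchanged.
function stripVersion(pathname) {
  return pathname.replace(/^\/versions:[^\/]+(?=\/)/, '');
}
```

Because the cache buster is part of the path and not the query string, every new version string gives the browser and any proxy in between a URL they have never seen before.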

This introduces consistency problems since you might have different applications
that use the same CDN and assets. To ensure all your applications are
referencing files with identical cache busters we have created an API client.
The API can sync version strings between different connected clients.

// Require the versions module.
var versions = require('versions');

// Enable syncing using the .set(config, value) API and connect to your versions
// instance using the .connect method.
versions.set('sync', true)  
  .set('redis', { host: 'localhost', port: '6379', auth: 'pass' })
  .connect('http://example.com');

The client will sync its configuration with the server using the provided Redis
configuration and set up a pub/sub connection to receive version changes. In
order to "tag" your assets with the synced cache buster you can use the
versions.tag('path'); method. If you supply it with the path /foo/bar.js it will return http://example.com/versions:0.0.0/foo/bar.js.
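Conceptually, tagging is string concatenation of the server address, the current version string and the asset path. The sketch below mimics that behavior without Redis; createTagger and its update method are hypothetical names used for illustration, with update standing in for a version change arriving over pub/sub.

```javascript
// Create a tagger bound to one server and the current version string.
function createTagger(server, version) {
  return {
    // Called when a new version string arrives, for example over pub/sub.
    update: function update(next) {
      version = next;
    },

    // Prefix the asset path with the server and the cache buster.
    tag: function tag(path) {
      return server + '/versions:' + version + path;
    }
  };
}

var tagger = createTagger('http://example.com', '0.0.0');
tagger.tag('/foo/bar.js'); // http://example.com/versions:0.0.0/foo/bar.js
```

After a version change every subsequent call to tag returns a new URL, so all connected applications start referencing the fresh assets at the same moment.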

Clusters & multiple origins

Versions has support for clusters out of the box -- at Nodejitsu we run it on
multiple drones. You only need to enable the sync configuration
and supply it with your Redis details and it's ready to go.

var versions = require('versions');

// Setup syncing like before
versions.set('sync', true)  
  .set('redis', { host: 'localhost', port: '6379', auth: 'pass' })
  .connect('http://example.com');

// Setup more alias servers so you can spread resources across different servers
// or to increase parallel download of resources
versions.alias('http://example.org')  
        .alias('http://example.net');

// Tag your assets on the fly using:
versions.tag('/filepath.js'); // http://example.com/versions:0.0.0/filepath.js  
versions.tag('/bar.css'); // http://example.net/versions:0.0.0/bar.css  

In addition to running clustered, the API client also accepts multiple origin
servers. It will balance your assets between the supplied servers using the
node-hashring module, ensuring load and cache size are distributed equally. This comes with the benefit of parallel downloading, as browsers
only download 4-8 assets from a single host at the same time.
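To give an idea of how a hash ring spreads assets over servers, here is a small self-contained sketch of consistent hashing. It does not use node-hashring and is not how versions is implemented; createRing, the replica count of 64 and the MD5-based hash are all assumptions made for illustration.

```javascript
var crypto = require('crypto');

// Map a string to an unsigned 32-bit integer using the first 8 hex
// characters of its MD5 digest.
function hash(key) {
  var hex = crypto.createHash('md5').update(key).digest('hex');
  return parseInt(hex.slice(0, 8), 16);
}

// Place every server on the ring several times (virtual nodes) and return
// a function that picks the server responsible for a given asset path.
function createRing(servers, replicas) {
  replicas = replicas || 64;
  var points = [];

  servers.forEach(function (server) {
    for (var i = 0; i < replicas; i++) {
      points.push({ position: hash(server + '#' + i), server: server });
    }
  });

  points.sort(function (a, b) { return a.position - b.position; });

  return function pick(path) {
    var position = hash(path);

    // The first point at or after the hash of the path owns the asset.
    for (var i = 0; i < points.length; i++) {
      if (points[i].position >= position) return points[i].server;
    }

    return points[0].server; // Wrap around to the start of the ring.
  };
}
```

The same path always maps to the same server, so each server only has to cache its own share of the assets, and removing one server leaves the assignments of the other servers untouched.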

So if you ever need a high performance static server for your Node.js
application feel free to take http://github.com/3rd-Eden/versions for a spin.

The CDN in Action

Both Nodejitsu's website and handbook are powered by our CDN. Over the last
weeks we integrated our documentation with the website. Formerly, our
documentation was accessible on a separate domain outside the website. We changed
our handbook module to a dependency of the website, to provide a more consistent
look and feel. In addition, we changed handbook from a blacksmith powered
static file server to a Markdown document collection.

Besides the fact that we think it looks better, that's nothing special, right? But how
are we going to serve images in the Markdown content without storing them remotely
or relying on some third-party storage? This is where versions comes into play.
We can simply request https://versions.nodejitsu.com/resources/img.png to acquire
the image. If it is not yet available on our CDN it will be fetched from the
configured external resources. This allows us to keep content centralized and lets
handbook stay focused on what it is explicitly about: serving Markdown content. Versions will keep images in
cache and ensure our website serves documents blazing fast.

The old handbook site loaded our WebOps documentation in an uncached
state in 5.1 seconds. The new documentation does this in 3.67 seconds. With a
primed cache you were still looking at 3.7 seconds for the old site and only
1.7 seconds for the new version. We think this isn't bad for a page full of big images.
So we saved about 1.4 seconds on an uncached load and about 2 seconds on a cached one by using versions.

Try it out for yourself! Want to see our CDN in action? Visit
nodejitsu.com or nodejitsu.com/documentation.