Nodejitsu

Scaling npm, January 2014

About the author: nodejitsu (Location: Worldwide)

This is our second post in our commitment to bringing transparency to our operation of the public npm registry. For those of you who want to learn more: our first post on #scalenpm is here.

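Here’s a quick rundown of what we’ll be talking about today:

  • Just how does npm work anyway?
  • What’s new in January?
  • How is the #scalenpm fund being spent?
  • What’s next?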

Just how does npm work anyway?

Before we can really explain what we’re doing to improve the quality of service of the public npm registry, it’s important to understand all the moving parts that run it. This was touched upon in our original post-mortem on the Node.js blog. The service we call “npm” is really two components:

  • http://www.npmjs.org: The npmjs.org website that you interact with using a web browser. It is a Node.js program (GitHub: npm/npm-www) maintained and operated by (now) npm Inc. and running on a Joyent Public Cloud SmartMachine.
  • http://registry.npmjs.org: The main CouchApp (GitHub: isaacs/npmjs.org) that stores both package tarballs and metadata. It has been operated by Nodejitsu since we acquired IrisCouch in May. The primary system administrator is Jason Smith, the current CTO at Nodejitsu, co-founder of IrisCouch, and the System Administrator of registry.npmjs.org since late 2010. It’s a pretty vanilla CouchDB 1.5 setup with an Erlang reverse proxy in front of it that ensures high availability.

[Figure: High-level npm architecture. Red lines denote continuous replication.]

But it’s actually a little more complicated than that. Since late December, the public npm registry has been fronted by Fastly. There is a longer explanation of this on the npm Inc. blog, but here’s the rundown:

  • All HTTP requests go through Fastly.
  • All JSON documents are fetched from CouchDB and cached for 1 second.
  • All package tarballs are fetched from Joyent’s Manta service and cached for longer than that (exact duration unknown).
  • Package tarballs are populated into Manta using the mcouch module.
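The request flow above can be sketched as a small routing function. This is only an illustration of the split described in the list, not Fastly's actual configuration: the `Route` type, the `route_request` function, and the assumption that tarball requests can be recognized by a `.tgz` suffix are all ours.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    origin: str                 # which backend serves the request
    ttl_seconds: Optional[int]  # cache lifetime; None means "not stated in the post"


def route_request(path: str) -> Route:
    """Pick an origin and cache lifetime for a registry request path."""
    # Assumption: tarball requests are recognizable by a ".tgz" suffix.
    if path.endswith(".tgz"):
        # Tarballs come from Manta; the post only says "longer than 1 second".
        return Route(origin="manta", ttl_seconds=None)
    # Everything else is treated as a JSON document served by CouchDB.
    return Route(origin="couchdb", ttl_seconds=1)


if __name__ == "__main__":
    for p in ("/express", "/express/-/express-3.4.8.tgz"):
        print(p, route_request(p))
```

The tarball lifetime is left as `None` on purpose, since the exact duration is unknown.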

With all these moving parts it’s hard to keep it all straight! So here’s a diagram to help make sense of it all:

[Figure: Complete npm architecture. Red lines denote continuous replication.]

What’s new in January?

We had three big goals in January, and we were able to meet all three of them!

Get our bandwidth costs under control

In the period from October to November, the public npm registry used about 40TB of bandwidth. In the period from November to December, it used over 140TB. We dug into this by analyzing the npm registry masters, and the only explanation we have is additional, natural traffic to the registry. This led to a pretty sizable bandwidth overage (~$12k) on our SoftLayer bill for the public npm registry. We were able to get down to just $1,200 in January by pooling all the bandwidth allocated to the servers in a Virtual Rack and purchasing additional bandwidth before we went over.
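To put those numbers in proportion, here is a quick back-of-the-envelope calculation. It uses the rounded figures quoted above ("about 40TB", "over 140TB", "~$12k"), so the results are rough, and it assumes the two dollar amounts are directly comparable.

```python
# Figures quoted in the post; all of them are approximate.
oct_nov_tb = 40
nov_dec_tb = 140
earlier_cost_usd = 12_000   # the "~$12k" overage
january_cost_usd = 1_200

growth_factor = nov_dec_tb / oct_nov_tb
cost_reduction = 1 - january_cost_usd / earlier_cost_usd

print(f"traffic growth: {growth_factor:.1f}x")
print(f"cost reduction: {cost_reduction:.0%}")
```

In other words, traffic grew roughly three and a half times month over month, while the bandwidth charge fell by roughly nine tenths.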

More Open Source!

Our CouchDB replication monitor, overwatch, is now open source! If you’re looking for more details on how it works, check out the GitHub repository.

In addition to releasing our replication watcher, we also released smart-private-npm which is an easy way for you and your organization to start running your own private npm registry today!

If you don’t want to worry about the headache of running your own servers, you could always sign up for one of our new hosted private npm registries.

Scale-out our multi-master HA clusters

We spent much of our time in December transitioning to new, consistent SSD hardware and battle-hardening high availability via horizontally scaled multi-master clusters. For January we wanted to put this new architecture to the test by adding a third master to the mix and automating its supervision.

This unfortunately led to an unplanned outage that we covered in our post-mortem last week. Right now the registry is running with three masters (work1, work2, and work3) with the old supervision code. We will be rolling out the new supervision code on Thursday, January 30th, at 1am ET (UTC-5).
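For readers unfamiliar with multi-master CouchDB, here is a rough sketch of what continuous replication between three masters could look like. It only builds the JSON bodies that CouchDB's `_replicate` endpoint accepts (`source`, `target`, `continuous`). The full-mesh topology, the `example.com` hosts, the port, and the `registry` database name are assumptions for illustration, not a description of our production setup.

```python
import json
from itertools import permutations

# Master names come from the post; the host suffix, port, and database name
# are hypothetical placeholders.
MASTERS = ["work1", "work2", "work3"]
DB_URL = "http://{host}.example.com:5984/registry"


def replication_jobs(masters):
    """Return one continuous-replication request body per ordered pair of masters."""
    return [
        {
            "source": DB_URL.format(host=src),
            "target": DB_URL.format(host=dst),
            "continuous": True,
        }
        for src, dst in permutations(masters, 2)
    ]


if __name__ == "__main__":
    jobs = replication_jobs(MASTERS)
    print(len(jobs), "replication jobs")
    print(json.dumps(jobs[0], indent=2))
```

Under the full-mesh assumption, each added master adds a replication job in both directions with every existing master, which is part of why supervision is worth automating.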

How is the #scalenpm fund being spent?

We know that answering this question in a transparent way is important to everyone using the public npm registry. With that in mind, we’re going to share as much information here as possible. The funds raised from the Node.js community during the #scalenpm fundraiser are being spent in three key areas:

  • Servers
  • Systems Administrators
  • Sponsor Fulfillment

The total cost thus far is $179,049.00. Let’s examine each of these in detail:

Servers

Total Costs through January 2014: $36,084.00

  • November 2013: $5,649.88
  • December 2013: $18,821.50
  • January 2014: $11,612.62

Systems Administrators

Total Costs through January 2014: $78,780.00

A wise person once said:

“Happiness is lots of systems administrators”

According to the 2013 compensation data of 19 startups collected by FirstRound Capital, the average salary for a Senior Systems Administrator is $135,000.00 USD annually. Thus, budgeting for two systems administrators who are at the beck and call of the public npm registry 365x24x7, we arrive at a monthly cost of $26,260.00.
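The arithmetic behind these figures can be checked with a few lines. This sketch assumes the running total covers three months (November through January, matching the server costs). Note that salary alone for two administrators comes to less than the quoted monthly cost; the post does not break down what makes up the difference.

```python
from decimal import Decimal

average_salary = Decimal("135000.00")  # per administrator, per year (quoted)
admins = 2
monthly_budget = Decimal("26260.00")   # quoted monthly cost
months = 3                             # assumption: November through January

# Salary alone, spread over twelve months, for both administrators.
salary_only_monthly = average_salary * admins / 12
# Quoted monthly cost multiplied out over the assumed period.
total_through_january = monthly_budget * months
# Portion of the monthly cost not explained by the quoted salary figure.
unexplained_monthly = monthly_budget - salary_only_monthly

print(salary_only_monthly)
print(total_through_january)
print(unexplained_monthly)
```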

These admins give potentially every waking moment to the correct operation of the public npm registry, so we at Nodejitsu have no qualms about making sure they’re well compensated.

Sponsor Fulfillment

Total Estimated Costs: $64,185.00

Excited about getting your npm swag? So are we! Right now we’re waiting for our Bronze, Silver, Gold, and Adamantium sponsors to send in their additional swag to us to be included in the mailing. We should have all that gathered by early February, which is when we’ll start mailing out to all of you! In the meantime: keep your eyes peeled for your hosting coupon in your inbox this week.

Wondering how we arrived at this number? Well, the bulk of it came from the 868 hosting coupons and the 672 mailings and t-shirts. Not to mention the hoodies, stickers, and additional goodies everyone got!
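As a sanity check, the three spending areas can be added up from the figures quoted in this post. The variable names here are ours, and the Sponsor Fulfillment amount is an estimate, so the grand total is only as firm as that estimate.

```python
from decimal import Decimal

# Monthly server costs as listed in the Servers section.
server_costs = {
    "2013-11": Decimal("5649.88"),
    "2013-12": Decimal("18821.50"),
    "2014-01": Decimal("11612.62"),
}

areas = {
    "Servers": sum(server_costs.values(), Decimal("0")),
    "Systems Administrators": Decimal("78780.00"),
    "Sponsor Fulfillment": Decimal("64185.00"),  # estimated in the post
}

total = sum(areas.values(), Decimal("0"))

for name, cost in areas.items():
    print(f"{name}: ${cost:,.2f}")
print(f"Total: ${total:,.2f}")
```

`Decimal` is used instead of floats so that the cents add up exactly.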

What's next?

Our current traffic capacity estimates have us safely operating on three masters through the middle of March. That finally gives us some breathing room to decide on how to handle the next phase of exponential growth for the public npm registry.

Can’t get enough npm? There is an npm NodeUp show tentatively scheduled to record next Thursday featuring @isaacs, @seldo, @janl, @dshaw and myself. Hope you can tune in!