Hacking Browserify

You’ve probably heard of Browserify. It’s a nice npm module for bundling your JavaScript for client-side use. It gives you a module system similar to the one in Node.js, but for code that runs in the browser. I had a few issues testing modules in an app that uses Browserify, so I had to learn how it works and possibly hack it to solve my problem.

In this article we will dig into the output of Browserify and see how to stub (or mock) modules.

The example

Let’s create a simple app and use Browserify to create the bundle. Here are the files:

// data.js
module.exports = {
  firstName: 'Jon',
  familyName: 'Snow'
}


// user.js
var data = require('./data');
module.exports = {
  getName: function() {
    return data.firstName + ' ' + data.familyName;
  }
}


// app.js
var user = require('./user');
console.log(user.getName());

The entry point of our application is app.js. We fetch the user object and display the name. Internally user.js has one dependency: data.js. If we run the following commands in the terminal, we get Jon Snow as output:

> browserify app.js -o bundle.js
> node bundle.js
> Jon Snow

The problem

I need to stub the data.js module so that it returns a different object. That’s not possible by default.

How Browserify works

If we open bundle.js we’ll see the following code:

(function e(t,n,r){function s(o,u){if(!n[o]){if(!t[o]){var a=typeof require=="function"&&require;if(!u&&a)return a(o,!0);if(i)return i(o,!0);var f=new Error("Cannot find module '"+o+"'");throw f.code="MODULE_NOT_FOUND",f}var l=n[o]={exports:{}};t[o][0].call(l.exports,function(e){var n=t[o][1][e];return s(n?n:e)},l,l.exports,e,t,n,r)}return n[o].exports}var i=typeof require=="function"&&require;for(var o=0;o<r.length;o++)s(r[o]);return s})({1:[function(require,module,exports){
var user = require('./user');
console.log(user.getName());
},{"./user":3}],2:[function(require,module,exports){
module.exports = {
  firstName: 'Jon',
  familyName: 'Snow'
}
},{}],3:[function(require,module,exports){
var data = require('./data');
module.exports = {
  getName: function() {
    return data.firstName + ' ' + data.familyName;
  }
}
},{"./data":2}]},{},[1]);

We can clearly see our modules inside, along with some other code that looks obfuscated. It is actually just minified, so I passed the output through a prettifier (http://jsbeautifier.org/ does the job), which makes it readable. Now we have the following:

(function e(t, n, r) {
    function s(o, u) {
        if (!n[o]) {
            if (!t[o]) {
                var a = typeof require == "function" && require;
                if (!u && a) return a(o, !0);
                if (i) return i(o, !0);
                var f = new Error("Cannot find module '" + o + "'");
                throw f.code = "MODULE_NOT_FOUND", f
            }
            var l = n[o] = {
                exports: {}
            };
            t[o][0].call(l.exports, function(e) {
                var n = t[o][1][e];
                return s(n ? n : e)
            }, l, l.exports, e, t, n, r)
        }
        return n[o].exports
    }
    var i = typeof require == "function" && require;
    for (var o = 0; o < r.length; o++) s(r[o]);
    return s
})({
    1: [function(require, module, exports) {
        var user = require('./user');
        console.log(user.getName());
    }, {
        "./user": 3
    }],
    2: [function(require, module, exports) {
        module.exports = {
            firstName: 'Jon',
            familyName: 'Snow'
        }
    }, {}],
    3: [function(require, module, exports) {
        var data = require('./data');
        module.exports = {
            getName: function() {
                return data.firstName + ' ' + data.familyName;
            }
        }
    }, {
        "./data": 2
    }]
}, {}, [1]);

Approximately fifty lines of code. If we skip the details for now we’ll notice that there is actually one big function that is called with several parameters. Something like this:

(function() {
  ..... scary logic
})({
  ..... our modules here
}, {}, [1]);

Our modules are organized in the following way:

{
  1: [
    function(require, module, exports) {
      // our code
    }, 
    { "./user": 3 }
  ],
  2: [
    function(require, module, exports) {
      // our code
    }, 
    {}
  ],
  3: [
    function(require, module, exports) {
      // our code
    }, 
    { "./data": 2 }
  ]
}

Every module is attached to a key: 1, 2 or 3. Every module is an array of two items. The first one is a closure with our code placed inside. The second one is an object that maps each required path to the key of that dependency. So app.js, for example, has the key 1 and one dependency, user.js, which is attached to key 3.

1: [
  function(require, module, exports) {
    // our code
  }, 
  { "./user": 3 }
]
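
To make the shape of that map concrete, here is a small sketch that walks a hand-written copy of it and lists each module's dependencies. The describeModules helper is hypothetical and exists only for illustration; the module bodies are left empty because only the structure matters here.

```javascript
// A hand-written map in the same shape as the bundle:
// key -> [moduleFunction, { requiredPath: dependencyKey }]
const modules = {
  1: [function (require, module, exports) {}, { './user': 3 }],
  2: [function (require, module, exports) {}, {}],
  3: [function (require, module, exports) {}, { './data': 2 }]
};

// Hypothetical helper: describe the dependencies of every module in the map.
function describeModules(map) {
  return Object.keys(map).map(function (key) {
    const deps = map[key][1]; // second array item: the dependency lookup
    const names = Object.keys(deps).map(function (path) {
      return path + ' -> ' + deps[path];
    });
    return key + ': ' + (names.length ? names.join(', ') : '(no dependencies)');
  });
}

const description = describeModules(modules);
console.log(description.join('\n'));
```

Reading the second array item is all it takes to translate a path like ./user into the key of the module that should be loaded.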

The other important bit here is the arguments that land in the scope of our code: require, module and exports. That’s how we get these seemingly global goodies, and why our code feels like it runs in a Node.js environment.

We have a map containing our modules. Let’s see how the magic happens. The scary function from the snippet above is not that complex. If we skip the body of the s function we’ll get:

(function e(t, n, r) {
  function s(o, u) {
    .....
  }
  var i = typeof require == "function" && require;
  for (var o = 0; o < r.length; o++) s(r[o]);
  return s
})({
    // our modules
}, {}, [1]);

t is the object that we talked about above, the one that contains our modules. n is an empty object and r is equal to [1]. We first check whether there is a global require function and, if so, assign it to the variable i. If not, i is falsy. After that we loop through the passed r array. In our case it has only one item, which is 1. As you may guess, that’s the key of the module that the app starts from: the entry point of our script. 1 in our bundle means app.js.

Let’s explain the s function. In the very beginning it is called with only one argument, 1. I’ll replace o with 1 in the explanation so we can follow the logic easily.

function s(o, u) {
  if (!n[o]) {
    if (!t[o]) {
      var a = typeof require == "function" && require;
      if (!u && a) return a(o, !0);
      if (i) return i(o, !0);
      var f = new Error("Cannot find module '" + o + "'");
      throw f.code = "MODULE_NOT_FOUND", f
    }
    var l = n[o] = {
      exports: {}
    };
    t[o][0].call(l.exports, function(e) {
      var n = t[o][1][e];
      return s(n ? n : e)
    }, l, l.exports, e, t, n, r)
  }
  return n[o].exports
}

The function begins with a check, if (!n[o]) {, that we will talk about in a bit. Initially there is no n[1] so we move forward. t is the object that contains our modules, so if there is no t[1] then we have a missing module. In that case the code first tries to delegate to a require function defined outside the bundle, if there is one, and otherwise throws a MODULE_NOT_FOUND error. The next three lines define an object l and store the same object in n[1]. The next time someone requires the module with index 1, the check at the beginning is falsy and we go straight to return n[o].exports at the end of the function. This is a cache mechanism, so we don’t have to resolve the same module again and again.

The last part is the actual execution of our code:

t[o][0].call(l.exports, function(e) {
  var n = t[o][1][e];
  return s(n ? n : e)
}, l, l.exports, e, t, n, r)

l.exports is the context (the local this) inside our module. If we go back we will see that the signature of the closure containing our code is function(require, module, exports). Let’s dig a little bit here.

require is as follows:
```
function(e) {
  var n = t[o][1][e];
  return s(n ? n : e)
}
```
It fetches the index of the required module and calls the s function again. For example, if e is ./user and t[o][1] is { "./user": 3 }, we get n = 3.
module is l which is equal to { exports: {} }
exports is the same as l.exports
and we have a bunch of other arguments that are not listed in the signature. However, they are the key to our hack: they allow us to stub modules.
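
Here is a minimal sketch of that call in isolation. The moduleBody function and the placeholder strings 'e', 't', 'n' and 'r' are stand-ins invented for this example; the point is only that a function which names three parameters still receives everything it was called with.

```javascript
// A stand-in for a module closure: three named parameters, like in the bundle.
function moduleBody(require, module, exports) {
  return {
    named: moduleBody.length,                // parameters listed in the signature
    passed: arguments.length,                // values actually passed by the caller
    sameExports: exports === module.exports, // exports is just an alias
    thisIsExports: this === exports          // the context is l.exports
  };
}

const l = { exports: {} };
// Mirrors t[o][0].call(l.exports, requireFn, l, l.exports, e, t, n, r)
const info = moduleBody.call(l.exports, function () {}, l, l.exports, 'e', 't', 'n', 'r');
console.log(info);
```

Three parameters are named but seven values arrive, and the extra four are reachable through the arguments object.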

The Hacking

(If you are not familiar with stubbing check out this link)

So our code runs inside a closure that receives seven arguments, even though only three of them are named in the bundle. For example, app.js effectively looks like this:

function(require, module, exports, e, t, n, r) {
  var user = require('./user');
  console.log(user.getName());
}

We know what e, t, n and r are for. Especially interesting is t. That’s the object that contains all the modules in the bundle. So in theory we could write the following:

arguments[4][2][0] = function(require, module, exports) {
  module.exports = {
    firstName: 'Robb',
    familyName: 'Stark'
  }
};
var user = require('./user');
console.log(user.getName());

arguments[4] gives us the t object. We know from the bundle that 2 is the key of our data.js module and that its [0] element is the closure that wraps our code. We can easily replace it and return a different object.

Of course arguments[4][2][0] is not really nice. We can’t open the bundle and read the source every time. That’s why I wrapped that little hack in an npm module - hackerify. The above code may be replaced by:

var Hackerify = require('hackerify');
Hackerify(arguments, {
  './data': {
    firstName: 'Robb',
    familyName: 'Stark'
  }
});
var user = require('./user');
console.log(user.getName());

If we run the browserify command again we will get the following output:

> browserify app.js -o bundle.js
> node bundle.js
> Robb Stark

Drawback

Of course, like every hack, the solution is not ideal. There is a case where our tricky code does not work. It is possible to define global variables in Browserify. For example:

global.KingsLanding = function() {
  return 'The capital of the Seven Kingdoms';
}
var user = require('./user');
console.log(user.getName());

Unfortunately that results in:

1: [
  function(require, module, exports) {
    (function(global) {
      global.KingsLanding = function() {
        return 'The capital of the Seven Kingdoms';
      }

      var user = require('./user');
      console.log(user.getName());
    }).call(this, typeof global !== "undefined" ? global : typeof self !== "undefined" ? self : typeof window !== "undefined" ? window : {})
  }, 
  { "./user": 3 }
]

Our code is placed in not one but two closures because of the global object. In this case the arguments are not what we expect: inside the inner closure, arguments refers to that closure’s own arguments, which is just global. There is no arguments[4] and our hack fails.
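
The effect is easy to reproduce in isolation. In this sketch wrappedBody is a made-up stand-in for the module closure and the strings 'e', 't', 'n' and 'r' are placeholders; the inner function mimics the wrapper that Browserify adds around code using global.

```javascript
// Stand-in for a module closure whose code is wrapped a second time.
function wrappedBody(require, module, exports) {
  const outerCount = arguments.length; // arguments of the module closure
  const inner = (function (global) {
    // `arguments` here belongs to this inner function, not to wrappedBody.
    return { count: arguments.length, fifth: arguments[4] };
  }).call(this, {});
  return { outerCount: outerCount, innerCount: inner.count, fifth: inner.fifth };
}

const counts = wrappedBody(function () {}, { exports: {} }, {}, 'e', 't', 'n', 'r');
console.log(counts.outerCount, counts.innerCount, counts.fifth);
```

The outer closure still receives all seven values, but code running in the inner one sees a single argument, so arguments[4] is undefined and indexing into it throws.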

Conclusion

In the end I decided NOT to use that hack in my codebase. First, because it doesn’t work if we have globally defined variables. Second, because it’s tied to the current version of Browserify. What if a new version drops the additional parameters and only sends require, module and exports to my module?

The lesson for me here is “if you need to stub a Browserify module then your design is broken”. The application can be designed so that we avoid such cases. Placing a bunch of requires at the top of your file leads to poor testability. Future me, avoid that please.
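
One possible shape of such a design, as a sketch only: instead of requiring data.js internally, user.js could export a factory that receives its data. createUser is a hypothetical name and this is not how the example app above is written.

```javascript
// Hypothetical rewrite of user.js as a factory: the dependency is passed in
// instead of being required at the top of the file.
function createUser(data) {
  return {
    getName: function () {
      return data.firstName + ' ' + data.familyName;
    }
  };
}

// In the app we pass the real data...
const realUser = createUser({ firstName: 'Jon', familyName: 'Snow' });
// ...and in a test we pass whatever we want, with no bundle hacking.
const stubbedUser = createUser({ firstName: 'Robb', familyName: 'Stark' });

console.log(realUser.getName());    // Jon Snow
console.log(stubbedUser.getName()); // Robb Stark
```

With this approach the stub is just a function argument, so nothing depends on the internals of the bundle.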

And of course digging into Browserify’s bundle was fun.