Subclassing builtins in ECMAScript 5

[2011-12-26] esnext, dev, javascript, jslang
JavaScript’s built-in constructors are difficult to subclass. This post explains why and presents solutions.

Terminology

We use the phrase subclass a builtin and avoid the term extend, because it is taken in JavaScript:
  • Subclassing a builtin A: Create a sub-constructor B of a given built-in constructor A. B’s instances are also instances of A.
  • Extending a builtin A: Adding new methods to A.prototype.
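To make the two terms concrete, here is a minimal sketch. The method name last and the constructor MyError are hypothetical, chosen only for illustration.

```javascript
// Extending the builtin Array: add a method to Array.prototype.
// `last` is a hypothetical method name. Object.defineProperty makes it
// non-enumerable, so for-in loops over arrays are not affected.
Object.defineProperty(Array.prototype, "last", {
    value: function () {
        return this[this.length - 1];
    },
    writable: true,
    configurable: true
});
console.log(["a", "b", "c"].last()); // c

// Subclassing the builtin Error: MyError is a sub-constructor of Error,
// so its instances are also instances of Error.
function MyError() {}
MyError.prototype = Object.create(Error.prototype);
MyError.prototype.constructor = MyError;
console.log(new MyError() instanceof Error); // true
```

The second half only sets up the prototype chain; it does not yet initialize the instance via the super-constructor.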
There are two obstacles to subclassing a builtin: first, its instances may have internal properties; second, its constructor can’t be called as a function to set up an existing instance.

Obstacle 1: instances with internal properties

Most built-in constructors have instances with so-called internal properties (whose names are written in double square brackets, for example: [[PrimitiveValue]]). Internal properties are managed by the JavaScript engine and usually not directly accessible in JavaScript. The normal subclassing technique in JavaScript is to call a super-constructor as a function with the this of the sub-constructor [2].
    function Super(x, y) {
        this.x = x;
        this.y = y;
    }
    function Sub(x, y, z) {
        // Add super-properties to sub-instance
        Super.call(this, x, y); // (*)
        // Add sub-property
        this.z = z;
    }
Most builtins ignore the sub-instance passed in as this at (*), a problem described below as “obstacle 2”. Additionally, adding internal properties to an existing instance is in general impossible, because they tend to fundamentally change the instance’s nature. Hence, the call at (*) can’t be used to add internal properties. The following constructors have instances with internal properties:
  • Wrapper constructors: Instances of Boolean, Number and String wrap primitives. They all have the internal property [[PrimitiveValue]] whose value is returned by valueOf(). Additionally, String instances support indexed access of characters.
    • Boolean: internal property [[PrimitiveValue]]
    • Number: internal property [[PrimitiveValue]]
    • String: internal property [[PrimitiveValue]], custom method [[GetOwnProperty]], normal property length. [[GetOwnProperty]] accesses the wrapped primitive string when an array index is used.
  • Array: The custom internal method [[DefineOwnProperty]] intercepts properties being set. It ensures that the length property works correctly, by keeping length up to date when array elements are added or removed and by removing excess elements when length is made smaller.
  • Date: internal property [[PrimitiveValue]] stores the time represented by a date instance.
  • Function: internal property [[Call]] (the code to execute when the instance is called) and possibly others.
  • RegExp: internal property [[Match]] in addition to non-internal properties. Quoting the ECMAScript specification:
    The value of the [[Match]] internal property is an implementation dependent representation of the Pattern of the RegExp object.
The only built-in constructors whose instances don’t have special internal properties (beyond those that every object has) are Error and Object.
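What goes wrong can be observed by applying the normal subclassing pattern to Array and Number anyway. The following is a sketch; NaiveArray and NaiveNumber are hypothetical names introduced here for illustration.

```javascript
// Hypothetical sub-constructor that uses the normal pattern with Array
function NaiveArray() {
    // Array creates and returns a new array; `this` stays an ordinary object
    Array.apply(this, arguments);
}
NaiveArray.prototype = Object.create(Array.prototype);
NaiveArray.prototype.constructor = NaiveArray;

var n = new NaiveArray();
n[0] = "a";
// Without the custom [[DefineOwnProperty]], length is not kept up to date
console.log(n.length);         // 0
console.log(Array.isArray(n)); // false

// Hypothetical sub-constructor that uses the normal pattern with Number
function NaiveNumber(value) {
    // Number returns a primitive; `this` does not get [[PrimitiveValue]]
    Number.call(this, value);
}
NaiveNumber.prototype = Object.create(Number.prototype);
NaiveNumber.prototype.constructor = NaiveNumber;

var threw = false;
try {
    // valueOf() needs [[PrimitiveValue]], which the sub-instance lacks
    new NaiveNumber(5).valueOf();
} catch (e) {
    threw = e instanceof TypeError;
}
console.log(threw); // true
```

In both cases the prototype chain is correct, but the instances lack the internal machinery that makes an array an array or a wrapper a wrapper.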

Work-around. MyArray is a sub-constructor of Array. It has a getter size that returns the number of actual elements in an array, ignoring holes (whereas length counts holes). The trick used to implement MyArray is that it creates an array instance and copies the methods of MyArray to it. Credit: inspired by a blog post by Ben Nadel [3].

    function MyArray(/*arguments*/) {
        var arr = [];
        // Don’t use the Array constructor: given a single number, e.g. 5,
        // new Array(5) creates an array of length 5 with no elements in it
        [].push.apply(arr, arguments);
        return copyOwnFrom(arr, MyArray.methods);
    }
    MyArray.methods = {
        get size() {
            var size = 0;
            for(var i=0; i < this.length; i++) {
                if (i in this) size++;
            }
            return size;
        }
    };
The above code uses the following helper function:
    function copyOwnFrom(target, source) {
        Object.getOwnPropertyNames(source).forEach(function(propName) {
            Object.defineProperty(target, propName,
                Object.getOwnPropertyDescriptor(source, propName));
        });
        return target;
    }
Interaction:
    > var a = new MyArray("a", "b")
    > a.length = 4;
    > a.length
    4
    > a.size
    2
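The helper copies property descriptors rather than property values, which is why size remains a getter on the array. The following sketch contrasts that with plain assignment; the objects source, byAssignment and byDescriptor and the counter calls are made up for this illustration.

```javascript
function copyOwnFrom(target, source) {
    Object.getOwnPropertyNames(source).forEach(function (propName) {
        Object.defineProperty(target, propName,
            Object.getOwnPropertyDescriptor(source, propName));
    });
    return target;
}

// Illustrative source object: the getter counts how often it is invoked
var calls = 0;
var source = {
    get count() {
        calls++;
        return calls;
    }
};

// Plain assignment invokes the getter once and stores the result as a value
var byAssignment = {};
Object.keys(source).forEach(function (key) {
    byAssignment[key] = source[key];
});

// Copying the descriptor transfers the getter itself, without invoking it
var byDescriptor = copyOwnFrom({}, source);

console.log(byAssignment.count, byAssignment.count); // 1 1
console.log(byDescriptor.count, byDescriptor.count); // 2 3
```

With plain assignment, size would be computed once at construction time and would not reflect later changes to the array.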
Caveats. Copying methods to an instance leads to redundancies that could be avoided with a prototype (if we had the option to use one). Additionally, MyArray creates objects that are not its instances:
    > a instanceof MyArray
    false
    > a instanceof Array
    true

Obstacle 2: a constructor that can’t be called as a function

Even though Error and its sub-constructors don’t have instances with internal properties, one still can’t subclass them easily, because the standard pattern for subclassing won’t work (repeated from above):
    function Super(x, y) {
        this.x = x;
        this.y = y;
    }
    function Sub(x, y, z) {
        // Add super-properties to sub-instance
        Super.call(this, x, y);  // (*)
        // Add sub-property
        this.z = z;
    }
The problem is that Error always produces a new instance, even if called as a function (*). That is, it ignores the parameter this handed to it via call():
    > var e = {}
    > Object.getOwnPropertyNames(Error.call(e))
    [ 'stack', 'arguments', 'type' ]
    > Object.getOwnPropertyNames(e)
    []
Error does return an instance with own properties, but it’s a new instance, not e. The subclassing pattern would only work if Error added the own properties to this (e, in the above case).

Work-around. Inside the sub-constructor, create a new super-instance and copy its properties to the sub-instance.

    function MyError() {
        // Use Error as a function
        var superInstance = Error.apply(null, arguments);
        copyOwnFrom(this, superInstance);
    }
    MyError.prototype = Object.create(Error.prototype);
    MyError.prototype.constructor = MyError;
Trying out the new error constructor:
    try {
        throw new MyError("Something happened");
    } catch (e) {
        console.log("Properties: "+Object.getOwnPropertyNames(e));
    }
Output on Node.js:
    Properties: stack,arguments,message,type
The instanceof relationship is as it should be:
    > new MyError() instanceof Error
    true
    > new MyError() instanceof MyError
    true
Caveat. The main reason for subclassing Error is to have the stack property in sub-instances. Alas, Firefox seems to store that value in an internal property, which is why the above approach does not work there (Firefox 8).
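Because the outcome depends on the engine, it can help to check at runtime whether the work-around actually produced an own stack property. The following is a sketch, not part of the original solution; setting MyError.prototype.name is an additional assumption that makes toString() report the sub-constructor’s name.

```javascript
function copyOwnFrom(target, source) {
    Object.getOwnPropertyNames(source).forEach(function (propName) {
        Object.defineProperty(target, propName,
            Object.getOwnPropertyDescriptor(source, propName));
    });
    return target;
}

function MyError() {
    // Use Error as a function, then copy the own properties it produced
    var superInstance = Error.apply(null, arguments);
    copyOwnFrom(this, superInstance);
}
MyError.prototype = Object.create(Error.prototype);
MyError.prototype.constructor = MyError;
// Assumption (not in the code above): give sub-instances their own name
MyError.prototype.name = "MyError";

var e = new MyError("Something happened");
console.log(e.toString()); // MyError: Something happened

// Engine-dependent: was an own property `stack` copied to the sub-instance?
var hasOwnStack = Object.prototype.hasOwnProperty.call(e, "stack");
console.log(hasOwnStack ? "own stack property copied" : "no own stack property");
```

The check only reports whether an own property named stack exists; how useful its value is still depends on the engine.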

Another solution: delegation

Delegation is a very clean alternative to subclassing. For example, to create your own array constructor, you keep an array in a property:
    function MyArray(/*arguments*/) {
        this.array = [];
        // Don’t use the Array constructor: given a single number, e.g. 5,
        // new Array(5) creates an array of length 5 with no elements in it
        [].push.apply(this.array, arguments);
    }
    Object.defineProperties(MyArray.prototype, {
        size: {
            get: function () {
                var size = 0;
                for(var i=0; i < this.array.length; i++) {
                    if (i in this.array) size++;
                }
                return size;
            }
        },
        length: {
            get: function () {
                return this.array.length;
            },
            set: function (value) {
                this.array.length = value;
            }
        }
    });
The obvious limitation is that you can’t access elements of MyArray via square brackets; you must use methods to do so:
    MyArray.prototype.get = function (index) {
        return this.array[index];
    };
    MyArray.prototype.set = function (index, value) {
        return this.array[index] = value;
    };
Normal methods of Array.prototype can be transferred via the following bit of meta-programming.
    [ "toString", "push", "pop" ].forEach(function (name) {
        MyArray.prototype[name] = function () {
            return Array.prototype[name].apply(this.array, arguments);
        };
    });
Using MyArray:
    > var a = new MyArray("a", "b");
    > a.length = 4;
    > a.push("c")
    5
    > a.length
    5
    > a.size
    3
    > a.set(0, "x");
    > a.toString()
    'x,b,,,c'
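One consequence of forwarding is worth keeping in mind: a forwarded method that returns an array (such as map) returns a plain array, not a MyArray. The sketch below uses a trimmed-down MyArray to show this; the method mapWrapped is a hypothetical addition that wraps the result again.

```javascript
// Trimmed-down version of the delegating MyArray
function MyArray(/*arguments*/) {
    this.array = [];
    [].push.apply(this.array, arguments);
}
[ "push", "map" ].forEach(function (name) {
    MyArray.prototype[name] = function () {
        return Array.prototype[name].apply(this.array, arguments);
    };
});

var a = new MyArray(1, 2, 3);
var doubled = a.map(function (x) { return x * 2; });
console.log(Array.isArray(doubled));     // true
console.log(doubled instanceof MyArray); // false

// Hypothetical variant: wrap the result so that it is a MyArray again
MyArray.prototype.mapWrapped = function (callback, thisArg) {
    var result = new MyArray();
    result.array = this.array.map(callback, thisArg);
    return result;
};
var wrapped = a.mapWrapped(function (x) { return x * 2; });
console.log(wrapped instanceof MyArray); // true
console.log(wrapped.array.join(","));    // 2,4,6
```

Whether results should be wrapped depends on how the constructor is used; forwarding unchanged is simpler, wrapping keeps the abstraction closed.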

Subclassing builtins via __proto__ and in ECMAScript 6

A follow-up blog post describes three more solutions for subclassing builtins:
  • Via the special property __proto__ that is supported by more and more browsers and will most probably be part of ECMAScript 6.
  • Via canonical subclassing patterns, which work for builtins in ECMAScript 6 thanks to changes that have been made to the language.
  • Via class definitions, where you can extend any builtin.
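As a rough preview of the first and the third solution, here is a sketch that is not taken from the follow-up post. The names ProtoArray, ClassArray and countElements are hypothetical, and the code needs an engine that supports ECMAScript 6.

```javascript
// Shared helper (hypothetical): count the elements of an array, ignoring holes
function countElements(arr) {
    var size = 0;
    for (var i = 0; i < arr.length; i++) {
        if (i in arr) size++;
    }
    return size;
}

// Solution 1: create a real array, then change its prototype.
// Object.setPrototypeOf(arr, proto) is the standardized function that has
// the same effect as assigning to arr.__proto__.
function ProtoArray(/*arguments*/) {
    var arr = [];
    [].push.apply(arr, arguments);
    Object.setPrototypeOf(arr, ProtoArray.prototype);
    return arr;
}
ProtoArray.prototype = Object.create(Array.prototype, {
    constructor: { value: ProtoArray, writable: true, configurable: true },
    size: {
        get: function () { return countElements(this); },
        configurable: true
    }
});

// Solution 3: a class definition that extends Array.
// Note: like Array, new ClassArray(5) creates 5 holes, not the element 5.
class ClassArray extends Array {
    get size() {
        return countElements(this);
    }
}

var p = new ProtoArray("a", "b");
p.length = 4;
console.log(p.size, p instanceof ProtoArray, Array.isArray(p)); // 2 true true

var c = new ClassArray("a", "b");
c.length = 4;
console.log(c.size, c instanceof ClassArray, Array.isArray(c)); // 2 true true
```

In both variants the instances are real arrays, so bracket access and length work as usual, and instanceof reports the sub-constructor.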

References

  1. ECMAScript 5.1 specification, Chap. 15 [details on builtins: instance properties etc.]
  2. Prototypes as classes – an introduction to JavaScript inheritance
  3. “Extending JavaScript Arrays While Keeping Native Bracket-Notation Functionality” by Ben Nadel