Subclassing builtins in ECMAScript 6

[2013-03-18] esnext, dev, javascript
[Update 2015-02-15] Newer version of this blog post: “Classes in ECMAScript 6 (final semantics)”

In JavaScript, it is difficult to create sub-constructors of built-in constructors such as Array. This blog post explains the problem and possible solutions – including one that will probably be chosen by ECMAScript 6. The post is based on Allen Wirfs-Brock’s slides from a presentation he held on January 29, during a TC39 meeting.

The problem

Creating sub-constructors of built-in constructors is difficult to impossible. The normal pattern for subclassing in JavaScript is (ignoring the property constructor [1]):
    function SuperConstr(arg1) {
        ...
    }
    function SubConstr(arg1, arg2) {
        SuperConstr.call(this, arg1);
    }
    SubConstr.prototype = Object.create(SuperConstr.prototype);    
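As a concrete illustration of this pattern, here is a minimal sketch with ordinary (non-builtin) constructors. The names Person and Employee are made up for this example and are not part of the original discussion.

```javascript
// Super-constructor: initializes the instance it receives via `this`.
function Person(name) {
    this.name = name;
}
Person.prototype.describe = function () {
    return 'Person called ' + this.name;
};

// Sub-constructor: delegates part of the initialization to Person.
function Employee(name, title) {
    Person.call(this, name);
    this.title = title;
}
// Instances of Employee inherit from Person.prototype.
Employee.prototype = Object.create(Person.prototype);
Employee.prototype.describe = function () {
    return Person.prototype.describe.call(this) + ' (' + this.title + ')';
};

var jane = new Employee('Jane', 'CTO');
console.log(jane.describe());          // Person called Jane (CTO)
console.log(jane instanceof Person);   // true
```

The pattern works here because Person initializes whatever object it is given and does not need a special kind of instance.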
If you try to use this pattern with built-in constructors, you run into problems. The following example demonstrates those problems for Array; they are similar for other builtins.

Trying to subclass Array

Take, for example, the following code, where we try to subclass Array.
    function MyArray(len) {
        Array.call(this, len);
    }
    MyArray.prototype = Object.create(Array.prototype);    
This doesn’t work:
    > var a = new MyArray();
    > a[2] = 'abc';
    > a.length
    0
Compare: a normal array.
    > var b = [];
    > b[2] = 'abc';
    > b.length
    3
If you invoke a constructor C via new C(arg1, arg2, ...), two steps happen (in the internal [[Construct]] method that every function has):
  1. Allocation: create an instance inst, an object whose prototype is C.prototype (if that value is not an object, use Object.prototype).
  2. Initialization: Initialize inst via C.call(inst, arg1, arg2, ...). If the result of that call is an object, return it. Otherwise, return inst.
There are obstacles to both steps when you try to subclass Array.
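The two steps can be mimicked in plain JavaScript. The following is only a rough sketch: simulateNew, Point and Replacer are hypothetical names, and the real [[Construct]] is an internal operation that user code cannot observe directly.

```javascript
function isObject(value) {
    return value !== null &&
        (typeof value === 'object' || typeof value === 'function');
}

// Rough emulation of `new C(...args)` for ordinary function constructors.
function simulateNew(C, ...args) {
    // Step 1, allocation: an ordinary object whose prototype is C.prototype
    const proto = isObject(C.prototype) ? C.prototype : Object.prototype;
    const inst = Object.create(proto);

    // Step 2, initialization: an object result replaces inst
    const result = C.apply(inst, args);
    return isObject(result) ? result : inst;
}

function Point(x, y) {
    this.x = x;
    this.y = y;
}
// A constructor that returns an object; the allocated instance is discarded.
function Replacer() {
    this.ignored = true;
    return { replaced: true };
}

const p = simulateNew(Point, 1, 2);
console.log(p instanceof Point, p.x, p.y);     // true 1 2

const r = simulateNew(Replacer);
console.log(r.replaced, r instanceof Replacer); // true false
```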

Allocation obstacle: MyArray allocates the wrong kind of object

Array instances are special – the ECMAScript 6 specification calls them exotic. Their handling of the property length can’t be replicated via normal JavaScript. If you invoke the constructor MyArray via new, an ordinary object whose prototype is MyArray.prototype is created, not an exotic array object.
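One way to observe the difference is Array.isArray(), which checks for an actual array object and not for the prototype chain. A small sketch, reusing the MyArray constructor from above:

```javascript
function MyArray(len) {
    Array.call(this, len);
}
MyArray.prototype = Object.create(Array.prototype);

const a = new MyArray();

// The prototype chain is set up as intended...
console.log(a instanceof Array);     // true
// ...but the instance is an ordinary object, not an exotic array.
console.log(Array.isArray(a));       // false
console.log(Array.isArray([]));      // true
```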

Initialization obstacle: MyArray can’t use Array for initialization

It is impossible to hand an existing object to Array via this – it completely ignores its this and always creates a new instance.
    > var a = [];
    > var b = Array.call(a, 3);
    > a === b   // b is a new object
    false
    > b.length
    3
    > a.length   // a is unchanged
    0
Thus, the Array.call(...) in the first line of MyArray does not work.

Solutions

Solution: __proto__

On JavaScript engines that support __proto__ [2], you can do the following:
    function MyArray(len) {
        var inst = new Array(len);
        inst.__proto__ = MyArray.prototype;
        return inst;
    }
    MyArray.prototype = Object.create(Array.prototype);    
This solution has two disadvantages. First, changing the prototype of an existing object is a relatively costly operation. Second, and more importantly, you can’t subclass MyArray in the normal manner, either: like Array, it ignores its this and always returns a new object.
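The following sketch shows both the solution at work and its limitation. It assumes an engine that supports __proto__; the method last and the constructor MySubArray are hypothetical additions for illustration.

```javascript
function MyArray(len) {
    var inst = new Array(len);
    inst.__proto__ = MyArray.prototype;
    return inst;
}
MyArray.prototype = Object.create(Array.prototype);
MyArray.prototype.last = function () {
    return this[this.length - 1];
};

var a = new MyArray(0);
a[2] = 'abc';
console.log(a.length);               // 3
console.log(a.last());               // abc
console.log(a instanceof MyArray);   // true

// Subclassing MyArray in the normal manner fails:
// MyArray.call() returns a new array, which is simply discarded here.
function MySubArray(len) {
    MyArray.call(this, len);
}
MySubArray.prototype = Object.create(MyArray.prototype);

var b = new MySubArray(0);
b[2] = 'abc';
console.log(b.length);               // 0
console.log(Array.isArray(b));       // false
```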

This is the only solution that works in current browsers (that support __proto__).

Non-solution: constructors make objects exotic

One could change the Array constructor so that it makes objects that are passed to it exotic. But then one faces difficulties: Some exotic objects have a special structure that you can’t add to an object after the fact. And it would allow one to add several exotic features to the same object (e.g. by first calling Array and then Date), which could lead to conflicts and other problems.

ECMAScript 6 solution: decouple allocation and initialization

Specification-wise, the new operator invokes the internal [[Construct]] method, which roughly looks like this:
    Function.prototype.[[Construct]] = function (...args) {
        let Constr = this;

        // Allocation
        let inst = Object.create(Constr.prototype);

        // Initialization
        let result = Constr.apply(inst, args);
        if (result !== null && typeof result === 'object') {
            return result;
        } else {
            return inst;
        }
    }
Array overrides this method to allocate an exotic object.

Eliminating the allocation obstacle. In a subclass of Array, we’d like to reuse method [[Construct]] of Array. In ECMAScript 5, we can’t, because the prototype of a constructor is always Function.prototype and never its super-constructor. That is, it doesn’t inherit [[Construct]] from its super-constructor. However, in ECMAScript 6, constructor inheritance is the default.
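The difference in constructor prototypes can be checked directly. The sketch below assumes an engine that implements ES6 classes; SuperConstr, SubConstr and MyArray are the illustrative names used throughout this post.

```javascript
// ECMAScript 5 pattern: only the instance prototypes are linked.
function SuperConstr() {}
function SubConstr() {}
SubConstr.prototype = Object.create(SuperConstr.prototype);

console.log(Object.getPrototypeOf(SubConstr) === Function.prototype); // true
console.log(Object.getPrototypeOf(SubConstr) === SuperConstr);        // false

// ECMAScript 6 class: the constructor itself inherits from its super-constructor.
class MyArray extends Array {}

console.log(Object.getPrototypeOf(MyArray) === Array);                // true
```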

Additionally, Wirfs-Brock proposes to handle allocation in a separate, publicly accessible method whose key is the well-known symbol @@create (that can be imported from some module). Array would only override that method and default [[Construct]] would look like this for all constructors:

    Function.prototype.[[Construct]] = function (...args) {
        let Constr = this;

        // Allocation
        let create = Constr[@@create] || Object[@@create];
        let inst = create();

        // Initialization
        let result = Constr.apply(inst, args);
        if (result !== null && typeof result === 'object') {
            return result;
        } else {
            return inst;
        }
    }
Many builtins would have custom @@create methods: Array, String, Boolean, Number, Date, RegExp, Map, Set, WeakMap, ArrayBuffer, ... These builtins have exotic instances and/or custom values for the internal property [[Class]] [3].
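The proposed split can be approximated in user code. In the sketch below, everything is an assumption made for illustration: the symbol create is a locally created stand-in for @@create (not a real well-known symbol), construct() is a hypothetical helper playing the role of [[Construct]], and the allocator is called with the constructor as this so that it can find the right prototype.

```javascript
// Stand-in for the proposed well-known symbol @@create (hypothetical).
const create = Symbol('create');

// Hypothetical helper that plays the role of the default [[Construct]].
function construct(Constr, ...args) {
    // Allocation: look up the allocator via the constructor's prototype chain
    const allocator = Constr[create];
    const inst = typeof allocator === 'function'
        ? allocator.call(Constr)
        : Object.create(Constr.prototype);

    // Initialization
    const result = Constr.apply(inst, args);
    if (result !== null && typeof result === 'object') {
        return result;
    }
    return inst;
}

// A super-constructor with a custom allocator.
function Base() {}
Base[create] = function () {
    // `this` is the constructor that is being instantiated
    const inst = Object.create(this.prototype);
    Object.defineProperty(inst, 'allocatedBy', { value: 'Base[create]' });
    return inst;
};

// A sub-constructor that inherits the allocator via constructor inheritance.
function Sub(x) {
    this.x = x;
}
Object.setPrototypeOf(Sub, Base);
Sub.prototype = Object.create(Base.prototype);

const s = construct(Sub, 42);
console.log(s instanceof Sub);   // true
console.log(s.allocatedBy);      // Base[create]
console.log(s.x);                // 42
```

Sub only initializes; the kind of object it receives is decided by the allocator it inherits from Base.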

Eliminating the initialization obstacle. Sub-constructors need to be able to use Array for initialization. Thus, Array needs to distinguish two different kinds of invocation:

  • Used as a function: create a new instance.
  • Used for initialization (via new or via a super-call): set up this.
This is trickier than it seems (and not something you should do for your own constructors). You might try the following:
    function Foo() {
        'use strict';
        if (this === undefined) {  // (*)
            // Invoked as a function
        } else {
            // Invoked for initialization
        }
    }
However, this fails if you put Foo into a namespace object:
    var namespace = {};
    namespace.Foo = Foo;
    namespace.Foo();   // not treated as a function call
In line (*), you additionally need to check whether this is an instance of Foo and treat every other value as a function call:
    if (this === undefined || !(this instanceof Foo)) ...
You could also trigger the “function” case for instances of Foo that have already been initialized.
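Put together, the check could look as follows. This is only a sketch of the idea described above; the parameter value and the choice to create a new instance in the “function” case are assumptions made for the example.

```javascript
function Foo(value) {
    'use strict';
    if (this === undefined || !(this instanceof Foo)) {
        // Invoked as a function: create a new instance
        return new Foo(value);
    }
    // Invoked for initialization (via new or from a sub-constructor)
    this.value = value;
}

var namespace = {};
namespace.Foo = Foo;

var viaNew = new Foo(1);
var viaCall = Foo(2);
var viaNamespace = namespace.Foo(3);   // this === namespace, not a Foo

console.log(viaNew.value, viaCall.value, viaNamespace.value);  // 1 2 3
console.log(viaNamespace instanceof Foo);                      // true
console.log('value' in namespace);                             // false
```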

This solution will probably be adopted by ECMAScript 6. Its complexity will be largely hidden: You can either use the canonical way of subclassing shown above or you can use a class definition [4]:

    class MyArray extends Array {
        ...
    }
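In an engine that implements ES6 classes, the earlier experiment can be repeated with a class definition. The method last is a hypothetical addition for illustration.

```javascript
class MyArray extends Array {
    // Hypothetical convenience method
    last() {
        return this[this.length - 1];
    }
}

const a = new MyArray();
a[2] = 'abc';
console.log(a.length);              // 3
console.log(a.last());              // abc
console.log(Array.isArray(a));      // true
console.log(a instanceof MyArray);  // true
```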

When new does not initialize

The following problem is independent of the “allocation versus initialization” problem discussed in the section “ECMAScript 6 solution: decouple allocation and initialization”: Some constructors, even when invoked directly via new, don’t initialize the instance that has been created; they throw it away. The following subsections describe when that happens.

Factory constructors

Use case. A factory constructor is “abstract”: it examines its arguments and, depending on their values, invokes one of its sub-constructors. Many class-based languages use static factory methods for this purpose.

Solution. You need to distinguish whether you are called directly via new or via a sub-constructor. It is conceivable to add language support for this. An alternative is to have a parameter calledFromSubConstructor whose default value is false. Sub-constructors set it to true. If it is true, you initialize. Otherwise, you return the result of a sub-constructor.
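A minimal sketch of the parameter-based alternative follows. Shape, Circle and Square are hypothetical names; only the parameter calledFromSubConstructor comes from the text above.

```javascript
// "Abstract" factory constructor
function Shape(kind, calledFromSubConstructor = false) {
    if (!calledFromSubConstructor) {
        // Called directly via new: delegate to a sub-constructor.
        // Returning an object makes new discard the allocated instance.
        switch (kind) {
            case 'circle': return new Circle();
            case 'square': return new Square();
            default: throw new Error('Unknown kind: ' + kind);
        }
    }
    // Called from a sub-constructor: initialize
    this.kind = kind;
}

function Circle() {
    Shape.call(this, 'circle', true);
}
Circle.prototype = Object.create(Shape.prototype);

function Square() {
    Shape.call(this, 'square', true);
}
Square.prototype = Object.create(Shape.prototype);

const c = new Shape('circle');
console.log(c instanceof Circle);  // true
console.log(c instanceof Shape);   // true
console.log(c.kind);               // circle
```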

Cached instances

This use case is very similar to factory constructors. For example, a constructor might put every instance it creates in a cache. If it is told to create an instance that is similar to one in the cache, it returns the cached one instead. The solution is the same as for factory constructors. However, if you subclass, more work is probably needed, especially if the subclass has a different notion of similarity.
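As a sketch of the non-subclassed case, the following hypothetical constructor Color caches instances by name; “similar” here simply means “same name”, which is an assumption made for the example.

```javascript
const cache = new Map();

function Color(name) {
    const cached = cache.get(name);
    if (cached !== undefined) {
        // Returning an object makes new throw away the fresh instance
        return cached;
    }
    this.name = name;
    cache.set(name, this);
}

const red1 = new Color('red');
const red2 = new Color('red');
const blue = new Color('blue');

console.log(red1 === red2);   // true
console.log(red1 === blue);   // false
console.log(cache.size);      // 2
```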

Returning an argument

Some constructors return their argument if it fulfills certain criteria. For example, Object:
    > var obj = {};
    > new Object(obj) === obj
    true
Again, this can be solved in the manner described in the section “Factory constructors”.

A few observations

The constructor as a method

In ECMAScript 6, the role of the constructor as a method that initializes an object increases. Take, for example, the following line from the [[Construct]] method:
    let result = Constr.apply(inst, args);
It is equivalent to:
    let result = inst.constructor(...args);
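For ordinary function constructors, the equivalence can be tried out today, provided that the property constructor of the prototype still points to the constructor [1]. Point is a hypothetical example constructor.

```javascript
function Point(x, y) {
    this.x = x;
    this.y = y;
}

// Variant 1: initialize via the constructor function
const p1 = Object.create(Point.prototype);
Point.apply(p1, [1, 2]);

// Variant 2: initialize via the inherited method `constructor`
const p2 = Object.create(Point.prototype);
p2.constructor(3, 4);

console.log(p1.x, p1.y);   // 1 2
console.log(p2.x, p2.y);   // 3 4
console.log(p2.constructor === Point);  // true
```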
Furthermore, sub-constructors call super-constructors to help them with initialization. For example:
    function SubConstr(arg1, arg2) {
        super.constructor(arg1);  // or: super(arg1)
    }
Lastly, even though a class is internally translated to a constructor, the actual constructor body is inside a method:
    class MyClass {
        constructor(...) {
            ...
        }
        ...
    }

JavaScript becomes less prototypal

With constructors handling allocation via @@create, you can no longer use the prototype to create an instance:
    let inst = Object.create(MyConstr.prototype);
    inst.constructor(arg1, arg2);
Instead, you have to do the following.
    let inst = MyConstr[@@create]();
    inst.constructor(arg1, arg2);

A method for post-initialization?

One could introduce @@postConstruct, a method that is invoked after all constructors have been executed. It is the inverse of @@create. Use case for this method: freeze an instance or make it non-extensible.
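To make the use case more tangible, here is a purely hypothetical sketch. Neither the symbol postConstruct nor the helper construct() exists in the language; they only emulate what such a hook might do. The point is that Point cannot freeze this in its own constructor, because the sub-constructor ColorPoint still has to add a property afterwards.

```javascript
// Stand-in for the suggested @@postConstruct (hypothetical).
const postConstruct = Symbol('postConstruct');

// Hypothetical helper that emulates new plus the post-initialization hook.
function construct(Constr, ...args) {
    const inst = Object.create(Constr.prototype);
    const result = Constr.apply(inst, args);
    const obj = (result !== null && typeof result === 'object') ? result : inst;
    // Runs after all constructors have been executed
    if (typeof obj[postConstruct] === 'function') {
        obj[postConstruct]();
    }
    return obj;
}

function Point(x, y) {
    this.x = x;
    this.y = y;
}
Point.prototype[postConstruct] = function () {
    Object.freeze(this);
};

function ColorPoint(x, y, color) {
    Point.call(this, x, y);
    this.color = color;   // must still be possible after Point has run
}
ColorPoint.prototype = Object.create(Point.prototype);

const cp = construct(ColorPoint, 1, 2, 'red');
console.log(cp.color);              // red
console.log(Object.isFrozen(cp));   // true
```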

How about ECMAScript 5?

There are a few tricks you can use to subclass builtins in ECMAScript 5 [5].

Acknowledgement

Thanks to Allen Wirfs-Brock for answering my questions about his slides.

References

  1. What’s up with the “constructor” property in JavaScript?
  2. JavaScript: __proto__
  3. Categorizing values in JavaScript
  4. ECMAScript 6: classes
  5. Subclassing builtins in ECMAScript 5