2010-12-28

An easy way to understand JavaScript’s prototypal inheritance

Update 2011-06-25: “Prototypes as classes” is an improved version of this blog post.

This blog post explains JavaScript’s prototypal inheritance in a simple way. As it turns out, prototypal inheritance is easy to understand if we initially leave out constructors. Thus, we first look at the fictional programming language ProtoScript, which is JavaScript minus constructors, explain it, and then move on to constructors. By the end, you should have a solid understanding of prototypal inheritance and should no longer be confused by all the JavaScript tricks out there.

Inheritance in the fictional ProtoScript

When looking for data, the search starts at this and traverses a chain of prototype objects, where each object is a prototype for its predecessor. If something cannot be found in an object, the search continues in the prototype.
Let us assume for a moment that we program in a language called ProtoScript that is exactly like JavaScript, but without constructors. That leads to a simpler version of JavaScript’s prototypal inheritance that is easy to understand. Without constructors, there are two ways to create a new object in ProtoScript: one either constructs it directly (e.g. via an object literal) or copies another object. JavaScript (as of ECMAScript 5) does not have a built-in way of copying an object, but for ProtoScript and all completely prototype-based programming languages, this is an essential operation, as it replaces constructors.

As in JavaScript, ProtoScript objects contain named slots that hold either functions or values. The former kind of slot is called a method, the latter a property. The expression obj.prop looks up the value of slot prop in object obj. This lookup starts by binding this to obj and then looks for the slot in obj. If it cannot be found there and obj has no special property called __proto__, the lookup fails (JavaScript signals this by returning undefined rather than by raising an error). If there is a __proto__ property, the search continues in the object pointed to by it. That object can again contain a __proto__ property, so we always get a prototype chain of one or more objects. While going through that chain, this stays bound to obj. The value of this is important for methods, because it allows them to look up other slot values. It is an implicit parameter.

The method invocation obj.method() works similarly to reading a property: ProtoScript looks for the value of method and then tries to execute it. As an aside, the expression without parentheses, obj.method, would return the function without invoking it. Thus, the parentheses can be seen as an operator that invokes a function.
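
The lookup rules described above can be tried out in plain JavaScript. The following is a minimal sketch; the objects proto and obj and their slots are hypothetical names chosen for illustration, and it assumes an engine that accepts __proto__ inside an object literal.

```javascript
// Hypothetical objects, used only to illustrate the lookup rules.
var proto = {
    greeting: "Hello",
    greet: function () {
        // "this" is the object in which the lookup started, not proto
        return this.greeting + " " + this.name;
    }
};
var obj = {
    __proto__: proto, // failed lookups in obj continue in proto
    name: "Jane"
};

var viaChain = obj.greeting;    // not in obj, found in proto
var result = obj.greet();       // method found in proto, this is obj
var fn = obj.greet;             // no parentheses: the function itself
var missing = obj.doesNotExist; // found nowhere in the chain: undefined
```

Note that greet() reads this.name even though the method lives in proto: the slot is found because this is still bound to obj.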

In class-based languages, whenever we have a set of similar objects, say persons with a name that can be converted to a text string, we create a class for that set. In ProtoScript, which does not have classes, we create an initial person, e.g.

    var person0 = {
        name: null,
        describe: function() {
            return "Person "+this.name;
        }
    };
All other persons are copies of person0. The only drawback of this approach is that the method describe() exists in each instance. We can avoid this kind of duplication by moving the method into a separate prototype object, ProtoPerson, that person0 points to via __proto__. person0.describe() will then execute the method in the prototype. Inside it, the expression this.name will find the correct property in person0, because this is still bound to person0. Naturally, for the above-mentioned sharing to occur, copying person0 must not copy the prototype; the copy must be shallow.
The object person1 is a (modified) copy of person0. The object ProtoPerson is their prototype and shared by both.
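
As a rough sketch of this arrangement: ProtoPerson holds the shared method, person0 holds only the data, and person1 is a shallow copy. ProtoScript’s copy operation is fictional, so the helper copy() below is a hypothetical stand-in built from ECMAScript 5 functions; it is an assumption for illustration, not something the language provides.

```javascript
var ProtoPerson = {
    describe: function () {
        return "Person " + this.name;
    }
};
var person0 = {
    __proto__: ProtoPerson,
    name: null
};

// Hypothetical helper: a shallow copy that shares the prototype
// of the original instead of duplicating it.
function copy(original) {
    var clone = Object.create(Object.getPrototypeOf(original));
    Object.keys(original).forEach(function (key) {
        clone[key] = original[key]; // own slots only
    });
    return clone;
}

var person1 = copy(person0);
person1.name = "John";
var description = person1.describe(); // method comes from ProtoPerson
```

Only name is copied; describe() exists once, in ProtoPerson, and is reached through the prototype chain of both person0 and person1.
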
The prototype is very similar to a class. It can also hold properties (such as a count of all persons) that are shared by all of its “instances”. Subclassing allows class-based languages to build on (to “extend”) persons when implementing a new class of objects that is similar. For example, a worker is a person that also has a (job) title. A worker’s describe() method mentions the title in addition to the name. The following code shows how ProtoScript extends persons to implement workers.
    var ProtoWorker = {
        __proto__: ProtoPerson,
        describe: function() {
            return this.title+" "+this.name;
        }
    };
    var worker0 = {
        __proto__: ProtoWorker,
        name: null,
        title: null
    };
The prototype of worker0 extends the prototype of person0. As invoked via worker0, the method describe() of ProtoWorker overrides the same method of ProtoPerson.
We use the __proto__ property to let ProtoWorker modify the behavior of ProtoPerson. ProtoWorker changes (“overrides”) ProtoPerson’s describe() method. Whenever one searches for that method and ProtoWorker is in the prototype chain (e.g. starting in worker0), its version comes first and hides the original version. As before, new workers are created by copying worker0. This solution is still not as elegant as it could be, because we have to repeat the property name; there is no way for worker0 to extend person0. Self [1], a language that is more thoroughly prototype-based than JavaScript, avoids this repetition by allowing worker0 to point to person0 via a property similar to __proto__ (the only additional feature is that whenever the extender is copied, the extendee is copied, too; the extendee is not shared). JavaScript has a different solution that involves constructors.
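
To make the hiding visible, one can walk the prototype chain by hand and report which object supplies a given slot. This is only a sketch: findOwner() and worker1 are hypothetical names introduced here, and the prototypes are repeated so that the example is self-contained.

```javascript
var ProtoPerson = {
    describe: function () {
        return "Person " + this.name;
    }
};
var ProtoWorker = {
    __proto__: ProtoPerson,
    describe: function () {
        return this.title + " " + this.name;
    }
};
var worker1 = {
    __proto__: ProtoWorker,
    name: "Jane",
    title: "Manager"
};

// Hypothetical helper: returns the first object in the chain
// that has its own slot with the given name, or null.
function findOwner(obj, slot) {
    for (var o = obj; o !== null; o = Object.getPrototypeOf(o)) {
        if (Object.prototype.hasOwnProperty.call(o, slot)) {
            return o;
        }
    }
    return null;
}

var describeOwner = findOwner(worker1, "describe"); // ProtoWorker
var nameOwner = findOwner(worker1, "name");         // worker1 itself
var described = worker1.describe();                 // uses ProtoWorker's version
```

The search stops at ProtoWorker, so the describe() in ProtoPerson is never reached when starting from a worker.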

To summarize: Like JavaScript, ProtoScript does not have classes, only objects. But the special property __proto__ and a clever way of sharing some objects while copying others allow us to do everything that class-based languages can. As in other prototype-based languages, we have the pleasure of always manipulating objects directly, and fewer abstractions such as classes or instantiation are involved. This arguably makes prototype-based languages more intuitive for beginners. The next section explores JavaScript’s constructors.

Back to JavaScript

The following code shows how persons are implemented in JavaScript, using constructors.
    function Person(name) {
        this.name = name;
    }
    Person.prototype.describe = function() {
        return "Person "+this.name;
    };
    var person = new Person("John");
The function Person is a constructor for persons. We need to invoke it via new Person(). The operator new creates an object with a property __proto__ that points to Person.prototype and hands it to Person() as the implicit parameter this. We have thus arrived at the same result as with ProtoScript, but getting there was a bit more roundabout. The constructor Person combines the ProtoScript operations of defining the initial object and copying it. Note that the property __proto__ is non-standard as of ECMAScript 5; some browsers do not support it. But we need it to implement workers as an extension of persons, in the code below.
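
The work done by new can be approximated with an ordinary function. The helper construct() below is a hypothetical name and a simplified sketch that assumes ECMAScript 5’s Object.create() is available; it is not how engines actually implement the operator.

```javascript
function Person(name) {
    this.name = name;
}
Person.prototype.describe = function () {
    return "Person " + this.name;
};

// Hypothetical helper approximating "new Constr(arg1, arg2, ...)"
function construct(Constr, args) {
    // Step 1: create an object whose prototype is Constr.prototype
    var obj = Object.create(Constr.prototype);
    // Step 2: run the constructor with the new object as this
    var result = Constr.apply(obj, args);
    // Step 3: a constructor may return a different object explicitly
    var isObject = (typeof result === "object" && result !== null) ||
        typeof result === "function";
    return isObject ? result : obj;
}

var viaNew = new Person("John");
var viaHelper = construct(Person, ["John"]);
```

Both objects end up with the own property name and with Person.prototype as their prototype, which is the same shape that person1 and ProtoPerson had in ProtoScript.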
    function Worker(name, title) {
        Person.call(this, name);
        this.title = title;
    }
    Worker.prototype.__proto__ = Person.prototype;
    Worker.prototype.describe = function() {
        return this.title+" "+this.name;
    };
We invoke Person without new, via call(), to set up the inherited properties (which we weren’t able to do in ProtoScript). If you want to invoke a method that you have overridden, you have to go through the prototype of the “super-constructor”, because earlier properties in a prototype chain hide later ones with the same name. Only the code of the method “changes”; the value of this stays the same, as the following code shows.
    Constr.prototype.mymethod = function(x, y, z) {
        SuperConstr.prototype.mymethod.apply(this, arguments);
    }
apply() is similar to call(). Its first argument is also the value for this. But where call() takes one additional argument for each parameter of the method, apply() takes a single second argument: an array (or array-like object) holding the values for the parameters. Thus, you can pass the array-like variable arguments.
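
Putting the pieces together, a worker’s describe() could build on the overridden version instead of replacing it entirely. This is a sketch under stated assumptions: the output format is made up for illustration, and the prototype chain is set up with Object.create() (ECMAScript 5) in place of the assignment to __proto__ used above.

```javascript
function Person(name) {
    this.name = name;
}
Person.prototype.describe = function () {
    return "Person " + this.name;
};

function Worker(name, title) {
    Person.call(this, name); // run Person on the same this, without new
    this.title = title;
}
// Swapped-in technique: link the prototypes via Object.create()
Worker.prototype = Object.create(Person.prototype);
Worker.prototype.constructor = Worker;
Worker.prototype.describe = function () {
    // Invoke the hidden method; this stays the worker
    var base = Person.prototype.describe.apply(this, arguments);
    return base + " (" + this.title + ")";
};

var worker = new Worker("Jane", "Manager");
var text = worker.describe();

// call() lists arguments one by one, apply() takes them as an array
var viaCall = Person.prototype.describe.call(worker);
var viaApply = Person.prototype.describe.apply(worker, []);
```

Because describe() takes no parameters, call() and apply() are interchangeable here; apply() with arguments pays off when the overriding method should forward whatever it was given.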

ECMAScript 5

In ECMAScript 5, Object.create() lets you create a new object and assign a prototype at the same time. That is, you get access to __proto__ in a standard way and can create prototype chains. A function for creating an object is not a constructor any more (no operator new!), but a factory function. As a result, prototype objects can finally stand on their own instead of existing as Constructor.prototype.
    var ProtoPerson = {
        describe: function() {
            return "Person "+this.name;
        }
    };
    function createPerson(name) { // factory function
        return Object.create(ProtoPerson, {
            name: {
                value: name
            }
        });
    }
    var person = createPerson("John"); // no "new"
If you wanted to make sure that createPerson() is never invoked via new, you could check that this is undefined (which only holds for plain function calls in strict mode). If you need to extend persons, like we did with workers, the property definitions inside createPerson() have to become an externally accessible entity. A hypothetical createWorker() would then use property definitions which have been (non-destructively!) derived from that entity.
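
One possible shape for these two ideas is sketched below. The names personProps(), ProtoWorker and createWorker() are assumptions made for illustration, and returning a fresh descriptor object on every call is just one way of keeping the derivation non-destructive.

```javascript
var ProtoPerson = {
    describe: function () {
        return "Person " + this.name;
    }
};

// Hypothetical: the property definitions as an accessible entity.
// Each call returns a fresh object, so extending it is harmless.
function personProps(name) {
    return { name: { value: name, enumerable: true } };
}

function createPerson(name) {
    "use strict"; // plain calls then have this === undefined
    if (this !== undefined) {
        throw new TypeError("createPerson() is a factory, do not use new");
    }
    return Object.create(ProtoPerson, personProps(name));
}

var ProtoWorker = Object.create(ProtoPerson, {
    describe: {
        value: function () {
            return this.title + " " + this.name;
        }
    }
});

function createWorker(name, title) {
    var props = personProps(name); // derived copy, original untouched
    props.title = { value: title, enumerable: true };
    return Object.create(ProtoWorker, props);
}

var person = createPerson("John");
var worker = createWorker("Jane", "Manager");
```

Properties defined through Object.create() descriptors are read-only unless writable is set to true, so a real implementation would have to decide which attributes it wants.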

Further reading

  1. “Organizing Programs Without Classes”. A great introduction to the prototype-based programming language Self and how it compares to class-based languages. This greatly helped me understand how prototypal inheritance works.
  2. “Lightweight JavaScript inheritance APIs” presents APIs that help with inheritance. The post focuses on APIs that change JavaScript as little as possible.
  3. “Going completely prototypal in JavaScript” shows what an API for prototypal inheritance looks like.
