2012-01-03

The pitfalls of using objects as maps in JavaScript

JavaScript is Spartan when it comes to built-in data structures. One commonly uses objects as maps from strings to values. This post points out three pitfalls when doing so.

Pitfall 1: inheritance and reading properties

When it comes to reading properties, there are two kinds of operations in JavaScript: those that consider the whole prototype chain of an object and thus “see” inherited properties, and those that access only the own (direct) properties of an object and thus ignore inherited properties. As a running example, consider the object stored in the variable obj:
    var objProto = { superProp: "abc" };
    var obj = Object.create(objProto);
obj is an object with no own properties whose prototype is objProto, an object with the property superProp. obj should be interpreted as an empty map. Let’s see which operations fail to do so.

Checking whether a property exists. The in operator checks whether an object has a property with a given name, but it considers inherited properties:

    > "foo" in obj
    false // ok
    > "toString" in obj
    true  // not ok, inherited from Object.prototype
    > "superProp" in obj
    true  // not ok, inherited from objProto
Given that obj should be considered empty, we need a check that ignores inherited properties. The method hasOwnProperty() provides one:
    > obj.hasOwnProperty("toString")
    false
Collecting property names. How do we find out all of the keys in obj interpreted as a map? for-in looks promising:
    > for (var propName in obj) console.log(propName)
    superProp
Alas, it considers inherited enumerable properties. The reason that no properties of Object.prototype show up here is that all of them are non-enumerable:
    > Object.prototype.propertyIsEnumerable("toString")
    false
In contrast, Object.keys() lists only own properties:
    > Object.keys(obj)
    []
This method returns only enumerable properties. If you want to list all own properties, including non-enumerable ones, you need to use Object.getOwnPropertyNames(). You can observe the difference by applying both methods to Object.prototype:
    > Object.keys(Object.prototype)
    []
    > Object.getOwnPropertyNames(Object.prototype)
    [ 'toString', 'hasOwnProperty', 'valueOf', ... ]
A property foo that is created by assigning to obj["foo"] or obj.foo is enumerable by default.
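
As a small sketch of the difference between the two ways of collecting keys, the following compares for-in with Object.keys() on an object that has one own and one inherited property. The names proto, map, seenByForIn and ownKeys are made up for this illustration:

```javascript
var proto = { superProp: "abc" };
var map = Object.create(proto);
map.foo = 123; // own property, created by assignment

// for-in visits own and inherited enumerable properties
var seenByForIn = [];
for (var propName in map) {
    seenByForIn.push(propName);
}

// Object.keys() returns only own enumerable properties
var ownKeys = Object.keys(map);

console.log(seenByForIn); // contains "foo" and "superProp"
console.log(ownKeys); // contains only "foo"
console.log(map.propertyIsEnumerable("foo")); // true
```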

Getting a property value. The normal way of reading a property considers inherited properties, too:

    > obj["toString"]
    [Function: toString]
There is no built-in way in JavaScript to only read own properties, but you can easily implement one yourself:
    function getOwnProperty(obj, propName) {
        if (obj.hasOwnProperty(propName)) {
            return obj[propName];
        } else {
            return undefined;
        }
    }
With that function, the inherited property toString is ignored:
    > getOwnProperty(obj, "toString")
    undefined

Pitfall 2: overriding and invoking methods

The function getOwnProperty() invokes the method hasOwnProperty() on obj. Normally, that is fine:
    > getOwnProperty({ foo: 123 }, "foo")
    123
However, if one adds a property to obj whose name is "hasOwnProperty" then that property overrides the method Object.prototype.hasOwnProperty() and getOwnProperty() ceases to work:
    > getOwnProperty({ hasOwnProperty: 123 }, "foo")
    TypeError: Property 'hasOwnProperty' of object #<Object> is not a function
This problem can be fixed by referring to Object.prototype.hasOwnProperty() directly and invoking it via call(). That avoids going through obj to find the method:
    function getOwnProperty(obj, propName) {
        if (Object.prototype.hasOwnProperty.call(obj, propName)) {
            return obj[propName];
        } else {
            return undefined;
        }
    }
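A quick check of this version, with the function repeated so that the snippet is self-contained. The variable map is only an illustrative name; it holds a map that itself contains the key "hasOwnProperty":

```javascript
function getOwnProperty(obj, propName) {
    // Borrow the method from Object.prototype instead of looking it up on obj
    if (Object.prototype.hasOwnProperty.call(obj, propName)) {
        return obj[propName];
    } else {
        return undefined;
    }
}

var map = { hasOwnProperty: 123 };
console.log(getOwnProperty(map, "foo")); // undefined, no exception
console.log(getOwnProperty(map, "hasOwnProperty")); // 123
console.log(getOwnProperty(map, "toString")); // undefined, inherited
```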

Pitfall 3: the special property __proto__

On many JavaScript engines, the property __proto__ has special meaning: Setting it sets an object’s prototype. As we avoid inheritance, having the wrong prototype is less of an issue, but setting it is still a performance issue and will throw an error for non-object values. Hence, whenever arbitrary keys are involved, you need to escape the key "__proto__". Note that you also need to escape the escaped version of "__proto__" (etc.) to avoid clashes. That is, if you escape "__proto__" as "%__proto__" then you also need to escape "%__proto__" so that it doesn’t override a "__proto__" entry when it is used.
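
One possible escaping scheme, sketched under the assumption that "%" is used as the escape character (as in the example above): every key that consists of zero or more "%" followed by "__proto__" gets one more "%" prepended. All helper names (escapeKey, unescapeKey, mapSet, mapGet, mapKeys) are hypothetical and not taken from any library:

```javascript
// "__proto__" -> "%__proto__", "%__proto__" -> "%%__proto__", etc.
// All other keys are stored unchanged.
function escapeKey(key) {
    return /^%*__proto__$/.test(key) ? "%" + key : key;
}
// Inverse of escapeKey(): drop one leading "%" from escaped keys
function unescapeKey(key) {
    return /^%+__proto__$/.test(key) ? key.slice(1) : key;
}

var hasOwn = Object.prototype.hasOwnProperty;

function mapSet(map, key, value) {
    map[escapeKey(key)] = value;
}
function mapGet(map, key) {
    var escaped = escapeKey(key);
    // Only own properties count as map entries
    return hasOwn.call(map, escaped) ? map[escaped] : undefined;
}
function mapKeys(map) {
    return Object.keys(map).map(unescapeKey);
}

var map = {};
mapSet(map, "__proto__", 1);
mapSet(map, "%__proto__", 2);
mapSet(map, "foo", 3);

console.log(mapGet(map, "__proto__")); // 1
console.log(mapGet(map, "%__proto__")); // 2
console.log(mapKeys(map)); // the three keys as originally passed in
console.log(Object.getPrototypeOf(map) === Object.prototype); // true
```

Because the escaped form of a key is never "__proto__" itself, the special property is never touched, and because escaping is reversible, the original keys can be recovered when listing them.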

Mark S. Miller mentions real-world implications of this pitfall, in the email “Why we need to clean up __proto__” (which inspired this post):

Think this exercise is academic and doesn't arise in real systems? As observed at a support thread, until recently, on all non-IE browsers, if you typed "__proto__" at the beginning of a new Google Doc, your Google Doc would hang. This was tracked down to such a buggy use of an object as a string map. (To avoid such problems, Caja is shifting to using StringMap.js, which does seem safe on all platforms.)

Best practices

There are many applications for using objects as maps. If all property names are known statically (at development time) then you only need to make sure that you ignore inheritance and only look at own properties [1]. If arbitrary names can be used, you should turn to a library to avoid the pitfalls mentioned in this post. Two examples:

Related reading

  1. JavaScript properties: inheritance and enumerability
  2. Iterating over arrays and objects in JavaScript
