2011-11-08

Improving the JavaScript typeof operator

The typeof operator in JavaScript is partially broken. This blog post explains how to fix it and how to extend its use to objects.

typeof versus instanceof

typeof. Use typeof to find out whether a given value is an object or a primitive and, in the latter case, what type of primitive it is. It produces the following results:
    > typeof undefined
    'undefined'
    > typeof null // well-known bug
    'object'
    > typeof true
    'boolean'
    > typeof 123
    'number'
    > typeof "abc"
    'string'
    > typeof function() {}
    'function'
    > typeof {}
    'object'
    > typeof []
    'object'
    > typeof unknownVariable
    'undefined'
typeof null returning "object" is clearly a bug. A proposal to fix this in ECMAScript.next, so that the return value would be "null", was later rejected; typeof null remains "object".

instanceof. Use instanceof to determine whether an object is an instance of a given type. instanceof always returns false for primitive values.

    > {} instanceof Object
    true
    
    > (function() {}) instanceof Function
    true
    
    > true instanceof Object
    false
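
Because typeof null is "object", a typeof check alone cannot tell whether a value is really an object. The following is a minimal sketch of a combined check; the helper names isObject and isPrimitive are made up for this example and are not part of the language:

```javascript
// Hypothetical helper: true for objects and functions, false for primitives.
function isObject(value) {
    var t = typeof value;
    // Exclude null explicitly, because typeof null is "object".
    return (t === "object" && value !== null) || t === "function";
}

// Hypothetical helper: every value that is not an object is a primitive.
function isPrimitive(value) {
    return !isObject(value);
}

isObject({});                   // true
isObject([]);                   // true
isObject(function () {});       // true
isObject(null);                 // false
isPrimitive("abc");             // true
isPrimitive(new String("abc")); // false
```

Unlike instanceof Object, this check does not depend on the prototype chain of the value.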

Beyond primitives

One could also argue that for an object, typeof should return the name of the object’s type. Let us look at two possible solutions.

Getting the type name: the [[Class]] property

The ECMAScript 5 specification defines the internal [[Class]] property as follows (8.6.2 “Object Internal Properties and Methods”, Table 8):
A String value indicating a specification defined classification of objects.
There is no way to access this value directly from JavaScript, but Object.prototype.toString() uses it:
    > {}.toString()
    '[object Object]'
Object, the second word in the square brackets, comes from [[Class]]. The first word is always object. While toString() is overridden in subclasses of Object, you can apply Object.prototype.toString() generically [2] to get the original functionality back:
    > [1,2,3].toString()
    '1,2,3'
    
    > Object.prototype.toString.call([1,2,3])
    '[object Array]'
    
    > /xyz/.toString()
    '/xyz/'
    
    > Object.prototype.toString.call(/xyz/)
    '[object RegExp]'
The article “Fixing the JavaScript typeof operator” (by Angus Croll) makes use of this fact to define a toType() function that uses a regular expression to extract the value of [[Class]] from between the square brackets.
    var toType = function(obj) {
      return ({}).toString.call(obj).match(/\s([a-zA-Z]+)/)[1].toLowerCase()
    }
Usage examples:
    toType({a: 4}) // "object"
    toType([1, 2, 3]) // "array"
    (function() { return toType(arguments) }()) // "arguments"
    toType(new ReferenceError()) // "error"
    toType(new Date()) // "date"
    toType(/a-z/) // "regexp"
    toType(Math) // "math"
    toType(JSON) // "json"
    toType(new Number(4)) // "number"
    toType(new String("abc")) // "string"
    toType(new Boolean(true)) // "boolean"
Problems with this approach:
  • Conflates primitives and objects: It makes sense to use capitalization to distinguish between primitives and objects. For example:
        > typeof "abc"
        'string'
        > new String("abc") instanceof String
        true
    
    The first string is a primitive, the second is an object. The former starts with a lowercase letter, the latter starts with an uppercase letter.
  • The type of new ReferenceError() should be ReferenceError, not Error. The value of [[Class]] for all error instances is always "Error":
        > {}.toString.call(new ReferenceError())
        '[object Error]'
        
        > {}.toString.call(new TypeError())
        '[object Error]'
        
        > {}.toString.call(new Error())
        '[object Error]'
    
  • How it handles arguments and special global objects is (arguably) not optimal. Is Math really an instance of Math? I would argue that these global objects are all instances of Object. To check for them, you would compare via ===:
        if (someValue === Math) { ... }
    
  • It does not really work for primitives. They don’t have a [[Class]] property. If you generically invoke toString() on them, they are converted to objects first and thus report the [[Class]] of their object wrapper types [3]:
        > Object.prototype.toString.call(123)
        '[object Number]'
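
The last point can be checked directly: a primitive and its wrapper object yield the same string, so a function based on [[Class]] cannot tell them apart, whereas typeof can. A small sketch, in which the helper name classOf is an assumption made for this example:

```javascript
// Hypothetical helper: extracts the name between "[object " and "]".
function classOf(value) {
    return Object.prototype.toString.call(value).slice(8, -1);
}

var primitive = 123;
var wrapper = new Number(123);

classOf(primitive); // "Number"
classOf(wrapper);   // "Number"

typeof primitive;   // "number"
typeof wrapper;     // "object"
```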
    

Getting the type name: a hybrid approach

Let us write an alternative “improved typeof” called getTypeName(). Approach:
  • Use typeof for primitive values other than null.
  • Return "null" for null.
  • Return obj.constructor.name for any object obj (including functions!).
This is what the implementation looks like (IE fix via Adam Tybor):
    function getTypeName(value) {
        if (value === null) {
            return "null";
        }
        var t = typeof value;
        switch(t) {
            case "function":
            case "object":
                if (value.constructor) {
                    if (value.constructor.name) {
                        return value.constructor.name;
                    } else {
                        // Internet Explorer
                        // Anonymous functions are stringified as follows: 'function () {}'
                        // => the regex below does not match
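                        // Capture only the name, so that multi-line sources and bodies containing parentheses still match
                        var match = value.constructor.toString().match(/^function\s+([^\s(]+)\s*\(/);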
                        if (match) {
                            return match[1];
                        }
                    }
                }
                // fallback, for nameless constructors etc.
                return Object.prototype.toString.call(value).match(/^\[object (.+)\]$/)[1];
            default:
                return t;
        }
    }
Applied to the arguments that we previously used for toType():
    getTypeName({a: 4}) // "Object"
    getTypeName([1, 2, 3]) // "Array"
    (function() { return getTypeName(arguments) }()) // "Object"
    getTypeName(new ReferenceError()) // "ReferenceError"
    getTypeName(new Date()) // "Date"
    getTypeName(/a-z/) // "RegExp"
    getTypeName(Math) // "Object"
    getTypeName(JSON) // "Object"
    getTypeName(new Number(4)) // "Number"
    getTypeName(new String("abc")) // "String"
    getTypeName(new Boolean(true)) // "Boolean"
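
The calls above only cover built-in objects. As a rough sketch, the same approach can be tried on primitives, on a user-defined constructor, and on an object without a prototype. To keep the snippet runnable on its own, it uses typeNameOf, a hypothetical condensed stand-in for getTypeName() that omits the Internet Explorer fallback; Point is likewise an invented example type:

```javascript
// Condensed stand-in for getTypeName() (hypothetical name, no IE fallback).
function typeNameOf(value) {
    if (value === null) {
        return "null";
    }
    var t = typeof value;
    if (t !== "object" && t !== "function") {
        return t; // primitive: use the result of typeof as is
    }
    if (value.constructor && value.constructor.name) {
        return value.constructor.name;
    }
    // Fallback, e.g. for objects without a constructor property
    return Object.prototype.toString.call(value).slice(8, -1);
}

// Hypothetical user-defined type
function Point(x, y) {
    this.x = x;
    this.y = y;
}

typeNameOf(undefined);           // "undefined"
typeNameOf(null);                // "null"
typeNameOf("abc");               // "string"
typeNameOf(new Point(1, 2));     // "Point"
typeNameOf(function () {});      // "Function"
typeNameOf(Object.create(null)); // "Object"
```

Primitives keep their lowercase typeof names, while objects report the capitalized name of their constructor, so the two kinds of values stay distinguishable.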
There is one caveat: You still need to use typeof for unknown variables.
    > getTypeName(unknownVariable)
    ReferenceError: unknownVariable is not defined
    > typeof unknownVariable
    'undefined'
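Lastly, if the constructor is anonymous, the best either function can do is tell us that an instance is an object. (Engines that implement ECMAScript 6 function name inference assign the name Foo to the function below, in which case getTypeName() returns 'Foo' instead.)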
    > var Foo = function () {};
    > var foo = new Foo();
    > toType(foo)
    'object'
    > getTypeName(foo)
    'Object'

References

  1. JavaScript values: not everything is an object
  2. JavaScript performance: Array.prototype versus []
  3. JavaScript values: not everything is an object
