Integers and shift operators in JavaScript

[2012-02-20] bitwise_ops, numbers, dev, javascript, jsint, jslang

JavaScript only has floating point numbers. This post explains how integer operations are handled, specifically the bit shift operations. It will also answer the question whether n >>> 0 is a good way of converting a number to a non-negative integer.

Preparations

As we’ll be looking at binary numbers frequently, we define the following methods to help us:

    String.prototype.bin = function () {
        return parseInt(this, 2);
    };
    Number.prototype.bin = function () {
        var sign = (this < 0 ? "-" : "");
        var result = Math.abs(this).toString(2);
        while(result.length < 32) {
            result = "0" + result;
        }
        return sign + result;
    }

The methods in use:

    > "110".bin()
    6
    > 6..bin()
    '00000000000000000000000000000110'

Integers in JavaScript

All integer operations (such as any kind of bitwise manipulation) follow the same pattern: Convert the operands to floating point numbers, then to integers; perform the computation; convert the integer result back to a floating point number. There are four integer types used internally:

Integer: a value in the range [−2⁵³, +2⁵³]. Used for: most integer arguments (indices, dates, etc.). Higher and lower integers can be represented, but only the integers in the interval are contiguous [1].
Uint16: 16 bit unsigned integers in the range [0, 2¹⁶−1]. Used for: character codes.
Uint32: 32 bit unsigned integers in the range [0, 2³²−1]. Used for: array lengths.
Int32: 32 bit signed integers in the range [−2³¹, 2³¹−1]. Used for: bitwise not, binary bitwise operators, unsigned shift.

Converting numbers to Integer

A number n is converted to integer via the following formula:

sign(n) ⋅ floor(abs(n))

Intuitively, one removes all decimals. The trick with sign and abs is necessary, because floor converts a floating point number to the next lower integer:

    > Math.floor(3.2)
    3
    > Math.floor(-3.2)
    -4

Conversion to integer can be implemented without a sign function as follows:

    function ToInteger(x) {
        x = Number(x);
        return x < 0 ? Math.ceil(x) : Math.floor(x);
    }

We deviate from the normal practice of letting (non-constructor) functions start with a lowercase letter to conform with the names used by the ECMAScript 5.1 specification.

Converting numbers to Uint32

As a first step, convert the number to integer. If it is within the range of Uint32, we are done. If it isn’t (for example, because it is negative) then we apply modulo 2³² to it. Note that, unlike JavaScript’s remainder operator %, the result of the modulo operator has the sign of the second operand. Hence, modulo 2³² is always positive. Intuitively, one adds or subtracts 2³² until the number is in the range [0, 2³²−1]. The following is an implementation of ToUint32.

    function modulo(a, b) {
        return a - Math.floor(a/b)*b;
    }
    function ToUint32(x) {
        return modulo(ToInteger(x), Math.pow(2, 32));
    }

The modulo operation becomes apparent at numbers close to integer multiples of 2³².

    > ToUint32(Math.pow(2,32))
    0
    > ToUint32(Math.pow(2,32)+1)
    1

    > ToUint32(-Math.pow(2,32))
    0
    > ToUint32(-Math.pow(2,32)-1)
    4294967295
    
    > ToUint32(-1)
    4294967295

The result of converting a negative number makes more sense if we look at its binary representation. To negate a binary number, you invert each of its digits and then add one. The inversion is called the ones’ complement, adding one turns it into the twos’ complement. Illustrating the process with 4 digits:

    0001   1
    1110   ones’ complement of 1
    1111   −1, twos’ complement of 1
   10000   −1 + 1

The last line explains why the twos’ complement really is a negative number if the digits are fixed: The result of adding 1 to 1111 is 0, ignoring the fifth digit. ToUint32 produces the twos’ complement for 32 bits:

    > ToUint32(-1).bin()
    '11111111111111111111111111111111'

Converting numbers to Int32

To convert a number to Int32, you first convert it to Uint32. If its highest bit is set (if it is greater or equal to 2³¹) then 2³² is subtracted to turn it into a negative number (2³² is 4294967295+1).

    function ToInt32(x) {
        var uint32 = ToUint32(x);
        if (uint32 >= Math.pow(2, 31)) {
            return uint32 - Math.pow(2, 32)
        } else {
            return uint32;
        }
    }

Results:

    > ToInt32(-1)
    -1
    > ToInt32(4294967295)
    -1

The shift operators

JavaScript has three shift operators:

Signed shift left <<
Signed shift right >>
Unsigned shift right >>>

Signed right shift

Signed right shifts by x digits are the same as dividing by 2^x.

    > -4 >> 1
    -2
    > 4 >> 1
    2

At a binary level, we see how the digits are shifted right while the highest digit stays the same.

    > ("10000000000000000000000000000010".bin() >> 1).bin()
    '11000000000000000000000000000001'

Unsigned shift right

Unsigned right-shifting is simple: just move the bits, while shifting in zeros from the left.

    > ("10000000000000000000000000000010".bin() >>> 1).bin()
    '01000000000000000000000000000001'

The sign is not preserved, the result is always a Uint32:

    > -4 >>> 1
    2147483646

Left shift

Left shifts by x digits are the same as multiplying by 2^x.

    > 4 << 1
    8
    > -4 << 1
    -8

For left shifts, signed and unsigned operation are indistinguishable.

    > ("10000000000000000000000000000010".bin() << 1).bin()
    '00000000000000000000000000000100'

To see why, we again turn to 4 digit binary numbers and single-digit shifts. A signed left shift means that if the highest digit is 1 before the shift, it is also 1 after the shift. If there was a number where we could observe the difference between signed and unsigned left shifts, then its second-highest digit would have to be 0 (otherwise the highest digit would be 1 in either case). That is, it would have to look like this:

    10__

The result of an unsigned left shift is 0__0. For a signed shift by one digit, we would assume that it tried to maintain the negative sign and thus leave the highest digit at 1. Given that such a shift should be a multiplication by 2, we fail for the example of shifting 1001 (−7) to 1010 (−6).

Another way to look at it is that for negative numbers, the highest digit is 1. The lower the remaining digits are, the lower the number is. For example, the lowest 4-digit negative number is

    1000   (−8, the twos’ complement of itself)

Any number 10__ is -5 or lower (-5, -6, -7, -8). But multiplying either one of those numbers by 2 would put it out of range. Therefore, a signed shift makes no sense.

Alternate implementations of ToUint32 and ToInt32

Unsigned shifts convert their left-hand side to Uint32, signed shifts convert it to Int32. Shifting by 0 digits returns the converted value.

    function ToUint32(x) {
        return x >>> 0;
    }
    function ToInt32(x) {
        return x >> 0;
    }

Lessons learned

Will you ever need to perform one of the three conversions ToInteger, ToUint32, ToInt32 shown here? Of the three, only ToInteger is useful for normal applications. But you do have other options for converting to integer:

Math.floor() converts its argument to the closest lower integer.
```
    > Math.floor(3.8)
    3
    > Math.floor(-3.8)
    -4
```
Math.ceil() converts its argument to the closest higher integer.
```
    > Math.ceil(3.2)
    4
    > Math.ceil(-3.2)
    -3
```

Math.round() converts its argument to the closest integer. Examples:

    > Math.round(3.2)
    3
    > Math.round(3.5)
    4
    > Math.round(3.8)
    4

The result of rounding -3.5 is slightly unexpected.

    > Math.round(-3.2)
    -3
    > Math.round(-3.5)
    -3
    > Math.round(-3.8)
    -4

Therefore, Math.round(x) is the same as

    Math.ceil(x + 0.5)

Avoid parseInt() which delivers the expected results, but does so by converting its argument to string and then parsing any prefix that is a valid integer.

The modulo operations performed by ToUint32 and ToInt32 are rarely useful in practice. The following equivalent to ToUint32 is sometimes used to convert any value to a non-negative integer.

    value >>> 0

That is a lot of magic for a single expression! It is usually better to split that up into several statements or expressions. You might even want to throw an exception if a value is below 0 or not a number. That would also avoid some of the more unexpected results of the >>> operator:

    > -1 >>> 0
    4294967295

Reference

How numbers are encoded in JavaScript