12.08.13

Micro-feature from ES6, now in Firefox Aurora and Nightly: binary and octal numbers

A couple years ago when SpiderMonkey’s implementation of strict mode was completed, I observed that strict mode forbids octal number syntax. There was some evidence that novice programmers used leading zeroes as alignment devices, leading to unexpected results:

var sum = 015 + // === 13, not 15!
          197;
// sum === 210, not 212

But some users (Mozilla extensions and server-side node.js packages in particular) still want octal syntax, usually for file permissions. ES6 thus adds new octal syntax that won’t trip up novices. Hexadecimal numbers are formed with the prefix 0x or 0X followed by hexadecimal digits. Octal numbers are similarly formed using 0o or 0O followed by octal digits:

var DEFAULT_PERMS = 0o644; // kosher anywhere, including strict mode code

(Yes, it was intentional to allow the 0O prefix [zero followed by a capital O] despite its total unreadability. Consistency trumped readability in TC39, as I learned when questioning the wisdom of 0O as prefix. I think that decision is debatable, and the alternative is certainly not “nanny language design”. But I don’t much care as long as I never see it. 🙂 I recommend never using the capital version and applying a cluestick to anyone who does.)

Some developers also need binary syntax, which ECMAScript has never provided. ES6 thus adds analogous binary syntax using the letter b (lowercase or uppercase):

var FLT_SIGNBIT  = 0b10000000000000000000000000000000;
var FLT_EXPONENT = 0b01111111100000000000000000000000;
var FLT_MANTISSA = 0b00000000011111111111111111111111;

Try out both new syntaxes in Firefox Aurora or, if you’re feeling adventurous, in a Firefox nightly. Use the profile manager if you don’t want your regular Firefox browsing history touched.

If you’ve ever needed octal or binary numbers, hopefully these additions will brighten your day a little. 🙂

26.02.11

The proper way to call parseInt (tl;dr: parseInt(str, radix))

Introduction

Allen Wirfs-Brock recently discussed the impedance mismatch when functions accepting optional arguments are incompatibly combined, considering in particular combining parseInt and Array.prototype.map. In doing so he makes this comment:

The most common usage of parseInt passes only a single argument

Code-quality systems like JSLint routinely warn about parseInt without an explicit radix. Most uses might well pass only a single argument, but I could easily imagine this going the other way.

This raises an interesting question: why do lints warn about using parseInt without a radix?

parseInt and radixes

Like much of JavaScript, parseInt tries to be helpful when called without an explicitly specified radix. It attempts to guess a suitable radix:

assertEq(parseInt("+17"), 17);
assertEq(parseInt("42"), 42);
assertEq(parseInt("-0x42"), -66);
// assertEq(parseInt("0755"), ???); // we'll get back to this

If the string (after optional leading whitespace and + or -) starts with a non-zero decimal digit, it’s parsed as decimal. But if the string begins with 0, things get wacky. If the next character is x or X, the number is parsed in base-16: hexadecimal. Last, if the next character isn’t x or X…hold that thought. I’ll return to it in a moment.

Thus the behavior of parseInt without a radix depends not just on the numeric contents of the string but also upon its internal structure, entirely separate from its contents. This alone is reason enough to always specify a radix: specify a radix 2 ≤ r ≤ 36 and it will be used, no guessing, no uncertain behavior in the face of varying strings. (Although, to be sure, there’s still a very slight wrinkle: if r === 16 and your string begins with 0x or 0X, they’ll be skipped when determining the integer to return. But this is a pretty far-out edge case where you might want to parse a hexadecimal string without a prefix and would also want to process one with a prefix as just 0.)

But wait! There’s more

Beyond cuteness lies another concern. Let’s return to the leading-zero-but-not-hexadecimal case:

parseInt("0755");

In some programming languages a leading zero (that’s not part of a hexadecimal prefix) means the number is base-8: octal. So maybe JavaScript infers this to be an octal number, returning 7 × 8 × 8 + 5 × 8 + 5 === 493.

On the other hand, as I noted in Mozilla’s ES5 strict mode documentation, there’s some evidence that people use leading zeroes as alignment devices, thinking they have no semantic effect. So maybe leading zero is decimal instead.

The wonderful thing about standards is that there are so many of them to choose from

According to ES3, a leading zero with no explicit radix is either decimal or, if octal extensions have been implemented, octal. So what happens depends on who wrote the ES3 implementation, and what choice they made. But what if it’s an ES5 implementation? ES5 explicitly forbids octal and says this is interpreted as decimal. Therefore, parseInt("0755") is 755 in bog-standard ES3 implementations, 493 in ES3 implementations which have implemented octal extensions, and 755 in conforming ES5 implementations. Isn’t it great?

What do browsers actually do?

On the web everyone implements the octal extensions, so you’ll have to look hard to find an ES3 browser that doesn’t make parseInt("0755") === 493. But ES3 is old and busted, and ES5 is the new hotness. What do ES5 implementations do, especially as the change in ES5 isn’t backwards-compatible?

Surprisingly browsers aren’t all playing chicken here, waiting to see that they can change without breaking sites. On this particular point IE9 leads the way (in standards mode code only), implementing parseInt("0755") === 755 despite having implemented parseInt("0755") === 493 in the past. Before I saw IE9 implemented this (although I hasten to note they have not shipped a release with it yet), I expected no browser would implement it due to the possibility of breaking sites. After seeing IE9’s example, I’m less certain. Hopefully their experience will shed light on the decision for the other browser vendors.

Conclusion

Precise details of browser and specification inconsistencies aside, the point remains that parseInt(str) tries to be cute when parsing str. That cuteness can make parseInt(str) unpredictable if inputs vary sufficiently. Avoid edge-case bugs and use parseInt(str, radix) instead.