20.06.15

New changes to make SpiderMonkey’s (and Firefox’s) parsing of destructuring patterns more spec-compliant

Destructuring in JavaScript

One new feature in JavaScript, introduced in ECMAScript 6 (formally ECMAScript 2015, but it’ll always be ES6 in our hearts), is destructuring. Destructuring is syntactic sugar for assigning sub-values within a single value — nested properties, iteration results, &c., to arbitrary depths — to a set of locations (names, properties, &c.).

// Declarations
var [a, b] = [1, 2]; // a = 1, b = 2
var { x: c, y: d } = { x: 42, y: 17 }; // c = 42, d = 17

function f([z]) { return z; }
print(f([8675309])); // 8675309


// Assignments
[b, f.prop] = [3, 15]; // b = 3, f.prop = 15
({ p: d } = { p: 33 }); // d = 33

function Point(x, y) { this.x = x; this.y = y; }

// Nesting's copacetic, too.
// a = 2, b = 4, c = 8, d = 16
[{ x: a, y: b }, [c, d]] = [new Point(2, 4), [8, 16]];

Ambiguities in the parsing of destructuring

One wrinkle to destructuring is its ambiguity: reading from start to finish, is an apparent “destructuring pattern” actually just a literal? Until a succeeding = is observed, it’s impossible to know. And for object destructuring patterns, could the “pattern” instead be a block statement? (A block statement is a list of statements inside {}, e.g. many loop bodies.)

How ES6 handles the potential parser ambiguities in destructuring

ES6 says an apparent “pattern” could be any of these possibilities: the only way to know is to completely parse the expression/statement. There are more elegant and less elegant ways to do this, although in the end they amount to the same thing.

Object destructuring patterns present somewhat less ambiguity than array patterns. In expression context, { may begin an object literal or an object destructuring pattern (just as [ does for arrays, mutatis mutandis). But in statement context, { has since the dawn of JavaScript begun only a block statement: never an object literal, and now never an object destructuring pattern.
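
To make the statement-context case concrete (the identifier names here are arbitrary):

{ p: a }          // OK: a block statement containing the labeled expression statement a
{ p: a } = obj;   // BAD: SyntaxError, because the block has already ended when = appears
({ p: a } = obj); // OK: parentheses force expression context, so this destructures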

How then to write object destructuring pattern assignments not in expression context? For some time SpiderMonkey has allowed destructuring patterns to be parenthesized, incidentally eliminating this ambiguity. But ES6 chose another path. In ES6 destructuring patterns must not be parenthesized, at any level of nesting within the pattern. And in declarative destructuring patterns (but not in destructuring assignments), declaration names also must not be parenthesized.

SpiderMonkey now adheres to ES6 in requiring no parentheses around destructuring patterns

As of several hours ago on mozilla-inbound, SpiderMonkey conforms to ES6’s parsing requirements for destructuring, with respect to parenthesization. These examples are all now syntax errors:

// Declarations
var [(a)] = [1]; // BAD, a parenthesized
var { x: (c) } = {}; // BAD, c parenthesized
var { o: ({ p: p }) } = { o: { p: 2 } }; // BAD, nested pattern parenthesized

function f([(z)]) { return z; } // BAD, z parenthesized

// Top level
({ p: a }) = { p: 42 }; // BAD, pattern parenthesized
([a]) = [5]; // BAD, pattern parenthesized

// Nested
[({ p: a }), { x: c }] = [{}, {}]; // BAD, nested pattern parenthesized

Assignment targets that aren’t themselves array or object patterns, in destructuring assignments outside of declarations, can still be parenthesized:

// Assignments
[(b)] = [3]; // OK: parentheses allowed around non-pattern in a non-declaration assignment
({ p: (d) } = {}); // OK: ditto
[(parseInt.prop)] = [3]; // OK: parseInt.prop not a pattern, assigns parseInt.prop = 3

Conclusion

These changes shouldn’t much disrupt anyone writing JS. Parentheses around array patterns are unnecessary and are easily removed. For object patterns, instead of parenthesizing the object pattern, parenthesize the whole assignment. No big deal!

// Assignments
([b]) = [3]; // BAD: parentheses around array pattern
[b] = [3]; // GOOD

({ p: d }) = { p: 2 }; // BAD: parentheses around object pattern
({ p: d } = { p: 2 }); // GOOD

One step forward for SpiderMonkey standards compliance!

12.08.13

Micro-feature from ES6, now in Firefox Aurora and Nightly: binary and octal numbers

A couple years ago when SpiderMonkey’s implementation of strict mode was completed, I observed that strict mode forbids octal number syntax. There was some evidence that novice programmers used leading zeroes as alignment devices, leading to unexpected results:

var sum = 015 + // === 13, not 15!
          197;
// sum === 210, not 212

But some users (Mozilla extensions and server-side node.js packages in particular) still want octal syntax, usually for file permissions. ES6 thus adds new octal syntax that won’t trip up novices. Hexadecimal numbers are formed with the prefix 0x or 0X followed by hexadecimal digits. Octal numbers are similarly formed using 0o or 0O followed by octal digits:

var DEFAULT_PERMS = 0o644; // kosher anywhere, including strict mode code

(Yes, it was intentional to allow the 0O prefix [zero followed by a capital O] despite its total unreadability. Consistency trumped readability in TC39, as I learned when questioning the wisdom of 0O as prefix. I think that decision is debatable, and the alternative is certainly not “nanny language design”. But I don’t much care as long as I never see it. 🙂 I recommend never using the capital version and applying a cluestick to anyone who does.)

Some developers also need binary syntax, which ECMAScript has never provided. ES6 thus adds analogous binary syntax using the letter b (lowercase or uppercase):

var FLT_SIGNBIT  = 0b10000000000000000000000000000000;
var FLT_EXPONENT = 0b01111111100000000000000000000000;
var FLT_MANTISSA = 0b00000000011111111111111111111111;

Try out both new syntaxes in Firefox Aurora or, if you’re feeling adventurous, in a Firefox nightly. Use the profile manager if you don’t want your regular Firefox browsing history touched.

If you’ve ever needed octal or binary numbers, hopefully these additions will brighten your day a little. 🙂

05.08.13

New in Firefox 23: the length property of an array can be made non-writable (but you shouldn’t do it)

Properties and their attributes

Properties of JavaScript objects include attributes for enumerability (whether the property shows up in a for-in loop on the object) and configurability (whether the property can be deleted or changed in certain ways). Getter/setter properties also include get and set attributes storing those functions, and value properties include attributes for writability and value.
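
As a refresher, these attributes are visible through ES5’s Object.getOwnPropertyDescriptor (a quick sketch using only the standard API; the identifiers are arbitrary):

var obj = { answer: 42 };
var desc = Object.getOwnPropertyDescriptor(obj, "answer");
print(desc.value); // 42
print(desc.writable); // true
print(desc.enumerable); // true
print(desc.configurable); // true

// Accessor properties expose get and set instead of value/writable.
var acc = { get answer() { return 42; } };
print(typeof Object.getOwnPropertyDescriptor(acc, "answer").get); // "function"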

Array properties’ attributes

Arrays are objects; array properties are structurally identical to properties of all other objects. But arrays have long-standing, unusual behavior concerning their elements and their lengths. These oddities cause array properties to look like other properties but behave quite differently.

The length property

The length property of an array looks like a data property but when set acts like an accessor property.

var arr = [0, 1, 2, 3];
var desc = Object.getOwnPropertyDescriptor(arr, "length");
print(desc.value); // 4
print(desc.writable); // true
print("get" in desc); // false
print("set" in desc); // false

print("0" in arr); // true
arr.length = 0;
print("0" in arr); // false (!)

In ES5 terms, the length property is a data property. But arrays have a special [[DefineOwnProperty]] hook, invoked whenever a property is added, set, or modified, that imposes special behavior on array length changes.

The element properties of arrays

Arrays’ [[DefineOwnProperty]] also imposes special behavior on array elements. Array elements also look like data properties, but if you add an element beyond the length, it’s as if a setter were called — the length grows to accommodate the element.

var arr = [0, 1, 2, 3];
var desc = Object.getOwnPropertyDescriptor(arr, "0");
print(desc.value); // 0
print(desc.writable); // true
print("get" in desc); // false
print("set" in desc); // false

print(arr.length); // 4
arr[7] = 0;
print(arr.length); // 8 (!)

Arrays are unlike any other objects, and so JS array implementations are highly customized. These customizations allow the length and elements to act as specified when modified. They also make array element access about as fast as array element accesses in languages like C++.

Object.defineProperty implementations and arrays

Customized array representations complicate Object.defineProperty. Defining array elements isn’t a problem, as increasing the length for added elements is long-standing behavior. But defining array length is problematic: if the length can be made non-writable, every place that modifies array elements must respect that.

Most engines’ initial Object.defineProperty implementations didn’t correctly support redefining array lengths. Providing a usable implementation for non-array objects was top priority; array support was secondary. SpiderMonkey’s initial Object.defineProperty implementation threw a TypeError when redefining length, stating this was “not currently supported”. Fully-correct behavior required changes to our object representation.

Earlier this year, Brian Hackett’s work in bug 827490 changed our object representation enough to implement length redefinition. I fixed bug 858381 in April to make Object.defineProperty work for array lengths. Those changes will be in Firefox 23 tomorrow.
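
With those changes, redefining an array’s length behaves as ES5 requires. A quick sketch of the now-working behavior:

var arr = [0, 1, 2, 3];
Object.defineProperty(arr, "length", { value: 2 });
print(arr.length); // 2
print("2" in arr); // false: elements past the new length are deleted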

Should you make array lengths non-writable?

You can change an array’s length without redefining it, so the only new capability is making an array length non-writable. Compatibility aside, should you make array lengths non-writable? I don’t think so.

Non-writable array length forbids certain operations (demonstrated in the sketch after this list):

  • You can’t change the length.
  • You can’t add an element past that length.
  • You can’t call methods that increase (e.g. Array.prototype.push) or decrease (e.g. Array.prototype.pop) the length. (These methods do sometimes modify the array, in well-specified ways that don’t change the length, before throwing.)
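
A sketch of those restrictions in action (strict mode shown; in non-strict code the first two assignments fail silently rather than throw):

"use strict";
var arr = [1, 2, 3];
Object.defineProperty(arr, "length", { writable: false });

// arr.length = 2; // would throw a TypeError: can't change the length
// arr[3] = 4;     // would throw a TypeError: can't add past the fixed length
// arr.push(4);    // would throw a TypeError: push must increase the length
arr[0] = 42;       // fine: existing elements remain writable
print(arr[0]); // 42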

But these are purely restrictions. Any operation that succeeds on an array with non-writable length succeeds on the same array with writable length. You wouldn’t do any of these things anyway to an array whose length you’re treating as fixed. So why mark it non-writable at all? There’s no functionality-based reason for good code to have non-writable array lengths.

Fixed-length arrays’ only value is in maybe permitting optimizations dependent on immutable length. Making length immutable permits minimizing array elements’ memory use. (Arrays usually over-allocate memory to avoid O(n²) behavior when repeatedly extending the array.) But if it saves memory (this is highly allocator-sensitive), it won’t save much. Fixed-length arrays may permit bounds-check elimination in very circumscribed situations. But these are micro-optimizations you’d be hard-pressed to notice in practice.

In conclusion: I don’t think you should use non-writable array lengths. They’re required by ES5, so we’ll support them. But there’s no good reason to use them.

06.06.11

I feel the need…the need for JSON parsing correctness and speed!

JSON and SpiderMonkey

JSON is a handy serialization format for passing data between servers and browsers and between independent, cooperating web pages. It’s increasingly the format of choice for website APIs.

ECMAScript 5 (the standard underlying JavaScript) includes built-in support for producing and parsing JSON. SpiderMonkey has included such support since before ES5 added it.
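
For reference, the two ES5 entry points (this sketch exercises only the standard API):

print(JSON.stringify({ answer: 42 })); // {"answer":42}
print(JSON.parse('{"answer":42}').answer); // 42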

SpiderMonkey’s support, because it predated ES5, hasn’t always agreed with ES5. Also, because JSON support was added before it became ubiquitous on the web, it wasn’t written with raw speed in mind.

Improving JSON.parse

We’ve now improved JSON parsing in Firefox 5 to be fast and fully conformant with ES5. For a while we’ve made improvements to JSON by piecemeal change. This worked for small bug fixes, and it probably would have worked to fix the remaining conformance bugs. But performance is different: to improve performance we needed to parse in a fundamentally different way. It was time for a rewrite.

What parsing bugs got fixed?

The bugs the new parser fixes are quite small and generally shouldn’t affect sites, in part because other browsers overwhelmingly don’t have these bugs. We’ve had no compatibility reports for these fixes in the month and a half they’ve been in the tree:

  • The number syntax is properly stricter:
    • Octal numbers are now syntax errors.
    • Numbers containing a decimal point must now include a fractional component (i.e. 1. is no longer accepted).
  • JSON.parse("this") now throws a SyntaxError rather than evaluate to true, due to a mistake reusing our keyword parser. (Hysterically, because we used our JSON parser to optimize eval in certain cases, this change means that eval("(this)") will no longer evaluate to true.)
  • Strings can’t contain tab characters: JSON.parse('"\t"') now properly throws a SyntaxError.
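
Concretely, each of these calls (run separately) now throws a SyntaxError where the old parser accepted the input:

JSON.parse("010"); // octal numbers rejected
JSON.parse("1."); // decimal point without fractional digits
JSON.parse("this"); // no longer misparsed as a keyword
JSON.parse('"\t"'); // raw tab character inside a string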

This list of changes should be complete, but it’s possible I’ve missed others. Parsing might be a solved problem in the compiler literature, but it’s still pretty complicated. I could have missed lurking bugs in the old parser, and it’s possible (although I think less likely) that I’ve introduced bugs in the new parser.

What about speed?

The new parser is much faster than the old one. Exactly how fast depends on the data you’re parsing. For example, on Opera’s simple parse test, I get around 156000 times/second in Firefox 4, but in Firefox 5 with the new JSON parser I get around 339000 times/second (bigger is better). On a second testcase, Kraken’s JSON.parse test (json-parse-financial, to be precise), I get a 4.0 time of around 140ms and a 5.0 time of around 100ms (smaller is better). (In both cases I’m comparing builds containing far more JavaScript changes than just the new parser, to be sure. But I’m pretty sure the bulk of the performance improvements in these two cases are due to the new parser.) The new JSON parser puts us solidly in the center of the browser pack.

It’ll only get better in the future as we wring even more speed out of SpiderMonkey. After all, on the same system used to generate the above numbers, IE gets around 510000 times/second. I expect further speedup will happen during more generalized performance improvements: improving the speed of defining new properties, improving the speed with which objects are allocated, improving the speed of creating a property name from a string, and so on. As we perform such streamlining, we’ll parse JSON even faster.

Side benefit: better error messages

The parser rewrite also gives JSON.parse better error messages. With the old parser it would have been difficult to provide useful feedback, but in the new parser it’s easy to briefly describe the reason for syntax errors.

js> JSON.parse('{ foo: 17 }'); // unquoted property name
(old) typein:1: SyntaxError: JSON.parse
(new) typein:1: SyntaxError: JSON.parse: expected property name or '}'

We can definitely do more here, perhaps by including context for the error from the provided string, but this is nevertheless a marked improvement over the old parser’s error messages.

Bottom line

JSON.parse in Firefox 5 is faster, follows the spec, and tells you what went wrong if you give it bad data. ’nuff said.

08.09.10

SpiderMonkey JSON change: trailing commas no longer accepted

Historically, SpiderMonkey’s JSON implementation has accepted input containing trailing commas:

JSON.parse('[1, 2, 3, ]');
JSON.parse('{ "1": 2, }");

We did so because the JSON RFC permitted implementations to accept extensions, and trailing commas are nice for humans to be able to use. The downside is that accepting extra syntax like this makes interoperability harder: implementations that don’t implement the same extension, for reasons every bit as valid as those of implementations allowing the extension, are disadvantaged. ES5 weighed these concerns and chose to precisely specify permissible JSON syntax, putting everyone on the same page: trailing commas are not permitted. Therefore, the examples above should throw a SyntaxError per ES5.
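
With the new behavior, both calls above throw. A quick way to verify in a JS shell:

try {
  JSON.parse('[1, 2, 3, ]');
} catch (e) {
  print(e.name); // "SyntaxError"
}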

SpiderMonkey has now been changed to conform to ES5 on this point: trailing commas are syntax errors. If you still need to accept trailing commas, you should use a custom implementation that accepts them — but best would be for you to adjust the processes that produce JSON strings including trailing commas to not include them. (If you are an extension or are in privileged code, for the moment you can use nsIJSON.legacyDecode to continue to accept trailing commas. However, note that we have added it only to accommodate legacy code in the process of being updated to no longer generate faulty input, and it will be removed sometime in the future.)

You can experiment with a version of Firefox with this change by downloading a TraceMonkey branch nightly; this change should also make its way into mozilla-central nightlies shortly, if you’d rather stick to trunk builds. (Don’t forget to use the profile manager if you want to keep the settings you use with your primary Firefox installation pristine.)
