08.09.10

SpiderMonkey JSON change: trailing commas no longer accepted

Historically, SpiderMonkey’s JSON implementation has accepted input containing trailing commas:

JSON.parse('[1, 2, 3, ]');
JSON.parse('{ "1": 2, }");

We did so because the JSON RFC permitted implementations to accept extensions, and trailing commas are nice for humans to be able to use. The down side is that accepting extra syntax like this makes interoperability harder: implementations which don’t implement the same extension, for reasons every bit as valid as those of implementations allowing the extension, are disadvantaged. ES5 weighed these concerns and chose to precisely specify permissible JSON syntax, putting everyone on the same page: trailing commas are not permitted. Therefore, the examples above should throw a SyntaxError per ES5.

SpiderMonkey has now been changed to conform to ES5 on this point: trailing commas are syntax errors. If you still need to accept trailing commas, you should use a custom implementation that accepts them — but best would be for you to adjust the processes that produce JSON strings including trailing commas to not include them. (If you are an extension or are in privileged code, for the moment you can use nsIJSON.legacyDecode to continue to accept trailing commas. However, note that we have added it only to accommodate legacy code in the process of being updated to no longer generate faulty input, and it will be removed sometime in the future.)

You can experiment with a version of Firefox with this change by downloading a TraceMonkey branch nightly; this change should also make its way into mozilla-central nightlies shortly, if you’d rather stick to trunk builds. (Don’t forget to use the profile manager if you want to keep the settings you use with your primary Firefox installation pristine.)

New ES5 strict mode support: now with poison pills!

tl;dr

Don’t try to use the arguments or caller properties of functions created in strict mode code. Don’t try to use the callee or caller properties of arguments objects corresponding to invocations of functions in strict mode code. Don’t try to use the caller property of a function if it might be called from strict mode code. You are in for an unpleasant surprise (a thrown TypeError) if you do.

function strict() { "use strict"; return arguments; }
function outer() { "use strict"; return inner(); }
function inner() { return inner.caller; }

strict.caller;    // !!! BAD IDEA
strict.arguments; // !!! BAD IDEA
strict().caller;  // !!! BAD IDEA
strict().callee;  // !!! BAD IDEA
outer();          // !!! BAD IDEA

Really, it’s best not to access the caller of a function, the current function (except by naming it), or the arguments for a given function (except via arguments or by use of the named parameter) at all.

ES5 strict mode: self-limitation, not wish fulfillment

ES5 introduces the curious concept of strict mode. Strict mode, whose name and concept derive from the similar feature in Perl, is a new feature in ES5 whose purpose is to deliberately reduce the things you can do in JavaScript. Instead of a feature, it’s really more the absence of several features, within the scope of the strict-annotated code: with, particularly intrusive forms of eval, silent failure of writes to non-writable properties, silent failure of deletions of non-configurable properties, implicit creation of global-object properties, and so on. The goal of these removals is to simplify both the reasoning necessary to understand such code and the implementation of it in JavaScript engines: to sand away some rough edges in the language.

Magical stack-introspective properties of functions and arguments

Consider this code, and note the expected behavior (expected per the web, but not as part of ES3):

function getSelf() { return arguments.callee; }
assertEq(getSelf(), getSelf); // arguments.callee is the enclosing function

function outer()
{
  function inner() { return arguments.callee.caller; } // inner.caller === outer
  return inner();
}
assertEq(outer(), outer); // fun.caller === nearest function in stack that called fun, or null

function args2()
{
  return args2.arguments;
}
assertEq(args2(17)[0], 17); // fun.arguments === arguments object for nearest call to fun in stack, or null

Real-world JavaScript implementations take many shortcuts for top performance. These shortcuts are not (supposed to be) observable, except by timing the relevant functionality. Two common optimizations are function inlining and avoiding creating an arguments object. The above “features” play havoc with both of these optimizations (as well as others, one of which will be the subject of a forthcoming post).

Inlining a function should conceptually be equivalent to splatting the function’s contents in that location in the calling function and doing some α-renaming to ensure no names in the splatted body conflict with the surrounding code. The ability to access the calling function defeats this: there’s no function being invoked any more, so what does it even mean to ask for the function’s caller? (Don’t simply say you’d hard-code the surrounding function: how do you know which property lookups in the inlined code will occur upon precisely the function being called, looking for precisely the caller property?) It is also possible to access a function’s arguments through fun.arguments. While the “proper” behavior here is more obvious, implementing it would be a large hassle: either the arguments would have to be created when the function was inlined (in the general case where you can’t be sure the function will never be used this way), or you’d have to inline code in such a way as to be able to “work backward” to the argument values.

Speaking of arguments, in offering access to the corresponding function via arguments.callee it has the same problems as fun.caller. It also presents one further problem: in (some, mostly old) engines, arguments.caller provides access to the variables declared within that function when it was most recently called. (If you’re thinking security/integrity/optimization hazard, well, you now know why engines no longer support it.)

In sum these features are bad for optimization. Further, since they’re a form of dynamic scoping, they’re basically bad style in many other languages already.

Per ES5, SpiderMonkey no longer supports this stack-inspecting magic when it interacts with strict mode

As of the most recent Firefox nightly, SpiderMonkey now rejects code like that given above when it occurs in strict mode (more or less). (The properties in question are now generally implemented through a so-called “poison pill” mechanism, an immutable accessor property which throws a TypeError when retrieved or set.) The specific scenarios which we reject are as follows.

First, attempts to access the caller or arguments (except by directly naming the object) of a strict mode function throw a TypeError, because these properties are poison pills:

function strict()
{
  "use strict";
  strict.caller;    // !!! TypeError
  strict.arguments; // !!! TypeError
  return arguments; // direct name: perfectly cromulent
}
strict();

Second, attempts to access the enclosing function or caller variables via the arguments of a strict mode function throw a TypeError. These properties too are poison pills:

function strict()
{
  "use strict";
  arguments.callee; // !!! TypeError
  arguments.caller; // !!! TypeError
}
strict();

Third (and most trickily, because non-strict code is affected), attempts to access a function’s caller when that caller is in strict mode will throw a TypeError. This isn’t a poison pill, because if the "use strict" directive weren’t there it would still “work”:

function outer()
{
  "use strict";
  return inner();
}
function inner()
{
  return inner.caller; // !!! TypeError
}
outer();

But if there’s no strict mode in sight, nothing will throw exceptions, and what worked before will still work:

function fun()
{
  assertEq(fun.caller, null); // global scope
  assertEq(fun.arguments, arguments);
  assertEq(arguments.callee, fun);
  arguments.caller; // won't throw, won't do anything special
  return arguments;
}
fun();

Conclusion

With these changes, which are required by ES5, stack inspection is slowly going the way of the dodo. Don’t use it or rely on it! Even if you never use strict mode, beware the third change, for it still might affect you if you provide methods for other code to call. (But don’t expect to be able to avoid strict mode forever: I expect all JavaScript libraries will adopt strict mode in short order, given its benefits.)

(For those curious about new Error().stack, we’re still considering what to do about it. Regrettably, we may need to kill it for information-privacy reasons too, or at least censor it to something less useful. Nothing’s certain yet; stay tuned for further announcements should we make changes.)

You can experiment with a version of Firefox with these changes by downloading a TraceMonkey branch nightly; these changes should also make their way into mozilla-central nightlies shortly, if you’d rather stick to trunk builds. (Don’t forget to use the profile manager if you want to keep the settings you use with your primary Firefox installation pristine.)

22.08.10

Incompatible ES5 change: literal getter and setter functions must now have exactly zero or one arguments

ECMAScript accessor syntax in SpiderMonkey

For quite some time SpiderMonkey and Mozilla-based browsers have supported user-defined getter and setter functions (collectively, accessors), both programmatically and syntactically. The syntaxes for accessors were once legion, but SpiderMonkey has pared them back almost to the syntax recently codified in ES5 (and added new syntax where required by ES5).

// All valid in ES5
var a = { get x() { } };
var b = { get "y"() { } };
var c = { get 2() { } };

var e = { set x(v) { } };
var f = { set "y"(v) { } };
var g = { set 2(v) { } };

SpiderMonkey has historically parsed literal accessors using a slightly-tweaked version of its function parsing code. Therefore, as previously explained SpiderMonkey would accept essentially anything which could follow function in a function expression as valid accessor syntax in object literals.

ES5 requires accessors have exact numbers of arguments

A consequence of parsing accessors using generalized function parsing is that SpiderMonkey accepted some nonsensicalities, such as no-argument setters or multiple-argument getters or setters:

var o1 = { get p(a, b, c, d, e, f, g) { /* why have any arguments? */ } };
var o2 = { set p() { /* to what value? */ } };
var o3 = { set p(a, b, c) { /* why more than one? */ } };

ES5 accessor syntax sensibly deems such constructs errors: a conforming ES5 implementation would reject all of the above statements.

SpiderMonkey is changing to follow ES5: getters require no arguments, setters require one argument

SpiderMonkey has now been changed to follow ES5. There seemed little to no gain in continuing to support bizarre numbers of arguments when the spec counseled otherwise, and any code which does end up broken is easily fixed.

As always, you can experiment with a version of Firefox with these changes to accessor syntax by downloading a nightly from nightly.mozilla.org. (Don’t forget to use the profile manager if you want to keep the settings you use with your primary Firefox installation pristine.)

15.10.09

My distro can beat up your distro’s honor student. Or something like that. (Or: setting up ccache-powered Firefox builds in Fedora)

Tags: , , , , , , — Jeff @ 22:23

dholbert makes a recent post (well, recent only in planet.mozilla.org‘s little mind, no idea why a post from September 2008 is being displayed as new!) discussing how to build Firefox with ccache on Ubuntu, saving compilation time on close to null-program rebuilds. Cool beans. However:

If you’re on Fedora 11 (conceivably earlier too, I regretfully haven’t regularly used Fedora since Fedora 6, until recently), the basic developer tools package combo includes ccache, and caching Just Works in Firefox builds with no extra work needed at all.

[jwalden@the-great-waldo-search dbg]$ \
> ls -la `which g++` `which c++` `which gcc` /usr/bin/ccache
-rwxr-xr-x. 1 root root 43584 2009-02-23 23:42 /usr/bin/ccache
lrwxrwxrwx. 1 root root    16 2009-10-02 21:29 /usr/lib64/ccache/c++ -> ../../bin/ccache
lrwxrwxrwx. 1 root root    16 2009-10-02 21:29 /usr/lib64/ccache/g++ -> ../../bin/ccache
lrwxrwxrwx. 1 root root    16 2009-10-02 21:29 /usr/lib64/ccache/gcc -> ../../bin/ccache
[jwalden@the-great-waldo-search dbg]$ du -hs ~/.ccache
883M	/home/jwalden/.ccache

Anyway, use whichever distro you want, with ccache or without, whatever satisfies your preferences and utility curve. (The semi-troll title is completely gratuitous, but my sense of humor mandated I use it. 🙂 ) As for me: I am an absolute sucker for convenience. I’ve known of ccache for years and never used it before due to the activation energy needed to do so; had using ccache required equivalent effort in Fedora I strongly doubt I’d ever have used it. Score one for making the right choice for the user rather than requiring him to make it himself.

« Newer