More ES5 backwards-incompatible changes: regular expressions now evaluate to a new object, not the same object, each time they’re encountered

(preemptive clarification: coming in Firefox 3.7 and not Firefox 3.6, which is to say, a good half year away from now rather than Real Soon Now)

Disjunction: is /foo/ the same object, or a new object, each time it’s evaluated in ES3?

According to ECMA-262 3rd edition, what should this code print?

function getRegEx() { return /regex/; }
print("getRegEx() === getRegEx(): " + (getRegEx() === getRegEx()));

The answer depends upon this question: when a JavaScript regular expression literal is evaluated, does it create a new RegExp object each time, or does it evaluate to the exact same RegExp object each time it’s evaluated? Let’s look at a few examples and make a guess.

I sense a pattern

var tests =
  [
   function getNull() { return null; },
   function getNumber() { return 1; },
   function getString() { return "a"; },
   function getBoolean() { return false; },
   function getObject() { return {}; },
   function getArray() { return []; },
   function getFunction() { return function f(){}; },
  ];

for (var i = 0, sz = tests.length; i < sz; i++)
{
  var t = tests[i];
  print(t.name + "() === " + t.name + "(): " + (t() === t()));
}

If you test that code, you’ll see that the first four results are true, and the rest are false, all per ECMA-262 3rd edition. (Okay, technically, and bizarrely, ES3 permitted either result for the function case, but no browser ever implemented a result of true; ES5 acknowledges reality and mandates that the result be false.) The first four functions return primitive values; the last three return objects. There’s only a single instance of any primitive value — or, alternately, you might say, equality doesn’t distinguish between different instances of the same primitive. Therefore it doesn’t really matter whether primitive literals evaluate to new instances or the same instance. On the other hand, objects compare equal only if they’re the same object. Since the object cases didn’t compare identically, they must be new objects each time. This makes sense: if this were not the case, what would happen in the following example?

function makePoint(x, y)
{
  var pt = {};
  pt.x = x;
  pt.y = y;
  return pt;
}

var pt1 = makePoint(1, 2);
var pt2 = makePoint(3, 4);

It would be complete nonsense if the object literal above evaluated to the same object every time it was encountered; the second call to makePoint would overwrite the first point, and we would have pt1.x === 3 && pt1.y === 4.

Plausible assertion: regular expression literals evaluate to new objects when encountered?

Returning to the original question, then, what does ES3 say this code should print?

function getRegEx() { return /regex/; }
print("getRegEx() === getRegEx(): " + (getRegEx() === getRegEx()));

A regular expression is an object. If you don’t want to get weird property-poisoning of the sort just suggested, regular expression literals must evaluate to different objects each time they’re encountered, right?

Alternative: ES3 says /foo/ is the same object every time

Wrong. According to ES3, there’s only a single object for each regular expression literal that’s returned each time the literal is encountered:

A regular expression literal is an input element that is converted to a RegExp object (section 15.10) when it is scanned. The object is created before evaluation of the containing program or function begins. Evaluation of the literal produces a reference to that object; it does not create a new object.

ECMA-262, 3rd ed. 7.8.5 Regular Expression Literals

This was originally a dubious optimization in the standard to avoid the “costly” creation of a regular expression object every time a literal would be encountered. It’s perhaps a little surprising that the same object is returned each time, but does it make a difference in real programs not written to demonstrate the quirk? Often it doesn’t matter. As a simple example, if (/^\d+$/.test(str)) { /* ... */ } executes identically either way, assuming RegExp.prototype.test is unmodified. The RegExp never escapes, and its use doesn’t depend on mutable state, so creating new objects each time doesn’t make a difference (other than negligibly, in speed).

Sometimes, however, the shared-object misoptimization does matter meaningfully: when a RegExp with mutable state is used in ways that depend on that state. Most regular expressions don’t store any state, so if the same RegExp object is used twice it’s no big deal. However, it can matter a lot for regular expressions specified with the global flag:

var s = "abcddeeefffffgggggggghhhhhhhhhhhhh";
function next(s)
{
  var r = /(.)\1*/g;
  r.exec(s); // matching starts at r.lastIndex and then updates it
  return r.lastIndex;
}

var r = [];
for (var i = 0; i < 8; i++)
  r.push(next(s));
print(r.join(", "));

Each time a regular expression with the global flag is used, its lastIndex property is updated with the index of the location in the matched string where matching should resume when the regular expression is next used. Thus, in this example we have mutable state, and if next is called multiple times we have uses which will depend on that mutable state. Let’s see what happens in engines which implemented regular expression literals per ES3. If you download the Firefox 3.6 release candidate and test the above code in it (adjusting the implied print to alert), the printed result will be this:

1, 2, 3, 5, 8, 13, 21, 34
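
The ES3 behavior can be approximated in any engine by hoisting the regular expression out of the function, so that every call shares one object. What follows is only a sketch under that assumption: the names sharedRegExp, nextShared, and nextFresh are made up for illustration, the string is rebuilt from its run lengths, and console.log stands in for print.

```javascript
// Build a string whose runs of repeated letters have these lengths:
// "a", "b", "c", "dd", "eee", and so on.
var runLengths = [1, 1, 1, 2, 3, 5, 8, 13];
var s = "";
for (var k = 0; k < runLengths.length; k++)
  s += new Array(runLengths[k] + 1).join(String.fromCharCode(97 + k));

// One RegExp object shared by every call, mimicking ES3 literal semantics.
var sharedRegExp = /(.)\1*/g;
function nextShared(str)
{
  sharedRegExp.exec(str); // resumes at sharedRegExp.lastIndex
  return sharedRegExp.lastIndex;
}

// A fresh RegExp object per call, which is what ES5 requires of a literal.
function nextFresh(str)
{
  var re = new RegExp("(.)\\1*", "g");
  re.exec(str); // a new object always starts at lastIndex 0
  return re.lastIndex;
}

var shared = [], fresh = [];
for (var i = 0; i < 8; i++)
{
  shared.push(nextShared(s));
  fresh.push(nextFresh(s));
}
console.log(shared.join(", ")); // 1, 2, 3, 5, 8, 13, 21, 34
console.log(fresh.join(", "));  // 1, 1, 1, 1, 1, 1, 1, 1
```

The shared object remembers where the previous match ended, so each call picks up the next run of letters; the fresh object has no such memory.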

ES5: an escape to sanity

Is ES3’s behavior what you’d expect? No, it isn’t. In fact, ES3’s behavior, which Mozilla and SpiderMonkey implement, is the second-most duplicated bug filed against Mozilla’s JavaScript engine. SpiderMonkey and (strangely enough) v8 are the only notable JavaScript engines out there that implement ES3’s behavior. ES3’s behavior is rarely what web developers expect, and it doesn’t provide any real value, so ES5 is changing to the behavior you’d expect: evaluating a regular expression literal creates a new object every time.

Starting with Firefox 3.7, Firefox will implement what ES5 specifies. Download a Firefox nightly from nightly.mozilla.org and test it out as above (use the profile manager if you want to keep your current Firefox settings and install untouched). Instead of the Fibonacci sequence you’ll get this:

1, 1, 1, 1, 1, 1, 1, 1

The bottom line

Starting with Firefox 3.7, evaluating a regular expression literal like /foo/ will create a new RegExp object, just as evaluating {} or [] currently creates a new object or array. The optimization ES3 specified has resulted in clear developer confusion and was misguided and inconsistent with respect to other object literal syntax in JavaScript.

Again, as with my previous post, we doubt this change will affect many scripts (in this case, except for the better). The fact that few browsers implemented ES3’s semantics means that most sites have to cope with either choice of semantics, so the semantics in ES5, implemented by Mozilla for Firefox 3.7, are likely already handled. Still, it’s possible that this change might break some sites (particularly those which include browser-specific code), so we’re giving a heads-up as early as possible.
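
For code that has to run under both sets of semantics, one possible approach is to reset lastIndex before each use of a global regular expression literal. This is a sketch, not a recommendation from the specification; firstRunLength is a hypothetical helper and console.log stands in for print.

```javascript
// Hypothetical helper: length of the first run of repeated characters.
function firstRunLength(str)
{
  var re = /(.)\1*/g;
  // Harmless when the literal creates a new object (ES5); discards any
  // leftover state when the literal is a shared object (ES3).
  re.lastIndex = 0;
  var m = re.exec(str);
  return m ? m[0].length : 0;
}

console.log(firstRunLength("aaab")); // 3
console.log(firstRunLength("aaab")); // 3 again, regardless of semantics
```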


More ES5 backwards-incompatible changes: the global properties undefined, NaN, and Infinity are now immutable

(preemptive clarification: coming in Firefox 3.7 and not Firefox 3.6, which is to say, a good half year away from now rather than Real Soon Now)

JavaScript and the undefined, Infinity, and NaN “keywords”

Consider the following JavaScript program: what do you think it does?

print("undefined before: " + undefined);
undefined = 17;
print("undefined after:  " + undefined);

The above program will print this output:

undefined before: undefined
undefined after:  17

Surely you can’t be serious!

A sane person might think that this program isn’t even a program. Doesn’t undefined always refer to the primitive value undefined? After all, the following “program” isn’t a valid one, nor would the equivalent program for true or false be, mutatis mutandis:

print("null before: " + null);
null = 17; // !!! NullLiteral is not a LeftHandSideExpression
print("null after:  " + null);

I am serious…and don’t call me Shirley

Curiously, the program that assigns to undefined is a valid JavaScript program, but programs that assign to null, true, and false are not. Why not? The latter are all keywords with intrinsic meaning within the language; undefined, on the other hand, is just a normal property of the global object. According to ECMA-262 3rd edition, if you assign a different value to undefined, that different value becomes the new value of undefined.

This is a clear botch in ES3. undefined should have been a keyword in JavaScript from the beginning; similarly, the global properties Infinity and NaN probably should have been keywords as well (or perhaps the properties should not have existed, given that Number.POSITIVE_INFINITY and Number.NaN exist and are immutable). ECMA-262 5th edition doesn’t quite go so far as to change these three properties into keywords due to backwards compatibility concerns (making that change would be guaranteed to break any programs that even tried to assign to those names, regardless of whether the program relied on that assignment for correctness). Instead, it changes these properties to be read-only, in the same way that the various numeric properties on the Math object are read-only. Assigning to these properties in ES5 won’t do anything (unless you opt into strict mode, in which case a TypeError exception will be thrown after we fix bug 537873), but at least it won’t definitely and completely break existing programs that assign to these names.
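
The read-only behavior can be checked with a short sketch. It assumes an engine that fully implements ES5; the Function constructor is used only so that the non-strict and strict bodies behave the same wherever the sketch is run, and console.log stands in for print.

```javascript
var names = ["undefined", "NaN", "Infinity"];
var results = [];
for (var i = 0; i < names.length; i++)
{
  var name = names[i];
  // Functions built by the Function constructor are non-strict unless
  // their own body opts in, whatever the surrounding code does.
  var sloppy = new Function(name + " = 17; return " + name + ";");
  var strict = new Function('"use strict"; ' + name + " = 17;");

  var threw = false;
  try { strict(); } catch (e) { threw = e instanceof TypeError; }

  // changed is true only if the assignment actually took effect.
  results.push({ name: name, changed: sloppy() === 17, strictThrows: threw });
  console.log(name + ": changed=" + results[i].changed +
              ", strict mode throws=" + results[i].strictThrows);
}
```

In an ES5 engine every line should report changed=false and strict mode throws=true.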

We’ve made this change in SpiderMonkey, and it is now in trunk builds of Firefox, slated for the eventual Firefox 3.7 release. Download a nightly build from nightly.mozilla.org and test out the change for yourself (use the profile manager if you want to keep your current Firefox settings and install untouched). This change should have no effect on the vast, vast majority of web developers who don’t try to change the values of these properties; as for the [civility and my religion require I redact this description] developers who did change the value of the global undefined, NaN, or Infinity properties, well…you had it coming.

The bottom line

The global properties undefined, Infinity, and NaN will be read-only and immutable in Firefox 3.7. Assigning to these properties will do nothing (except in strict mode where a TypeError exception will be thrown once we fix a bug) rather than changing their values. This shouldn’t break the vast, vast, vast majority of scripts out there — but there’s no way to guarantee it will break no one, so we think it’s worth announcing this backwards-incompatible change as proactively as possible.


ECMA-262 ed. 5 backwards-incompatible change coming to SpiderMonkey and to Gecko-based browsers

(preemptive clarification: coming in Firefox 3.7 and not Firefox 3.6, which is to say, a good half year away from now rather than Real Soon Now)

ES5 and compatibility

The fifth edition of ECMA-262, the next iteration of the JavaScript language, is broadly backwards-compatible with existing JavaScript code. Generally, code that worked in past browsers that implemented the specification will continue to work in new browsers as they implement the new edition of the specification. However, there are a few exceptions. The most obvious one is ES5’s strict mode, where specially-tagged scripts and functions will cause parsing and execution of their contents to occur under stricter requirements than in the past. For example, this code would have executed “as intended” in ES3, but in ES5 the definition of a variable named “arguments” inside a function in strict mode is a syntax error (none of the code even executes):

function strictModeError()
{
  "use strict";
  var arguments = 17; // stupid, but permissible, in ES3
  return arguments;
}
if (strictModeError() !== 17)
  throw new Error("up is down");

The above is no more than a theoretical problem, since old code is not expected to have accidentally opted into strict mode. Not all of ES5’s incompatible changes, however, are so benign.
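
That the strict-mode failure happens at parse time, before anything runs, can be observed by compiling the function body from a string. This is a sketch assuming an ES5 engine with strict mode support; the variable names are illustrative and console.log stands in for print.

```javascript
// Compiling the strict body fails before any of it executes.
var parseError = null;
try
{
  new Function('"use strict"; var arguments = 17; return arguments;');
}
catch (e)
{
  parseError = e;
}
console.log(parseError instanceof SyntaxError); // true

// The same body without the directive compiles and runs as in ES3.
var sloppy = new Function("var arguments = 17; return arguments;");
console.log(sloppy()); // 17
```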

ES5 compatibility with ES3 extensions

One unusual area of compatibility concerns not ES3, but extensions to ES3. One of the more profound changes in ES5 is the introduction of getters and setters, in which what appears syntactically to be a property when used actually will invoke function calls “under the hood”. Most major JS engines support this extension to ES3:

var o =
  {
    get field() { return this._field; },
    set field(f)
    {
      if (typeof f != "number")
        throw new Error("not a number");
      this._field = f;
    },
    _field: 0
  };
print(o.field); // 0
o.field = 5;
print(o.field); // 5
try { o.field = "0"; } catch (e) { /* throws: not a number */ }
print(o.field); // 5
o.field = 17;
print(o.field); // 17

This syntax is in ES5, partly because it addresses a need in a reasonable way but mostly because many developers will already be familiar with it. (Getters and setters were also available programmatically; ES5’s solution is different but more flexible.) However, not all aspects of getters and setters are present in ES5 in the same way they were in extensions to ES3 engines.
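
ES5’s programmatic interface is Object.defineProperty. A minimal sketch of the same object defined that way, assuming an ES5 engine (console.log stands in for print, and the rejected variable is only there to record the outcome):

```javascript
var o = { _field: 0 };
Object.defineProperty(o, "field",
  {
    get: function () { return this._field; },
    set: function (f)
    {
      if (typeof f != "number")
        throw new Error("not a number");
      this._field = f;
    },
    enumerable: true,   // attributes the literal syntax sets implicitly
    configurable: true
  });

o.field = 5;
console.log(o.field); // 5

var rejected = false;
try { o.field = "0"; } catch (e) { rejected = true; }
console.log(rejected); // true
console.log(o.field);  // 5
```

The extra flexibility comes from the descriptor: attributes such as enumerable and configurable can be chosen per property, which the literal syntax does not allow.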

Assigning to getter-only properties in ES5

Consider the previous example, slightly tweaked:

var o =
  {
    get field() { return this._field; },
    _field: 17
  };
print(o.field); // 17
o.field = 5;  // ???
print(o.field); // ???

In this case the field property is read-only: you could analogize it to element.childNodes.length, which has a value which it makes no sense to attempt to change. What should happen, then, if you attempt to change it? Current browsers throw a TypeError when you try this. ES5, however, chooses to remain (arguably) more faithful to ES3 and instead makes setting a property that only has a getter do nothing — except in strict mode, where a TypeError exception will be thrown.
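
The two ES5 outcomes can be compared side by side in a small sketch. It assumes an ES5 engine; the Function constructor is used only to get one non-strict and one strict assignment regardless of where the sketch runs, and console.log stands in for print.

```javascript
var o =
  {
    get field() { return this._field; },
    _field: 17
  };

// Non-strict assignment: silently ignored under ES5.
var setSloppy = new Function("obj", "obj.field = 5;");
// Strict assignment: ES5 requires a TypeError.
var setStrict = new Function("obj", '"use strict"; obj.field = 5;');

setSloppy(o);
var threw = false;
try { setStrict(o); } catch (e) { threw = e instanceof TypeError; }

console.log(o.field); // 17
console.log(threw);   // true
```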

SpiderMonkey, the JavaScript engine embedded in Gecko browsers, has just made the switch from its previous always-throw behavior to ES5’s only-throw-if-in-strict-mode behavior when an attempt is made to set a property that only has a getter. (This will also apply to DOM properties like the aforementioned element.childNodes.length.) In the future, if you try to set a property that only has a getter, no exception will be thrown unless you’ve opted into strict mode. This change is now in trunk builds of Firefox; download a nightly build from nightly.mozilla.org and test out the change for yourself (use the profile manager if you want to keep your current Firefox settings and install untouched). If your site relies on an exception being thrown, this change could break it, and we’re hoping that an extended period of time to test the change will help developers iron out any reliance on this non-standard behavior. This change will first appear in Firefox 3.7, which probably won’t be released until the second half of 2010 or so. Firefox 3.6 preserves current behavior where an exception is always thrown, so you should have plenty of time to update your site in response to this change.

The bottom line

Firefox 3.6 and earlier throw an exception whenever you attempt to set a property represented only by a getter (this includes DOM properties defined as readonly). Firefox 3.7 will only throw a TypeError when assigning to a property represented by only a getter if the assignment occurs in ES5 strict mode code. This change will also apply to attempts to set readonly DOM properties like element.childNodes.length. If you’re relying on an exception being thrown in either case, change the code at the assignment so that it still works when no TypeError exception is thrown.
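
One way to stop depending on the exception is to check up front whether the assignment can take effect. The helper below is a hypothetical sketch using ES5’s reflection functions, not something either engine or the specification provides; canAssign and the sample objects are made-up names, and console.log stands in for print.

```javascript
// Hypothetical helper: would obj[name] = value actually store something?
function canAssign(obj, name)
{
  // Walk the prototype chain looking for the property's descriptor.
  for (var o = obj; o !== null; o = Object.getPrototypeOf(o))
  {
    var desc = Object.getOwnPropertyDescriptor(o, name);
    if (desc)
    {
      // Accessor property: assignable only if it has a setter.
      if ("set" in desc)
        return typeof desc.set === "function";
      // Data property: assignable only if writable.
      return desc.writable === true;
    }
  }
  // No such property anywhere: assignment creates it if obj is extensible.
  return Object.isExtensible(obj);
}

var getterOnly = { get field() { return 17; } };
var plain = { field: 17 };
console.log(canAssign(getterOnly, "field")); // false
console.log(canAssign(plain, "field"));      // true
```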
