12.08.13

Micro-feature from ES6, now in Firefox Aurora and Nightly: binary and octal numbers

A couple of years ago, when SpiderMonkey’s implementation of strict mode was completed, I observed that strict mode forbids octal number syntax. There was some evidence that novice programmers used leading zeroes as alignment devices, with unexpected results:

var sum = 015 + // === 13, not 15!
          197;
// sum === 210, not 212

But some users (Mozilla extensions and server-side node.js packages in particular) still want octal syntax, usually for file permissions. ES6 thus adds new octal syntax that won’t trip up novices. Hexadecimal numbers are formed with the prefix 0x or 0X followed by hexadecimal digits. Octal numbers are similarly formed using 0o or 0O followed by octal digits:

var DEFAULT_PERMS = 0o644; // kosher anywhere, including strict mode code

(Yes, it was intentional to allow the 0O prefix [zero followed by a capital O] despite its total unreadability. Consistency trumped readability in TC39, as I learned when questioning the wisdom of 0O as prefix. I think that decision is debatable, and the alternative is certainly not “nanny language design”. But I don’t much care as long as I never see it. 🙂 I recommend never using the capital version and applying a cluestick to anyone who does.)

Some developers also need binary syntax, which ECMAScript has never provided. ES6 thus adds analogous binary syntax using the prefix 0b or 0B followed by binary digits:

var FLT_SIGNBIT  = 0b10000000000000000000000000000000;
var FLT_EXPONENT = 0b01111111100000000000000000000000;
var FLT_MANTISSA = 0b00000000011111111111111111111111;

Try out both new syntaxes in Firefox Aurora or, if you’re feeling adventurous, in a Firefox nightly. Use the profile manager if you don’t want your regular Firefox browsing history touched.

If you’ve ever needed octal or binary numbers, hopefully these additions will brighten your day a little. 🙂

05.08.13

New in Firefox 23: the length property of an array can be made non-writable (but you shouldn’t do it)

Properties and their attributes

Properties of JavaScript objects include attributes for enumerability (whether the property shows up in a for-in loop on the object) and configurability (whether the property can be deleted or changed in certain ways). Getter/setter properties also include get and set attributes storing those functions, and value properties include attributes for writability and value.
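For example, a property’s attributes can be inspected with Object.getOwnPropertyDescriptor and specified with Object.defineProperty. A small sketch (the object and property name here are made up):

var obj = {};
Object.defineProperty(obj, "answer", {
  value: 42,           // a value property...
  writable: false,     // ...whose value can't be changed by assignment,
  enumerable: true,    // which shows up in for-in loops,
  configurable: false  // and which can't be deleted or redefined
});

var desc = Object.getOwnPropertyDescriptor(obj, "answer");
print(desc.value);         // 42
print(desc.writable);      // false
print(desc.enumerable);    // true
print(desc.configurable);  // false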

Array properties’ attributes

Arrays are objects; array properties are structurally identical to properties of all other objects. But arrays have long-standing, unusual behavior concerning their elements and their lengths. These oddities cause array properties to look like other properties but behave quite differently.

The length property

The length property of an array looks like a data property but when set acts like an accessor property.

var arr = [0, 1, 2, 3];
var desc = Object.getOwnPropertyDescriptor(arr, "length");
print(desc.value); // 4
print(desc.writable); // true
print("get" in desc); // false
print("set" in desc); // false

print("0" in arr); // true
arr.length = 0;
print("0" in arr); // false (!)

In ES5 terms, the length property is a data property. But arrays have a special [[DefineOwnProperty]] hook, invoked whenever a property is added, set, or modified, that imposes special behavior on array length changes.

The element properties of arrays

Arrays’ [[DefineOwnProperty]] also imposes special behavior on array elements. Array elements also look like data properties, but if you add an element beyond the length, it’s as if a setter were called — the length grows to accommodate the element.

var arr = [0, 1, 2, 3];
var desc = Object.getOwnPropertyDescriptor(arr, "0");
print(desc.value); // 0
print(desc.writable); // true
print("get" in desc); // false
print("set" in desc); // false

print(arr.length); // 4
arr[7] = 0;
print(arr.length); // 8 (!)

Arrays are unlike any other objects, and so JS array implementations are highly customized. These customizations allow the length and elements to act as specified when modified. They also make array element accesses about as fast as in languages like C++.

Object.defineProperty implementations and arrays

Customized array representations complicate Object.defineProperty. Defining array elements isn’t a problem, as increasing the length for added elements is long-standing behavior. But defining array length is problematic: if the length can be made non-writable, every place that modifies array elements must respect that.

Most engines’ initial Object.defineProperty implementations didn’t correctly support redefining array lengths. Providing a usable implementation for non-array objects was top priority; array support was secondary. SpiderMonkey’s initial Object.defineProperty implementation threw a TypeError when redefining length, stating this was “not currently supported”. Fully-correct behavior required changes to our object representation.

Earlier this year, Brian Hackett’s work in bug 827490 changed our object representation enough to implement length redefinition. I fixed bug 858381 in April to make Object.defineProperty work for array lengths. Those changes will be in Firefox 23 tomorrow.

Should you make array lengths non-writable?

You can change an array’s length without redefining it, so the only new capability is making an array length non-writable. Compatibility aside, should you make array lengths non-writable? I don’t think so.

Non-writable array length forbids certain operations, as the sketch after this list illustrates:

  • You can’t change the length.
  • You can’t add an element past that length.
  • You can’t call methods that increase (e.g. Array.prototype.push) or decrease (e.g. Array.prototype.pop) the length. (These methods do sometimes modify the array, in well-specified ways that don’t change the length, before throwing.)
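For example, roughly (non-strict code, so the failed assignments are silent rather than TypeErrors; the array is made up):

var arr = [0, 1, 2, 3];
Object.defineProperty(arr, "length", { writable: false });

arr.length = 2;       // silently fails (a TypeError in strict mode code)
arr[10] = 42;         // silently fails: adding the element would have to grow the length
print(arr.length);    // 4
print(10 in arr);     // false

try {
  arr.push(4);        // push must increase the length, so...
} catch (e) {
  print(e instanceof TypeError); // true
}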

But these are purely restrictions. Any operation that succeeds on an array with a non-writable length also succeeds on the same array with a writable length. And you wouldn’t do any of these things anyway to an array whose length you’re treating as fixed. So why mark it non-writable at all? There’s no functionality-based reason for good code to have non-writable array lengths.

The only value of fixed-length arrays is that they might permit optimizations that depend on the length being immutable. Making the length immutable lets an engine minimize the memory used for array elements. (Arrays usually over-allocate memory to avoid O(n²) behavior when repeatedly extending the array.) But even if this saves memory (it’s highly allocator-sensitive), it won’t save much. Fixed-length arrays may also permit bounds-check elimination in very circumscribed situations. But these are micro-optimizations you’d be hard-pressed to notice in practice.

In conclusion: I don’t think you should use non-writable array lengths. They’re required by ES5, so we’ll support them. But there’s no good reason to use them.

30.04.13

Introducing mozilla::Abs to mfbt


Computing absolute values in C/C++

C includes various functions for computing the absolute value of a signed number. C++98 implementations add the C functions to namespace std and add abs() overloads there, so that std::abs works on everything. For a long time Mozilla used NS_ABS to compute absolute values, but recently we switched to std::abs. This works on many systems, but it has a few issues.

Issues with std::abs

std::abs is split across two headers

With some compilers, the integral overloads are in <cstdlib> and the floating point overloads are in <cmath>. This led to confusion when std::abs compiled on one type but not on another, in the same file. (Or worse, when it worked with just one #include because of that developer’s compiler.) The solution was to include both headers even if only one was needed. This is pretty obscure.
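A sketch of the workaround (which header supplies which overloads varies by implementation):

#include <cmath>    // floating point std::abs overloads, with some compilers
#include <cstdlib>  // integral std::abs overloads, with some compilers

double d = std::abs(-3.5);  // needs <cmath> on those compilers
int    i = std::abs(-3);    // needs <cstdlib> on those compilers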

std::abs(int64_t) doesn’t work everywhere

On many systems <stdint.h> has typedef long long int64_t;. But long long was only added in C99 and C++11, and some compilers don’t have long long std::abs(long long), so int64_t i = 0; std::abs(i); won’t compile. We “solved” this with compiler-specific #ifdefs around custom std::abs specializations in a somewhat-central header. (That’s three headers to include!) C++ says adding these declarations to namespace std has undefined behavior, and indeed it’ll break as we update compilers.

std::abs(int32_t(INT32_MIN)) doesn’t work

The integral abs overloads don’t work on the most-negative value of each signed integer type. On twos-complement machines (nearly everything), the absolute value of the smallest integer of a signed type won’t fit in that type. (For example, INT8_MIN is -128, INT8_MAX is +127, and +128 won’t fit in int8_t.) The integral abs functions take and return signed types. If the smallest integer flows through, behavior is undefined: as absolute-value is usually implemented, the value is returned unchanged. This has caused Mozilla bugs.
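A minimal sketch of the hazard (the variable is made up):

#include <cstdlib>
#include <stdint.h>

int32_t v = INT32_MIN;    // -2147483648
int32_t a = std::abs(v);  // undefined behavior: +2147483648 doesn't fit in int32_t;
                          // typical implementations just hand back INT32_MIN unchanged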

Mozilla code should use mozilla::Abs, not std::abs

Unfortunately the only solution is to implement our own absolute-value function. mozilla::Abs in "mozilla/MathAlgorithms.h" is overloaded for all signed integral types and the floating point types, and the integral overloads return the unsigned type. Thus you should use mozilla::Abs to compute absolute values. Be careful about signedness: don’t assign directly into a signed type! That loses mozilla::Abs’s ability to accept all inputs and will cause bugs. Ideally this would be a compiler warning, but we don’t use -Wconversion or Microsoft equivalents and so can’t do better.
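A small usage sketch (the variable names here are made up):

#include "mozilla/MathAlgorithms.h"
#include <stdint.h>

int32_t delta = INT32_MIN;

// Good: the integral overloads return the corresponding unsigned type, so even
// Abs(INT32_MIN) is representable.
uint32_t magnitude = mozilla::Abs(delta);  // 2147483648

// Bad: assigning the result back into a signed type reintroduces the very
// overflow problem mozilla::Abs exists to avoid.
// int32_t oops = mozilla::Abs(delta);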

26.04.13

mozilla/PodOperations.h: functions for zeroing, assigning to, copying, and comparing plain old data objects


Recently I introduced the new header mozilla/PodOperations.h to mfbt, moving its contents out of SpiderMonkey for general use. This header makes various operations on memory for objects easier and safer.

The problem

Often in C or C++ one might want to set the contents of an object to zero — perhaps to initialize it:

mMetrics = new gfxFont::Metrics;
::memset(mMetrics, 0, sizeof(*mMetrics));

Or perhaps the same might need to be done for a range of objects:

memset(mTreeData.Elements(), 0, mTreeData.Length() * sizeof(mTreeData[0]));

Or perhaps one might want to set the contents of an object to those of another object:

memcpy(&e, buf, sizeof(e));

Or perhaps a range of objects must be copied:

memcpy(to + aOffset, aBuffer, aLength * sizeof(PRUnichar));

Or perhaps a range of objects must be memory-equivalence-compared:

return memcmp(k->chars(), l->chars(), k->length() * sizeof(jschar)) == 0;

What do all these cases have in common? They all require using a sizeof() operation.

The underlying problem

C and C++, as low-level languages very much focused on the actual memory, place great importance on the size of an object. Programmers often think much less about sizes. It’s pretty easy to write code without having to think about memory. But some cases require it, and because it doesn’t happen regularly, it’s easy to make mistakes. Even experienced programmers can screw it up if they don’t think carefully.

This is particularly likely for operations on arrays of objects. If the object’s size isn’t 1, forgetting a sizeof means an array of objects might not be completely cleared, copied, or compared. This has led to Mozilla security bugs in the past. (Although, the best I can find now is bug 688877, which doesn’t use quite the same operations, and can’t be solved with these methods, but which demonstrates the same sort of issue.)

The solution

Using the prodigious magic of C++ templates, the new mfbt/PodOperations.h abstracts away the sizeof in all the above examples, implements bounds-checking assertions as appropriate, and is type-safe (doesn’t require implicit casts to void*).

  • Zeroing
    • PodZero(T* t): set the contents of *t to 0
    • PodZero(T* t, size_t count): set the contents of count elements starting at t to 0
    • PodArrayZero(T (&t)[N]): set the contents of the array t (with a compile-time size) to 0
  • Assigning
    • PodAssign(T* dst, const T* src): set the contents of *dst to *src — locations can’t overlap (no self-assignments)
  • Copying
    • PodCopy(T* dst, const T* src, size_t count): copy count elements starting at src to dst — ranges can’t overlap
  • Comparison
    • PodEqual(const T* one, const T* two, size_t len): true or false indicating whether len elements at one are memory-identical to len elements at two
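With the operations listed above, the earlier memset/memcpy/memcmp examples look roughly like this (a sketch; the type and variables stand in for the originals):

#include "mozilla/PodOperations.h"

// A made-up plain-old-data type standing in for the structs in the examples above.
struct Metrics { double ascent, descent, height; };

void Example() {
  Metrics metrics;
  mozilla::PodZero(&metrics);                 // zero one object; no sizeof needed

  Metrics table[16];
  mozilla::PodArrayZero(table);               // zero a compile-time-sized array

  Metrics a, b;
  mozilla::PodZero(&b);
  mozilla::PodAssign(&a, &b);                 // copy one object's contents into another

  Metrics src[8], dst[8];
  mozilla::PodZero(src, 8);
  mozilla::PodCopy(dst, src, 8);              // copy 8 elements: a count, not a byte size

  bool same = mozilla::PodEqual(dst, src, 8); // memory-compare 8 elements
  (void) same;
}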

Random questions

Why “Pod”?

POD is a C++ term of art, an abbreviation for “plain old data”. A type that’s plain old data is, roughly: a built-in type; a pointer or enum that’s represented like a built-in type; a user-defined class without any virtual methods or inheritance or user-defined constructors or destructors (including in any of its base classes), whose non-static members are themselves plain old data; or an array of a type that’s plain old data. (There are a couple other restrictions that don’t matter here and would take too long to explain anyway.)
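Roughly (these example structs are made up):

// Plain old data: built-in members only, no user-defined constructors or
// destructors, no virtual methods, no inheritance.
struct Point {
  int x;
  int y;
};

// Not plain old data: a user-defined constructor and a virtual method.
struct Node {
  Node() : next(0) {}
  virtual void Visit() {}
  Node* next;
};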

One implication of a type being POD is that (systemic interactions aside) you can copy an object of that type using memcpy. The file and method names simply play on that. Arguably it’s not the best, clearest term in the world — especially as these methods aren’t restricted to POD types. (One intended use is for initializing classes that are non-POD, where the initial state is fully-zeroed.) But it roughly gets the job done, no better names quickly spring to mind, and renaming would have been pain without much gain.

What are all these “optimizations” in these methods?

When these operations were added to SpiderMonkey a few years ago, various people (principally Luke, if I remember right) benchmarked these operations when used in various places in SpiderMonkey. It turned out that “trivial” uses of memcmp, &c. wouldn’t always be compiled into fast, SIMD-friendly loops. Thus we introduced special cases. Newer compilers may do better, such that we have less need for the optimizations. But the worst that happens with them is slightly slower code — not correctness bugs. If you have real-world data (inchoate fears don’t count 🙂 ) showing these optimizations aren’t needed now, file a bug and we can adapt them as needed.

17.07.12

37 awesome days

I tend to take very long vacations. Coding gives me the flexibility to work from anywhere, so when I travel, I keep working by default and take days off when something special arises. Thus I usually take vacation in very short increments, but very occasionally I’ll be gone awhile. And when I’m gone awhile, I’m gone: no hacking, no work, just focused on the instant.

My last serious-length vacation was August-September last year. And since then, I’ve taken only a day and a half of vacation (although I’ve shifted a few more days or fractions thereof to evenings or weekends). It’s time for a truly long vacation.

[Image: screenshot of a browser showing Mozilla’s PTO app, indicating 224 hours of PTO starting July 18]
Yeah, I’m pretty much using it all up.

For several years I’ve had a list of long trips I’ve decided I will take: the Appalachian Trail, the John Muir Trail, the Coast to Coast Walk in England, the Pacific Crest Trail, and biking across the United States. I did the first two in 2008 and 2010 and the third last year. The fourth requires more than just a vacation, so I haven’t gotten to it yet. This leaves one last big trip: biking across the United States.

Tomorrow I take a much-needed break to recharge and recuperate (in a manner of speaking) by biking from the Pacific to the Atlantic. (Ironically, the first leg out of San Francisco is a ferry to Vallejo.) I have a commitment at the back end, August 25, in San Francisco, and a less-critical one (more biking, believe it or not!) on August 26. The 24th must be a day to fly back, so I have 37 days to bike the ~3784 miles of the Western Express Route (San Francisco, CA to Pueblo, CO) and part of the TransAmerica Trail (Pueblo to Yorktown, VA). This is an aggressive pace, to put it mildly; but I’ve biked enough hundred-mile days before, singly and seriatim, that I believe it’s doable with effort and focus.

Unlike in past trips, I won’t be incommunicado this time. I’ll pass through towns regularly, so I’ll have consistent ability to access the Internet. And I died a little, but I bought two months of cell/data service to cover the trip. So it goes. I won’t be regularly checking email (or bugmail, or doing reviews). But I’ll try to make a quick post from time to time with a picture and a few words.

I could say a little about gear — my twenty-five pound carrying capacity in panniers on a seatpost-mounted rack, the Kindle I purchased for reading end-of-day (which I’ve enjoyed considerably for the last week…as has my credit card), the 25-ounce sleeping bag I’ll carry, the tent I’ll use. I could also say a little about the hazards — the western isolation (you Europeans have no idea what that means), the western desert (one Utah day will be 50 miles without water, then 74 miles without water), the high summer climate, the other traffic, and simple exhaustion. But none of that’s important compared to the fact that 1) this is finally happening, and 2) it starts tomorrow.

“And now I think I am quite ready to go on another journey.” Let’s do this.
