09.11.11

Introducing MOZ_DELETE, a macro improving upon the deliberately-unimplemented method idiom

C++ default operators and the sole-ownership idiom

Often a C++ class will solely manage some value: for example, a GtkWindow* or a void* for malloc‘d memory. The class will then release ownership in its destructor as appropriate. It would be extremely problematic to release ownership multiple times — think security-vulnerability-problematic. C++ copy construction and default assignment exacerbate this issue, because C++ automatically generates these methods for all classes, even when the default behavior breaks sole-ownership. The C++98 idiom solving this is to privately declare a copy constructor and a default assignment operator, then never define them:

struct Struct
{
  private:
    Struct(const Struct& other);
    void operator=(const Struct& other);
};

Declaring the methods privately prevents any code but friends of Struct from calling them. And by never defining them, even such friends will cause a link-time error if they try.

Disabling the default operators in C++11

Once you’re familiar with this idiom it’s not too bad. But initially, it’s pretty unclear. And nothing prevents someone from actually defining these methods. (They could only be used by Struct or friends of Struct, to be sure, but for sufficiently complex code it’s possible someone might make a mistake.) C++11 improves upon this trick by introducing deleted function syntax:

struct Struct
{
  private: // no longer necessary, but doesn't hurt
    Struct(const Struct& other) = delete;
    void operator=(const Struct& other) = delete;
};

Deleted functions are effectively removed from name lookup: using, defining, or referring to a deleted function produces a compile error — far better than a link error or, even worse, no error at all.

= delete support in mfbt

The Mozilla Framework Based on Templates now includes support for declaring a function only to prevent its use (or use of an inherited version). The MOZ_DELETE macro encapsulates this support:

#include "mozilla/Types.h" // MOZ_DELETE has since moved...
#include "mozilla/Attributes.h" // ...to here

struct Struct
{
  private:
    Struct(const Struct& other) MOZ_DELETE;
    void operator=(const Struct& other) MOZ_DELETE;
};

MOZ_DELETE isn’t as readable or understandable as = delete, but it’s searchable, and the comment next to its definition will clarify matters. If the declarations are private, MOZ_DELETE is just as good as the traditional idiom, and in compilers supporting C++11 deleted functions it’s better.

Which compilers support C++11 deleted functions? I’m aware of GCC since 4.4, Clang since 2.9, and ICC since 12.0. Rightly, if unfortunately, you must specify -std=c++0x or similar to use deleted function syntax without causing a warning. For various reasons Mozilla can’t do that yet, so MOZ_DELETE only produces the C++11 syntax when compiling with Clang (where we can pass -Wno-c++0x-extensions to disable the warning). I’d love to see it use C++11 syntax in GCC and ICC as well, but I don’t have the time to solve the -std=c++0x problem now, or to figure out another workaround. I’ve filed bug 701183 for this problem — help there is much appreciated.

Summary

Use MOZ_DELETE when declaring any method you will intentionally not implement. It’ll work better, and produce better errors, in some compilers. Those compilers don’t include GCC or ICC yet, but with your help they could. Any takers?

Update, evening of November 10, 2011: I just landed further changes to make MOZ_DELETE use C++11 syntax with GCC when compiling with -std=c++0x (which we apparently do more often than I’d thought), so you should now get its goodness in GCC as well — most of the time. In some “exotic” situations we don’t compile anything with -std=c++0x, so you won’t get any benefit there. Also, the JavaScript engine is never compiled with it. So if you want this to work fully, everywhere, you should use Clang.

20.10.11

Computing the length or end of a compile-time constant-length array: {JS,NS}_ARRAY_{END,LENGTH} are out, mozilla::Array{Length,End} are in

Tags: , , , , , , , — Jeff @ 16:03

Determining the length of a fixed-length array

Suppose in C++ you want to perform variations of some simple task several times. One way to do this is to loop over the variations in an array to perform each task:

/* Defines the necessary envvars to 1. */
int setVariables()
{
  static const char* names[] = { "FOO", "BAR", "BAZ", "QUUX", "EIT", "GOATS" };
  for (int i = 0; i < 6; i++)
    if (0 > setenv(names[i], "1")) return -1;
  return 0;
}

Manually looping by index is prone to error. One particular issue with loop-by-index is that you must correctly compute the extent of iteration. Hard-coding a constant works, but what if the array changes? The constant must also change, which isn’t obvious to someone not looking carefully at all uses of the array.

The traditional way to get an array’s length is with a macro using sizeof:

#define NS_ARRAY_LENGTH(arr)  (sizeof(arr) / sizeof((arr)[0]))

This works but has problems. First, it’s a macro, which means it has the usual macro issues. For example, macros are untyped, so you can pass in “wrong” arguments to them and may not get type errors. This is the second problem: NS_ARRAY_LENGTH cheerfully accepts non-array pointers and returns a completely bogus length.

  const char* str = "long enough string";
  char* copy = (char*) malloc(NS_ARRAY_LENGTH(str)); // usually 4 or 8
  strcpy(copy, str); // buffer overflow!

Introducing mozilla::ArrayLength and mozilla::ArrayEnd

Seeing an opportunity to both kill a moderately obscure macro and to further improve the capabilities of the Mozilla Framework Based on Templates, I took it. Now, rather than use these macros, you can #include "mozilla/Util.h" and use mozilla::ArrayLength to compute the length of a compile-time array. You can also use mozilla::ArrayEnd to compute arr + ArrayLength(arr). Both these methods (not macros!) use C++ template magic to only accept arrays with compile-time-fixed length, failing to compile if something else is provided.

Limitations

Unfortunately, ISO C++ limitations make it impossible to write a method completely replacing the macro. So the macros still exist, and in rare cases they remain the correct answer.

The array can’t depend on an unnamed type (class, struct, union) or a local type

According to C++ §14.3.1 paragraph 2, “A local type [or an] unnamed type…shall not be used as a template-argument for a template type-parameter.” C++ makes this a compile error:

size_t numsLength()
{
  // unnamed struct, also locally defined
  static const struct { int i; } nums[] = { { 1 }, { 2 }, { 3 } };

  return mozilla::ArrayLength(nums);
}

It’s easy to avoid both limitations: move local types to global code, and name them.

// now defined globally, and with a name
struct Number { int i; };
size_t numsLength()
{
  static const Number nums[] = { 1, 2, 3 };
  return mozilla::ArrayLength(nums);
}

mozilla::ArrayLength(arr) isn’t a constant, NS_ARRAY_LENGTH(arr) is

Some contexts in C++ require a compile-time constant expression: template parameters, (in C++ but not C99) for local array lengths, for array lengths in typedefs, for the value of enum initializers, for static/compile-time assertions (which are usually bootstrapped off these other locations), and perhaps others. A function call, even one evaluating to a compile-time constant, is not a constant expression.

One other context doesn’t require a constant but strongly wants one: the values of static variables, inside classes and methods and out. If the value is a function call, even if it computes a constant, the compiler might make it a static initialization, delaying startup.

The long and short of it is that everything in the code below is a bad idea:

int arr[] = { 1, 2, 3, 5 };
static size_t len = ArrayLength(arr); // not an error, but don't do it
void t(JSContext* cx)
{
  js::Vector<int, ArrayLength(arr)> v(cx); // non-constant template parameter
  int local[ArrayLength(arr)]; // variadic arrays not okay in C++
  typedef int Mirror[ArrayLength(arr)]; // non-constant array length
  enum { L = ArrayLength(arr); }; // non-constant initializer
  PR_STATIC_ASSERT(4 == ArrayLength(arr)); // special case of one of the others
}

In these situations you should continue to use NS_ARRAY_LENGTH (or in SpiderMonkey, JS_ARRAY_LENGTH).

mozilla/Util.h is fragile with respect to IPDL headers, for include order

mozilla/Util.h includes mozilla/Types.h, which includes jstypes.h, which includes jsotypes.h, which defines certain fixed-width integer types: int8, int16, uint8, uint16, and so on. It happens that ipc/chromium/src/base/basictypes.h also defines these integer types — but incompatibly on Windows only. This header is, alas, included through every IPDL-generated header. In order to safely include any mfbt header in a file which also includes an IPDL-generated header, you must include the IPDL-generated header first. So when landing patches using mozilla/Util.h, watch out for Windows-specific bustage.

Removing the limitations

The limitations on the type of elements in arrays passed to ArrayLength are unavoidable limitations of C++. But C++11 removes these limitations, and compilers will likely implement support fairly quickly. When that happens we’ll be able to stop caring about the local-or-unnamed problem, not even needing to work around it.

The compile-time-constant limitation is likewise a limitation of C++. It too will go away in C++11 with the constexpr keyword. This modifier specifies that a function provided constant arguments computes a constant. The compiler must allow calls to the function that have constant arguments to be used as compile-time constants. Thus when compilers support constexpr, we can add it to the declaration of ArrayLength and begin using ArrayLength in compile-time-constant contexts. This is more low-hanging C++11 fruit that compilers will pick up soon. (Indeed, GCC 4.6 already implements it.)

Last, we have the Windows-specific #include ordering requirement. We have some ideas for getting around this problem, and we hope to have a solution soon.

A gotcha

Both these methods have a small gotcha: their behavior may not be intuitive when applied to C strings. What does sizeof("foo") evaluate to? If you think of "foo" as a string, you might say 3. But in reality "foo" is much better thought of as an array — and strings are '\0'-terminated. So actually, sizeof("foo") == 4. This was the case with NS_ARRAY_LENGTH, too, so it’s not new behavior. But if you use these methods without considering this, you might end up misusing them.

Conclusion

Avoid NS_ARRAY_LENGTH when possible, and use mozilla::ArrayLength or mozilla::ArrayEnd instead. And watch out when using them on strings, because their behavior might not be what you wanted.

(Curious how these methods are defined, and what C++ magic is used? See my next post.)

« Newer