16.11.11

Introducing MOZ_OVERRIDE to annotate virtual functions which override base-class virtual functions

Tags: , , , , , , , — Jeff @ 09:55

Overriding inherited virtual functions

One way C++ supports code reuse is through inheritance. One base class implements common functionality. Then other classes inherit from it, essentially copying functionality from it. These other classes can add their own new functionality, or, more powerfully, they can override the base class functionality.

class Base
{
  public:
    virtual const char* type() { return "Base"; }
};
class Derived : public Base
{
  public:
    virtual const char* type() { return "Derived"; }
};

Overriding base class functionality is simple. Keeping such overrides working correctly is sometimes harder. The problem is that the override relationship is implicit: if the override doesn’t exactly match the signature of the desired function in the base class, it may not work correctly.

class Base
{
  public:
    // Perhaps as part of an incomplete refactoring,
    // the base class's function changed its name.
    virtual const char* kind() { return "Base"; }
};
class DerivedIncorrectly : public Base
{
  public:
    virtual const char* type() { return "Derived"; }
};

// BAD: code expecting kind() to work and sometimes
// indicate Derived-ness no longer will.

Making the override relationship explicit

Some languages (Scala, C#, probably others) provide the ability to mark a derived class’s function as an override of an inherited function. C++98 included no such ability, but C++11 does, through the contextual override keyword. When override is used, that virtual member function must override one found on a base class. If it does not, it is a compile error.

class Base
{
  public:
    virtual const char* kind() { return "Base"; }
};
class DerivedIncorrectly : public Base
{
  public:
    // This will cause a compile error: there's no type()
    // method on Base that this overrides.
    virtual const char* type() override { return "Derived"; }

    // This will work as intended.
    virtual const char* kind() override { return "Derived"; }
};

Introducing MOZ_OVERRIDE

The Mozilla Framework Based on Templates now includes support for the C++11 contextual override keyword, encapsulated in the MOZ_OVERRIDE macro in mozilla/Types.hmozilla/Attributes.h. Simply place it at the end of the declaration of the relevant method, before any = 0 or method body, like so:

#include "mozilla/Types.h" // MOZ_OVERRIDE has since moved...
#include "mozilla/Attributes.h" // ...to here

class Base
{
  public:
    virtual void f() = 0;
};
class Derived1 : public Base
{
  public:
    virtual void f() MOZ_OVERRIDE;
};
class Derived2 : public Base
{
  public:
    virtual void f() MOZ_OVERRIDE = 0;
};
class Derived3 : public Base
{
  public:
    virtual void f() MOZ_OVERRIDE { }
};

MOZ_OVERRIDE will expand to use the C++11 construct in compilers which support it. Thus in such compilers misuse of MOZ_OVERRIDE is an error. Even better, some of the compilers used by tinderbox support override, so in many cases tinderbox will detect misuse. (Specifically, MSVC++ 2005 and later support it, so errors in cross-platform and Windows code won’t pass tinderbox . Much more recent versions of GCC and Clang support it as well, but these versions are too new for tinderbox to have picked them up yet — in the case of GCC too new to even have been released yet. 🙂 )

What about NS_OVERRIDE?

It turns out there’s already a macro annotation to indicate an override relationship: NS_OVERRIDE. This gunky XPCOM macro expands to a user attribute under gcc-like compilers. It’s only used by static analysis right now, so its value is limited. Unfortunately its position is different — necessarily so, because in the C++11 override position it would attach to the return value of the method:

class OldAndBustedDerived : public Base
{
  public:
    NS_OVERRIDE virtual void f(); // annotates the method
    __attribute__(...) virtual void g(); // its expansion
};
class Derived2 : public Base
{
  public:
    // But in the MOZ_OVERRIDE position, it would annotate
    // f()'s return value.
    virtual void f() __attribute__(...);
};

NS_OVERRIDE is now deprecated and should be replaced with MOZ_OVERRIDE. With a little work, static analysis with new-enough compilers can likely look for MOZ_OVERRIDE just as easily as for NS_OVERRIDE. And since MOZ_OVERRIDE works in non-static analysis builds, it’s arguably better in the majority of cases anyway. If you’re looking for an easy way to improve Mozilla code, changing NS_OVERRIDE uses to use MOZ_OVERRIDE would be a simple way to help.

Summary

If you’ve overridden an inherited virtual member function and you’re worried that that override might silently break at some point, annotate your override with MOZ_OVERRIDE. This will cause some compilers to enforce an override relationship, making it much less likely that your intended relationship will break.

09.11.11

Introducing MOZ_DELETE, a macro improving upon the deliberately-unimplemented method idiom

C++ default operators and the sole-ownership idiom

Often a C++ class will solely manage some value: for example, a GtkWindow* or a void* for malloc‘d memory. The class will then release ownership in its destructor as appropriate. It would be extremely problematic to release ownership multiple times — think security-vulnerability-problematic. C++ copy construction and default assignment exacerbate this issue, because C++ automatically generates these methods for all classes, even when the default behavior breaks sole-ownership. The C++98 idiom solving this is to privately declare a copy constructor and a default assignment operator, then never define them:

struct Struct
{
  private:
    Struct(const Struct& other);
    void operator=(const Struct& other);
};

Declaring the methods privately prevents any code but friends of Struct from calling them. And by never defining them, even such friends will cause a link-time error if they try.

Disabling the default operators in C++11

Once you’re familiar with this idiom it’s not too bad. But initially, it’s pretty unclear. And nothing prevents someone from actually defining these methods. (They could only be used by Struct or friends of Struct, to be sure, but for sufficiently complex code it’s possible someone might make a mistake.) C++11 improves upon this trick by introducing deleted function syntax:

struct Struct
{
  private: // no longer necessary, but doesn't hurt
    Struct(const Struct& other) = delete;
    void operator=(const Struct& other) = delete;
};

Deleted functions are effectively removed from name lookup: using, defining, or referring to a deleted function produces a compile error — far better than a link error or, even worse, no error at all.

= delete support in mfbt

The Mozilla Framework Based on Templates now includes support for declaring a function only to prevent its use (or use of an inherited version). The MOZ_DELETE macro encapsulates this support:

#include "mozilla/Types.h" // MOZ_DELETE has since moved...
#include "mozilla/Attributes.h" // ...to here

struct Struct
{
  private:
    Struct(const Struct& other) MOZ_DELETE;
    void operator=(const Struct& other) MOZ_DELETE;
};

MOZ_DELETE isn’t as readable or understandable as = delete, but it’s searchable, and the comment next to its definition will clarify matters. If the declarations are private, MOZ_DELETE is just as good as the traditional idiom, and in compilers supporting C++11 deleted functions it’s better.

Which compilers support C++11 deleted functions? I’m aware of GCC since 4.4, Clang since 2.9, and ICC since 12.0. Rightly, if unfortunately, you must specify -std=c++0x or similar to use deleted function syntax without causing a warning. For various reasons Mozilla can’t do that yet, so MOZ_DELETE only produces the C++11 syntax when compiling with Clang (where we can pass -Wno-c++0x-extensions to disable the warning). I’d love to see it use C++11 syntax in GCC and ICC as well, but I don’t have the time to solve the -std=c++0x problem now, or to figure out another workaround. I’ve filed bug 701183 for this problem — help there is much appreciated.

Summary

Use MOZ_DELETE when declaring any method you will intentionally not implement. It’ll work better, and produce better errors, in some compilers. Those compilers don’t include GCC or ICC yet, but with your help they could. Any takers?

Update, evening of November 10, 2011: I just landed further changes to make MOZ_DELETE use C++11 syntax with GCC when compiling with -std=c++0x (which we apparently do more often than I’d thought), so you should now get its goodness in GCC as well — most of the time. In some “exotic” situations we don’t compile anything with -std=c++0x, so you won’t get any benefit there. Also, the JavaScript engine is never compiled with it. So if you want this to work fully, everywhere, you should use Clang.

20.10.11

Computing the length or end of a compile-time constant-length array: {JS,NS}_ARRAY_{END,LENGTH} are out, mozilla::Array{Length,End} are in

Tags: , , , , , , , — Jeff @ 16:03

Determining the length of a fixed-length array

Suppose in C++ you want to perform variations of some simple task several times. One way to do this is to loop over the variations in an array to perform each task:

/* Defines the necessary envvars to 1. */
int setVariables()
{
  static const char* names[] = { "FOO", "BAR", "BAZ", "QUUX", "EIT", "GOATS" };
  for (int i = 0; i < 6; i++)
    if (0 > setenv(names[i], "1")) return -1;
  return 0;
}

Manually looping by index is prone to error. One particular issue with loop-by-index is that you must correctly compute the extent of iteration. Hard-coding a constant works, but what if the array changes? The constant must also change, which isn’t obvious to someone not looking carefully at all uses of the array.

The traditional way to get an array’s length is with a macro using sizeof:

#define NS_ARRAY_LENGTH(arr)  (sizeof(arr) / sizeof((arr)[0]))

This works but has problems. First, it’s a macro, which means it has the usual macro issues. For example, macros are untyped, so you can pass in “wrong” arguments to them and may not get type errors. This is the second problem: NS_ARRAY_LENGTH cheerfully accepts non-array pointers and returns a completely bogus length.

  const char* str = "long enough string";
  char* copy = (char*) malloc(NS_ARRAY_LENGTH(str)); // usually 4 or 8
  strcpy(copy, str); // buffer overflow!

Introducing mozilla::ArrayLength and mozilla::ArrayEnd

Seeing an opportunity to both kill a moderately obscure macro and to further improve the capabilities of the Mozilla Framework Based on Templates, I took it. Now, rather than use these macros, you can #include "mozilla/Util.h" and use mozilla::ArrayLength to compute the length of a compile-time array. You can also use mozilla::ArrayEnd to compute arr + ArrayLength(arr). Both these methods (not macros!) use C++ template magic to only accept arrays with compile-time-fixed length, failing to compile if something else is provided.

Limitations

Unfortunately, ISO C++ limitations make it impossible to write a method completely replacing the macro. So the macros still exist, and in rare cases they remain the correct answer.

The array can’t depend on an unnamed type (class, struct, union) or a local type

According to C++ §14.3.1 paragraph 2, “A local type [or an] unnamed type…shall not be used as a template-argument for a template type-parameter.” C++ makes this a compile error:

size_t numsLength()
{
  // unnamed struct, also locally defined
  static const struct { int i; } nums[] = { { 1 }, { 2 }, { 3 } };

  return mozilla::ArrayLength(nums);
}

It’s easy to avoid both limitations: move local types to global code, and name them.

// now defined globally, and with a name
struct Number { int i; };
size_t numsLength()
{
  static const Number nums[] = { 1, 2, 3 };
  return mozilla::ArrayLength(nums);
}

mozilla::ArrayLength(arr) isn’t a constant, NS_ARRAY_LENGTH(arr) is

Some contexts in C++ require a compile-time constant expression: template parameters, (in C++ but not C99) for local array lengths, for array lengths in typedefs, for the value of enum initializers, for static/compile-time assertions (which are usually bootstrapped off these other locations), and perhaps others. A function call, even one evaluating to a compile-time constant, is not a constant expression.

One other context doesn’t require a constant but strongly wants one: the values of static variables, inside classes and methods and out. If the value is a function call, even if it computes a constant, the compiler might make it a static initialization, delaying startup.

The long and short of it is that everything in the code below is a bad idea:

int arr[] = { 1, 2, 3, 5 };
static size_t len = ArrayLength(arr); // not an error, but don't do it
void t(JSContext* cx)
{
  js::Vector<int, ArrayLength(arr)> v(cx); // non-constant template parameter
  int local[ArrayLength(arr)]; // variadic arrays not okay in C++
  typedef int Mirror[ArrayLength(arr)]; // non-constant array length
  enum { L = ArrayLength(arr); }; // non-constant initializer
  PR_STATIC_ASSERT(4 == ArrayLength(arr)); // special case of one of the others
}

In these situations you should continue to use NS_ARRAY_LENGTH (or in SpiderMonkey, JS_ARRAY_LENGTH).

mozilla/Util.h is fragile with respect to IPDL headers, for include order

mozilla/Util.h includes mozilla/Types.h, which includes jstypes.h, which includes jsotypes.h, which defines certain fixed-width integer types: int8, int16, uint8, uint16, and so on. It happens that ipc/chromium/src/base/basictypes.h also defines these integer types — but incompatibly on Windows only. This header is, alas, included through every IPDL-generated header. In order to safely include any mfbt header in a file which also includes an IPDL-generated header, you must include the IPDL-generated header first. So when landing patches using mozilla/Util.h, watch out for Windows-specific bustage.

Removing the limitations

The limitations on the type of elements in arrays passed to ArrayLength are unavoidable limitations of C++. But C++11 removes these limitations, and compilers will likely implement support fairly quickly. When that happens we’ll be able to stop caring about the local-or-unnamed problem, not even needing to work around it.

The compile-time-constant limitation is likewise a limitation of C++. It too will go away in C++11 with the constexpr keyword. This modifier specifies that a function provided constant arguments computes a constant. The compiler must allow calls to the function that have constant arguments to be used as compile-time constants. Thus when compilers support constexpr, we can add it to the declaration of ArrayLength and begin using ArrayLength in compile-time-constant contexts. This is more low-hanging C++11 fruit that compilers will pick up soon. (Indeed, GCC 4.6 already implements it.)

Last, we have the Windows-specific #include ordering requirement. We have some ideas for getting around this problem, and we hope to have a solution soon.

A gotcha

Both these methods have a small gotcha: their behavior may not be intuitive when applied to C strings. What does sizeof("foo") evaluate to? If you think of "foo" as a string, you might say 3. But in reality "foo" is much better thought of as an array — and strings are '\0'-terminated. So actually, sizeof("foo") == 4. This was the case with NS_ARRAY_LENGTH, too, so it’s not new behavior. But if you use these methods without considering this, you might end up misusing them.

Conclusion

Avoid NS_ARRAY_LENGTH when possible, and use mozilla::ArrayLength or mozilla::ArrayEnd instead. And watch out when using them on strings, because their behavior might not be what you wanted.

(Curious how these methods are defined, and what C++ magic is used? See my next post.)

« Newer