26.12.11

Introducing mozilla/Assertions.h to mfbt

Recently I landed changes to the Mozilla Framework Based on Templates (mfbt) implementing Assertions.h, the new canonical assertions implementation in C/C++ Mozilla code.

Runtime assertions

Using assertions, a developer can efficiently detect when his code goes awry because internal invariants were broken. Mozilla has many assertion facilities. NS_ASSERTION is the oldest, but unfortunately it can be ignored, and therefore historically has been. We later introduced NS_ABORT_IF_FALSE as as an actual assertion that fails hard, and it’s now widely used. But it’s quite unwieldy, and it can’t be used by code that doesn’t want to depend on XPCOM. (Who would?)

mfbt addresses latent concerns with existing runtime assertions by introducing MOZ_ASSERT, MOZ_ASSERT_IF, MOZ_ALWAYS_TRUE, MOZ_ALWAYS_FALSE, and MOZ_NOT_REACHED macros to make performing true assertions dead simple.

MOZ_ASSERT(expr) and MOZ_ASSERT_IF(ifexpr, expr)

MOZ_ASSERT(expr) is straightforward: pass an expression as its sole argument, and in debug builds, if that expression is falsy, the assertion fails and execution halts in a debuggable way.

#include "mozilla/Assertions.h"

void frobnicate(Thing* thing)
{
  MOZ_ASSERT(thing);
  thing->frob();
}

MOZ_ASSERT_IF(ifexpr, expr) addresses the case when you want to assert something, where the check to decide whether to assert isn’t otherwise needed. You’d rather not muddy up your code by adding an #ifdef and if statement around your assertion. (MOZ_ASSERT(!ifexpr || expr) is a workaround, but it’s not very readable.) SpiderMonkey’s experience suggests Mozilla code will get good mileage from MOZ_ASSERT_IF.

#include "mozilla/Assertions.h"

class Error
{
    const char* optionalDescription;

  public:
    /* If specified, desc must not be empty. */
    Error(const char* desc = NULL)
    {
      MOZ_ASSERT_IF(desc != NULL, desc[0] != '\0');
      optionalDescription = desc;
    }
};

MOZ_ALWAYS_TRUE(expr) and MOZ_ALWAYS_FALSE(expr)

Sometimes the expression for an assertion must always be executed for its side effects, and it can’t just be executed in debug builds. MOZ_ALWAYS_TRUE(expr) and MOZ_ALWAYS_FALSE(expr) support this idiom. These macros always evaluate their argument, and in debug builds that argument is asserted truthy or falsy.

#include "mozilla/Assertions.h"

/* JS_ValueToBoolean was fallible but no longer is. */
MOZ_ALWAYS_TRUE(JS_ValueToBoolean(cx, v, &b));

MOZ_NOT_REACHED(reason)

MOZ_NOT_REACHED(reason) indicates that the given point can’t be reached during execution: simply hitting it is a bug. (Think of it as a more-explicit form of asserting false.) It takes as an argument an explanation of why that point shouldn’t have been reachable.

#include "mozilla/Assertions.h"

// ...in a language parser...
void handle(BooleanLiteralNode node)
{
  if (node.isTrue())
    handleTrueLiteral();
  else if (node.isFalse())
    handleFalseLiteral();
  else
    MOZ_NOT_REACHED("boolean literal that's not true or false?");
}

Compile-time assertions

Most assertions must happen at runtime. But some assertions are static, depending only on constants, and could be checked during compilation. A compile time check is better than a runtime check: the developer need not ensure a test exercises that code, because the compiler itself enforces the assertion. Properly crafted static assertions can never be unwittingly broken.

MOZ_STATIC_ASSERT(cond, reason)

MOZ_STATIC_ASSERT(cond, reason) asserts a condition at compile time. In newer compilers, if the assertion fails, the compiler will also include reason in diagnostics.

#include "mozilla/Assertions.h"

struct S { ... };
MOZ_STATIC_ASSERT(sizeof(S) % sizeof(size_t) == 0,
                  "S should be a multiple of word size for efficiency");

MOZ_STATIC_ASSERT is implemented with an impressive pile of hacks which should work perfectly everywhere — except, rarely, gcc 4.2 (the current OS X compiler) when compiling C++ code. The failure mode requires MOZ_STATIC_ASSERT on line N not in an extern "C" code block and a second MOZ_STATIC_ASSERT on the same line N (in a different file) in an extern "C" block. And those two files have to be used when compiling a single file, with the extern "C"‘d assertion second. This seems improbable, so we’ll risk it til we drop gcc 4.2 support.

Possible improvements

Assertions.h is reasonably complete, but I have a few ideas I’ve been considering for improvements. Let me know what you think of them in comments.

Add an optional reason argument to MOZ_ASSERT, and maybe to MOZ_ALWAYS_TRUE and MOZ_ALWAYS_FALSE

MOZ_ASSERT takes only the condition to test as an argument. In contrast NS_ASSERTION and NS_ABORT_IF_FALSE take the condition and an explanatory string. MOZ_ASSERT‘s lack of explanation derives purely from its ancestry in the JS_ASSERT macro: it wasn’t deliberate.

Would it be helpful if MOZ_ASSERT, MOZ_ALWAYS_TRUE, and MOZ_ALWAYS_FALSE optionally took a reason? (Optional because some assertions, e.g. many non-null assertions, are self-explanatory.) We’d have to disable assertions for compilers not implementing variadic macros (I think our supported compilers implement them), or possibly lose the condition expression in the assertion failure message. A reason would make it easier to convert existing NS_ABORT_IF_FALSEs to MOZ_ASSERTs. Should we add an optional second argument to MOZ_ASSERT and the others?

Include __assume(0) or __builtin_unreachable() in MOZ_NOT_REACHED

__builtin_unreachable() and __assume(0) inform the compiler that subsequent code can’t be executed, providing optimization opportunities. It’s unclear how this affects debugging feedback like stack traces. If the optimizations destroy Breakpad-ability, that may be too big a loss. More research is needed here.

Another possibility might be to use __builtin_trap(). This may not communicate an optimization opportunity comparable to that provided by the other two options. (It can’t be equally informative because execution must be able to continue past a trap. Thus the two have different impacts on variable lifetimes. Whether __builtin_trap otherwise communicates “unlikelihood” well enough isn’t clear.) Perhaps __builtin_trap could be used in debug builds, and __builtin_unreachable could be used in optimized builds. Again: more research needed.

Use C11’s _Static_assert in MOZ_STATIC_ASSERT

New editions of C and C++ include built-in static assertion syntax. MOZ_STATIC_ASSERT expands to C++11’s static_assert(2 + 2 == 4, "ya rly") syntax when possible. It could be made to expand to C11’s _Static_assert('A' == 'A', "no wai") syntax in some other cases, but frankly I don’t hack enough C code to care as long as the static assertion actually happens. 🙂 This is bug 713531 if you’re interested in investigating.

Want more information?

Read Assertions.h. mfbt code has a high standard for code comments in interface descriptions, and for file names (the current Util.h being a notable exception which will be fixed). We want it to be reasonably obvious where to find what you need and how to use it by skimming mfbt/‘s contents and then skimming a few files’ contents. Good comments are key to that. You should find Assertions.h quite readable; please file a bug if you have improvements to suggest.

15.12.11

Seen during a recent compile of mozilla-central with Clang, offered without comment

TestStartupCache.cpp
/home/jwalden/moz/inttypes/startupcache/test/TestStartupCache.cpp:119:15: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
  char* buf = "Market opportunities for BeardBook";
              ^
/home/jwalden/moz/inttypes/startupcache/test/TestStartupCache.cpp:120:14: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
  char* id = "id";
             ^
/home/jwalden/moz/inttypes/startupcache/test/TestStartupCache.cpp:148:15: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
  char* buf = "BeardBook competitive analysis";
              ^
/home/jwalden/moz/inttypes/startupcache/test/TestStartupCache.cpp:149:14: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
  char* id = "id";
             ^
/home/jwalden/moz/inttypes/startupcache/test/TestStartupCache.cpp:197:14: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
  char* id = "id";
             ^
/home/jwalden/moz/inttypes/startupcache/test/TestStartupCache.cpp:288:15: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
  char* buf = "Find your soul beardmate on BeardBook";
              ^
/home/jwalden/moz/inttypes/startupcache/test/TestStartupCache.cpp:289:14: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
  char* id = "id";
             ^
/home/jwalden/moz/inttypes/startupcache/test/TestStartupCache.cpp:323:12: warning: unused variable 'rv2' [-Wunused-variable]
  nsresult rv2;
           ^
8 warnings generated.

08.12.11

Party like it’s 1999: <stdint.h> comes to Mozilla!

tl;dr

Need an integer type with guaranteed size? If you’re not defining a cross-file interface, #include "mozilla/StdInt.h"#include "mozilla/StandardInteger.h" and use uint32_t or any other type defined by <stdint.h>. (If you are defining an interface, use PRUint32 and similar — for now.) mozilla/StdInt.hmozilla/StandardInteger.h is a cross-platform implementation of <stdint.h>‘s functionality usable in any code.

Embedders may find that the mozilla/StdInt.hmozilla/StandardInteger.h typedefs conflict with ones they have already been using. To work around this conflict, write a stdint.h compatible with the embedding’s typedefs (more likely: adapt an existing implementation), then set the preprocessor variable MOZ_CUSTOM_STDINT_H to a quoted path to that reimplementation when mozilla/StdInt.hmozilla/StandardInteger.h is included. It may be simplest to add this to command line flags when invoking the compiler.

Fixed-size integer types

Fixed-size integer types are signed or unsigned types with exactly N bits. They contrast with the built-in C and C++ types (char, short, int, &c.) with compiler-dependent sizes. Fixed-size integer types are quite useful:

  • They work well when serializing an object to a sequence of bytes, where the size of a serialized item must be constant for correctness.
  • They minimize memory use in classes and structs, also making padding-based waste more obvious.
  • They work well in cross-platform APIs, eliminating the challenge of implementing correct behavior when types have different sizes across platforms.

Fixed-size integer types are useful in the same way size_t, off_t, and other non-built-in types are: they fit some problem domains better than built-in types.

C99 and C++11 finally standardized fixed-size integer types in <stdint.h> and <cstdint>. They define {u,}int{8,16,32,64}_t types, plus useful constants for their limits (INT8_MIN, INT8_MAX, UINT8_MAX, INT16_MIN, &c.). One would expect projects to quickly use these types, but it didn’t happen.

Old projects predating <stdint.h> have been particularly slow to adopt it. Many such projects already rolled their own non-<stdint.h>-named fixed-size integer types; switching would be a hassle. And not all compilers shipped <stdint.h>: Visual Studio didn’t have it until 2010! (ಠ_ಠ) Projects implementing their own <stdint.h>-compatible types posed another problem, because different projects’ implementations might be incompatible.

New projects fare better, but not always. Sometimes their dependence on old projects anchors them to the old, pre-<stdint.h> world.

Fixed-size integer types in Mozilla

Mozilla sits squarely in the old-project category, facing every rationale noted above for using its own types. These have long been NSPR‘s PR{Ui,I}nt{8,16,32,64} and SpiderMonkey’s {u,}int{8,16,32,64} types. But recently the landscape has changed.

As Mozilla has imported more external code, fixed-size integer types have proliferated. Most imported code uses <stdint.h>, with fallback typedefs for Visual Studio; some code (IPC code from Chromium) defines and uses {u,}int{8,16,32,64}. As type definitions have proliferated, surprising problems have arisen.

The woes of multiplicity

#include-order issues are the simplest problem. For example, the IPC uint32-style definitions are incompatible with SpiderMonkey’s definitions with some compilers. #include IPC and SpiderMonkey headers in the wrong order, and the compiler will error on incompatible typedefs. This problem is easy to diagnose but harder to resolve, and sometimes it causes considerable pain. Mass refactorings that add #includes have fallen afoul of this, with the least bad solution usually being to fix the first error, recompile, and repeat until done (twenty-odd times in one instance).

Worse than #include mis-ordering and conflict are linking problems. int and long could be 32-bit integers yet appear different when linked. Suppose a method taking an int32 argument defined int32 = int during compilation, but a user of it saw int32 = long during compilation. Each alone would compile. But beneath typedefs they’d be incompatible and wouldn’t link.

As we’ve imported more code in Mozilla, more and more developers have been bitten by these problems. We’ve reached a breaking point. We could use PRUint32, JSUint32, and other types which never trample upon each other. Yet no one likes them given the standardized types, and it’s not possible to change “upstream” code to such a scheme. Thus a second solution: use <stdint.h> definitions for everything.

Switching to <stdint.h>

Using the <stdint.h> types in Mozilla code is now as simple as #include "mozilla/StdInt.h"#include "mozilla/StandardInteger.h". mozilla/StdInt.hmozilla/StandardInteger.h implements the <stdint.h> interface even in the edge cases: for compilers not supporting it, and for embedders who can’t use the regular definitions. It works as follows:

  1. If the preprocessor definition MOZ_CUSTOM_STDINT_H is defined, then #include MOZ_CUSTOM_STDINT_H. Embedders who can’t use the default definitions should use this to adapt. (MOZ_CUSTOM_STDINT_H may also be passed into the Mozilla build system using an environment variable. Note that while the preprocessor definition must be a quoted path, the environment variable must be an unquoted path.)
  2. Otherwise, if the compiler doesn’t provide <stdint.h>, use a custom implementation. This is currently limited to Visual Studio prior to 2010, using an implementation imported from msinttypes on Google Code.
  3. Otherwise use <stdint.h>.

We’re only providing these types now, but shortly we’ll start switching code using non-<stdint.h> fixed-size integer types to use them. Adding mozilla/StdInt.hmozilla/StandardInteger.h is merely the first step toward removing the other fixed-size integer types (except when they’re necessary to interact with external libraries).

Conclusion

It’s been a dozen years since <stdint.h> was standardized. Now it finally comes to Mozilla. Let’s lower a barrier to hackability in Mozilla and start using it.

26.11.11

Introducing MOZ_FINAL: prevent inheriting from a class, or prevent overriding a virtual function

Tags: , , , , , , , — Jeff @ 09:17

The inexorable march of progress continues in the Mozilla Framework Based on Templates with the addition of MOZ_FINAL, through which you can limit various forms of inheritance in C++.

Traditional C++ inheritance mostly can’t be controlled

In C++98 any class can be inherited. (An extremely obscure pattern will prevent this, but it has down sides.) Sometimes this makes sense: it’s natural to subclass an abstract List class as LinkedList, or LinkedList as CircularLinkedList. But sometimes this doesn’t make sense. StringBuilder certainly could inherit from Vector<char>, but doing so might expose many Vector methods that don’t belong in the StringBuilder concept. It would be more sensible for StringBuilder to contain a private Vector<char> which StringBuilder member methods manipulated. Preventing Vector from being used as a base class would be one way (not necessarily the best one) to avoid this conceptual error. But C++98 doesn’t let you easily do that.

Even when inheritance is desired, sometimes you don’t want completely-virtual methods. Sometimes you’d like a class to implement a virtual method (virtual perhaps because its base declared it so) which derived classes can’t override. Perhaps you want to rely on that method being implemented only by your base class, or perhaps you want it “fixed” as an optimization. Again in C++98, you can’t do this: public virtual functions are overridable.

C++11’s contextual final keyword

C++11 introduces a contextual final keyword for these purposes. To prevent a class from being inheritable, add final to its definition just after the class name (the class can’t be unnamed).

struct Base1 final
{
  virtual void foo();
};

// ERROR: can't inherit from B.
struct Derived1 : public Base1 { };

struct Base2 { };

// Derived classes can be final too.
struct Derived2 final : public Base2 { };

Similarly, a virtual member function can be marked as not overridable by placing the contextual final keyword at the end of its declaration, before a terminating semicolon, body, or = 0.

struct Base
{
  virtual void foo() final;
};

struct Derived : public Base
{
  // ERROR: Base::foo was final.
  virtual void foo() { }
};

Introducing MOZ_FINAL

mfbt now includes support for marking classes and virtual member functions as final using the MOZ_FINAL macro in mozilla/Attributes.h. Simply place it in the same position as final would occur in the C++11 syntax:

#include "mozilla/Attributes.h"

class Base
{
  public:
    virtual void foo();
};

class Derived final : public Base
{
  public:
    /*
     * MOZ_FINAL and MOZ_OVERRIDE are composable; as a matter of
     * style, they should appear in the order MOZ_FINAL MOZ_OVERRIDE,
     * not the other way around.
     */
    virtual void foo() MOZ_FINAL MOZ_OVERRIDE { }
    virtual void bar() MOZ_FINAL;
    virtual void baz() MOZ_FINAL = 0;
};

MOZ_FINAL expands to the C++11 syntax or its compiler-specific equivalent whenever possible, turning violations of final semantics into compile-time errors. The same compilers that usefully expand MOZ_OVERRIDE also usefully expand MOZ_FINAL, so misuse will be quickly noted.

One interesting use for MOZ_FINAL is to tell the compiler that one worrisome C++ trick sometimes isn’t. This is the virtual-methods-without-virtual-destructor trick. It’s used when a class must have virtual functions, but code doesn’t want to pay the price of destruction having virtual-call overhead.

class Base
{
  public:
    virtual void method() { }
    ~Base() { }
};

void destroy(Base* b)
{
  delete b; // may cause a warning
}

Some compilers warn when they instantiate a class with virtual methods but without a virtual destructor. Other compilers only emit this warning when a pointer to a class instance is deleted. The reason is that in C++, behavior is undefined if the static type of the instance being deleted isn’t the same as its runtime type and its destructor isn’t virtual. In other words, if ~Base() is non-virtual, destroy(new Base) is perfectly fine, but destroy(new DerivedFromBase) is not. The warning makes sense if destruction might miss a base class — but if the class is marked final, it never will! Clang silences its warning if the class is final, and I hope that MSVC will shortly do the same.

What about NS_FINAL_CLASS?

As with MOZ_OVERRIDE we had a gunky XPCOM static-analysis NS_FINAL_CLASS macro for final classes. (We had no equivalent for final methods.) NS_FINAL_CLASS too was misplaced as far as C++11 syntax was concerned, and it too has been deprecated. Almost all uses of NS_FINAL_CLASS have now been removed (the one remaining use I’ve left for the moment due to an apparent Clang bug I haven’t tracked down yet), and it shouldn’t be used.

(Side note: In replacing NS_FINAL_CLASS with MOZ_FINAL, I discovered that some of the existing annotations have been buggy for months! Clearly no one’s done static analysis builds in awhile. The moral of the story: compiler static analyses that happen for every single build are vastly superior to user static analyses that happen only in special builds.)

Summary

If you don’t want a class to be inheritable, add MOZ_FINAL to its definition after the class name. If you don’t want a virtual member function to be overridden in derived classes, add MOZ_FINAL at the end of its declaration. Some compilers will then enforce your wishes, and you can rely on these requirements rather than hope for the best.

16.11.11

Introducing MOZ_OVERRIDE to annotate virtual functions which override base-class virtual functions

Tags: , , , , , , , — Jeff @ 09:55

Overriding inherited virtual functions

One way C++ supports code reuse is through inheritance. One base class implements common functionality. Then other classes inherit from it, essentially copying functionality from it. These other classes can add their own new functionality, or, more powerfully, they can override the base class functionality.

class Base
{
  public:
    virtual const char* type() { return "Base"; }
};
class Derived : public Base
{
  public:
    virtual const char* type() { return "Derived"; }
};

Overriding base class functionality is simple. Keeping such overrides working correctly is sometimes harder. The problem is that the override relationship is implicit: if the override doesn’t exactly match the signature of the desired function in the base class, it may not work correctly.

class Base
{
  public:
    // Perhaps as part of an incomplete refactoring,
    // the base class's function changed its name.
    virtual const char* kind() { return "Base"; }
};
class DerivedIncorrectly : public Base
{
  public:
    virtual const char* type() { return "Derived"; }
};

// BAD: code expecting kind() to work and sometimes
// indicate Derived-ness no longer will.

Making the override relationship explicit

Some languages (Scala, C#, probably others) provide the ability to mark a derived class’s function as an override of an inherited function. C++98 included no such ability, but C++11 does, through the contextual override keyword. When override is used, that virtual member function must override one found on a base class. If it does not, it is a compile error.

class Base
{
  public:
    virtual const char* kind() { return "Base"; }
};
class DerivedIncorrectly : public Base
{
  public:
    // This will cause a compile error: there's no type()
    // method on Base that this overrides.
    virtual const char* type() override { return "Derived"; }

    // This will work as intended.
    virtual const char* kind() override { return "Derived"; }
};

Introducing MOZ_OVERRIDE

The Mozilla Framework Based on Templates now includes support for the C++11 contextual override keyword, encapsulated in the MOZ_OVERRIDE macro in mozilla/Types.hmozilla/Attributes.h. Simply place it at the end of the declaration of the relevant method, before any = 0 or method body, like so:

#include "mozilla/Types.h" // MOZ_OVERRIDE has since moved...
#include "mozilla/Attributes.h" // ...to here

class Base
{
  public:
    virtual void f() = 0;
};
class Derived1 : public Base
{
  public:
    virtual void f() MOZ_OVERRIDE;
};
class Derived2 : public Base
{
  public:
    virtual void f() MOZ_OVERRIDE = 0;
};
class Derived3 : public Base
{
  public:
    virtual void f() MOZ_OVERRIDE { }
};

MOZ_OVERRIDE will expand to use the C++11 construct in compilers which support it. Thus in such compilers misuse of MOZ_OVERRIDE is an error. Even better, some of the compilers used by tinderbox support override, so in many cases tinderbox will detect misuse. (Specifically, MSVC++ 2005 and later support it, so errors in cross-platform and Windows code won’t pass tinderbox . Much more recent versions of GCC and Clang support it as well, but these versions are too new for tinderbox to have picked them up yet — in the case of GCC too new to even have been released yet. 🙂 )

What about NS_OVERRIDE?

It turns out there’s already a macro annotation to indicate an override relationship: NS_OVERRIDE. This gunky XPCOM macro expands to a user attribute under gcc-like compilers. It’s only used by static analysis right now, so its value is limited. Unfortunately its position is different — necessarily so, because in the C++11 override position it would attach to the return value of the method:

class OldAndBustedDerived : public Base
{
  public:
    NS_OVERRIDE virtual void f(); // annotates the method
    __attribute__(...) virtual void g(); // its expansion
};
class Derived2 : public Base
{
  public:
    // But in the MOZ_OVERRIDE position, it would annotate
    // f()'s return value.
    virtual void f() __attribute__(...);
};

NS_OVERRIDE is now deprecated and should be replaced with MOZ_OVERRIDE. With a little work, static analysis with new-enough compilers can likely look for MOZ_OVERRIDE just as easily as for NS_OVERRIDE. And since MOZ_OVERRIDE works in non-static analysis builds, it’s arguably better in the majority of cases anyway. If you’re looking for an easy way to improve Mozilla code, changing NS_OVERRIDE uses to use MOZ_OVERRIDE would be a simple way to help.

Summary

If you’ve overridden an inherited virtual member function and you’re worried that that override might silently break at some point, annotate your override with MOZ_OVERRIDE. This will cause some compilers to enforce an override relationship, making it much less likely that your intended relationship will break.

Older »