21.05.18

PSA: stop using mozilla::PodZero and mozilla::PodArrayZero

I’ve blogged about surprising bits of the C++ object model before, and I’m back with more.

Executive summary: Don’t use mozilla::PodZero or mozilla::PodArrayZero. Modern C++ provides better alternatives that don’t presume that writing all zeroes will always correctly initialize the given type. Use constructors, in-class member initializers, and functions like std::fill to zero member fields.

The briefest recap of salient parts of the C++ object model

C++ as a language really wants to know when objects are created so that compilers can know that this memory contains an object of this type. Compilers then can assume that writing an object of one type, won’t conflict with reads/writes of incompatible types.

double foo(double* d, int* i, int z)
{
  *d = 3.14;

  // int/double are incompatible, so this write may be
  // assumed not to change the value of *d.
  *i = z;

  // Therefore *d may be assumed to still be 3.14, so this
  // may be compiled as 3.14 * z without rereading *d.
  return *d * z;
}

You can’t use arbitrary memory as your desired type after a cast. An object of that type must have been explicitly created there: e.g. a local variable of that type must be declared there, a field of that type must be defined and the containing object created, the object must be created via new, &c.

Misinterpreting an object using an incompatible type violates the strict aliasing rules in [basic.lval]p11.

memsetting an object

memset lets you write characters over memory. C code routinely used this to fill an array or struct with zeroes or null pointers or similar, assuming all-zeroes writes the right value.

C++ code also sometimes uses memset to zero out an object, either after allocating its memory or in the constructor. This doesn’t create a T (you’d need to placement-new), but it often still “works”. But what if T changes to require initialization? Maybe a field in T gains a constructor (T might never be touched!) or a nonzero initializer, making T a non-trivial type. memset could hide that fresh initialization requirement or (depending when the memset happens) overwrite a necessary initialization.

Problem intensifies

Unfortunately, Mozilla code has provided and promoted a PodZero function that misuses memset this way. So when I built with gcc 8.0 recently (I usually use a home-built clang), I discovered a torrent of build warnings about memset misuse on non-trivial types. A taste:

In file included from /home/jwalden/moz/after/js/src/jit/BitSet.h:12,
                 from /home/jwalden/moz/after/js/src/jit/Safepoints.h:10,
                 from /home/jwalden/moz/after/js/src/jit/JitFrames.h:13,
                 from /home/jwalden/moz/after/js/src/jit/BaselineFrame.h:10,
                 from /home/jwalden/moz/after/js/src/vm/Stack-inl.h:15,
                 from /home/jwalden/moz/after/js/src/vm/Debugger-inl.h:12,
                 from /home/jwalden/moz/after/js/src/vm/DebuggerMemory.cpp:29,
                 from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src32.cpp:2:
/home/jwalden/moz/after/js/src/jit/JitAllocPolicy.h: In instantiation of ‘T* js::jit::JitAllocPolicy::maybe_pod_calloc(size_t) [with T = js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >; size_t = long unsigned int]’:
/home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:1293:63:   required from ‘static js::detail::HashTable<T, HashPolicy, AllocPolicy>::Entry* js::detail::HashTable<T, HashPolicy, AllocPolicy>::createTable(AllocPolicy&, uint32_t, js::detail::HashTable<T, HashPolicy, AllocPolicy>::FailureBehavior) [with T = js::HashMapEntry<JS::Value, unsigned int>; HashPolicy = js::HashMap<JS::Value, unsigned int, js::jit::LIRGraph::ValueHasher, js::jit::JitAllocPolicy>::MapHashPolicy; AllocPolicy = js::jit::JitAllocPolicy; js::detail::HashTable<T, HashPolicy, AllocPolicy>::Entry = js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >; uint32_t = unsigned int]’
/home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:1361:28:   required from ‘bool js::detail::HashTable<T, HashPolicy, AllocPolicy>::init(uint32_t) [with T = js::HashMapEntry<JS::Value, unsigned int>; HashPolicy = js::HashMap<JS::Value, unsigned int, js::jit::LIRGraph::ValueHasher, js::jit::JitAllocPolicy>::MapHashPolicy; AllocPolicy = js::jit::JitAllocPolicy; uint32_t = unsigned int]’
/home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:92:69:   required from ‘bool js::HashMap<Key, Value, HashPolicy, AllocPolicy>::init(uint32_t) [with Key = JS::Value; Value = unsigned int; HashPolicy = js::jit::LIRGraph::ValueHasher; AllocPolicy = js::jit::JitAllocPolicy; uint32_t = unsigned int]’
/home/jwalden/moz/after/js/src/jit/LIR.h:1901:38:   required from here
/home/jwalden/moz/after/js/src/jit/JitAllocPolicy.h:101:19: warning: ‘void* memset(void*, int, size_t)’ clearing an object of type ‘class js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >’ with no trivial copy-assignment [-Wclass-memaccess]
             memset(p, 0, numElems * sizeof(T));
             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/jwalden/moz/after/js/src/dbg/dist/include/js/TracingAPI.h:11,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/GCPolicyAPI.h:47,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/RootingAPI.h:22,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/CallArgs.h:73,
                 from /home/jwalden/moz/after/js/src/jsapi.h:29,
                 from /home/jwalden/moz/after/js/src/vm/DebuggerMemory.h:10,
                 from /home/jwalden/moz/after/js/src/vm/DebuggerMemory.cpp:7,
                 from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src32.cpp:2:
/home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:794:7: note: ‘class js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >’ declared here
 class HashTableEntry
       ^~~~~~~~~~~~~~
Unified_cpp_js_src36.o
In file included from /home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:19,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/TracingAPI.h:11,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/GCPolicyAPI.h:47,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/RootingAPI.h:22,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/CallArgs.h:73,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/CallNonGenericMethod.h:12,
                 from /home/jwalden/moz/after/js/src/NamespaceImports.h:15,
                 from /home/jwalden/moz/after/js/src/gc/Barrier.h:10,
                 from /home/jwalden/moz/after/js/src/vm/ArgumentsObject.h:12,
                 from /home/jwalden/moz/after/js/src/vm/GeneratorObject.h:10,
                 from /home/jwalden/moz/after/js/src/vm/GeneratorObject.cpp:7,
                 from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src33.cpp:2:
/home/jwalden/moz/after/js/src/dbg/dist/include/mozilla/PodOperations.h: In instantiation of ‘void mozilla::PodZero(T*) [with T = js::NativeIterator]’:
/home/jwalden/moz/after/js/src/vm/Iteration.cpp:578:15:   required from here
/home/jwalden/moz/after/js/src/dbg/dist/include/mozilla/PodOperations.h:32:9: warning: ‘void* memset(void*, int, size_t)’ clearing an object of type ‘struct js::NativeIterator’ with no trivial copy-assignment; use assignment or value-initialization instead [-Wclass-memaccess]
   memset(aT, 0, sizeof(T));
   ~~~~~~^~~~~~~~~~~~~~~~~~
In file included from /home/jwalden/moz/after/js/src/vm/JSCompartment-inl.h:14,
                 from /home/jwalden/moz/after/js/src/vm/JSObject-inl.h:32,
                 from /home/jwalden/moz/after/js/src/vm/ArrayObject-inl.h:15,
                 from /home/jwalden/moz/after/js/src/vm/GeneratorObject.cpp:11,
                 from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src33.cpp:2:
/home/jwalden/moz/after/js/src/vm/Iteration.h:32:8: note: ‘struct js::NativeIterator’ declared here
 struct NativeIterator
        ^~~~~~~~~~~~~~

Fixing the problem by not using mozilla::PodZero

Historically you’d have to add every single member-initialization to your constructor, duplicating names and risking missing one, but C+11’s in-class initializers allow an elegant fix:

// Add " = nullptr" to initialize these function pointers.
struct AsmJSCacheOps
{
    OpenAsmJSCacheEntryForReadOp openEntryForRead = nullptr;
    CloseAsmJSCacheEntryForReadOp closeEntryForRead = nullptr;
    OpenAsmJSCacheEntryForWriteOp openEntryForWrite = nullptr;
    CloseAsmJSCacheEntryForWriteOp closeEntryForWrite = nullptr;
};

As long as you invoke a constructor, the members will be initialized. (Constructors can initialize a member to override in-class initializers.)

List-initialization using {} is also frequently helpful: you can use it to zero trailing (or all) members of an array or struct without naming/providing them:

class PreliminaryObjectArray
{
  public:
    static const uint32_t COUNT = 20;

  private:
    // All objects with the type which have been allocated. The pointers in
    // this array are weak.
    JSObject* objects[COUNT] = {}; // zeroes

  public:
    PreliminaryObjectArray() = default;

    // ...
};

Finally, C++ offers iterative-mutation functions to fill a container:

#include <algorithm>

// mozilla::Array's default constructor doesn't initialize array
// contents unless the element type is a class with a default
// constructor, and no Array overload exists to zero every
// element.  (You could pass 1024 zeroes, but....)
mozilla::Array<uint32_t, 1024> page; // array contents undefined

std::fill(page.begin(), page.end(), 0); // now contains zeroes
std::fill_n(page.begin(), page.end() - page.begin(), 0); // alternatively

After a long run of fixes to sundry bits of SpiderMonkey code to fix every last one of these issues last week, I’ve returned SpiderMonkey to warning-free with gcc (excluding imported ICU code). The only serious trickiness I ran into was a function of very unusual SpiderMonkey needs that shouldn’t affect code generally.

Fixing these issues is generally very doable. As people update to newer and newer gcc to build, the new -Wclass-memaccess warning that told me about these issues will bug more and more people, and I’m confident all these problems triggered by PodZero can be fixed.

mozilla::PodZero and mozilla::PodArrayZero are deprecated

PodZero and its array-zeroing variant PodArrayZero are ill-fitted to modern C++ and modern compilers. C++ now offers clean, type-safe ways to initialize memory to zeroes. You should avoid using PodZero and PodArrayZero in new code, replacing it with the initializer syntaxes mentioned above or with standard C++ algorithms to fill in zeroes.

As PodZero is used in a ton of places right now, it’ll likely stick around for some time. But there’s a good chance I’ll rename it to DeprecatedPodZero to highlight its badness and the desire to remove it. You should replace existing uses of it wherever and whenever you can.