21.05.18 PSA: stop using `mozilla::PodZero` and `mozilla::PodArrayZero`

Tags: c++, c++11, compilers, list-initialization, memset, mozilla, pod, podzero — Jeff @ 13:08

I’ve blogged about surprising bits of the C++ object model before, and I’m back with more.

Executive summary: Don’t use mozilla::PodZero or mozilla::PodArrayZero. Modern C++ provides better alternatives that don’t presume that writing all zeroes will always correctly initialize the given type. Use constructors, in-class member initializers, and functions like std::fill to zero member fields.

The briefest recap of salient parts of the C++ object model

C++ as a language really wants to know when objects are created so that compilers can know that this memory contains an object of this type. Compilers then can assume that writing an object of one type, won’t conflict with reads/writes of incompatible types.

double foo(double* d, int* i, int z)
{
  *d = 3.14;

  // int/double are incompatible, so this write may be
  // assumed not to change the value of *d.
  *i = z;

  // Therefore *d may be assumed to still be 3.14, so this
  // may be compiled as 3.14 * z without rereading *d.
  return *d * z;
}

You can’t use arbitrary memory as your desired type after a cast. An object of that type must have been explicitly created there: e.g. a local variable of that type must be declared there, a field of that type must be defined and the containing object created, the object must be created via new, &c.

Misinterpreting an object using an incompatible type violates the strict aliasing rules in [basic.lval]p11.

`memset`ting an object

memset lets you write characters over memory. C code routinely used this to fill an array or struct with zeroes or null pointers or similar, assuming all-zeroes writes the right value.

C++ code also sometimes uses memset to zero out an object, either after allocating its memory or in the constructor. This doesn’t create a T (you’d need to placement-new), but it often still “works”. But what if T changes to require initialization? Maybe a field in T gains a constructor (T might never be touched!) or a nonzero initializer, making T a non-trivial type. memset could hide that fresh initialization requirement or (depending when the memset happens) overwrite a necessary initialization.

Problem intensifies

Unfortunately, Mozilla code has provided and promoted a PodZero function that misuses memset this way. So when I built with gcc 8.0 recently (I usually use a home-built clang), I discovered a torrent of build warnings about memset misuse on non-trivial types. A taste:

In file included from /home/jwalden/moz/after/js/src/jit/BitSet.h:12,
                 from /home/jwalden/moz/after/js/src/jit/Safepoints.h:10,
                 from /home/jwalden/moz/after/js/src/jit/JitFrames.h:13,
                 from /home/jwalden/moz/after/js/src/jit/BaselineFrame.h:10,
                 from /home/jwalden/moz/after/js/src/vm/Stack-inl.h:15,
                 from /home/jwalden/moz/after/js/src/vm/Debugger-inl.h:12,
                 from /home/jwalden/moz/after/js/src/vm/DebuggerMemory.cpp:29,
                 from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src32.cpp:2:
/home/jwalden/moz/after/js/src/jit/JitAllocPolicy.h: In instantiation of ‘T* js::jit::JitAllocPolicy::maybe_pod_calloc(size_t) [with T = js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >; size_t = long unsigned int]’:
/home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:1293:63:   required from ‘static js::detail::HashTable<T, HashPolicy, AllocPolicy>::Entry* js::detail::HashTable<T, HashPolicy, AllocPolicy>::createTable(AllocPolicy&, uint32_t, js::detail::HashTable<T, HashPolicy, AllocPolicy>::FailureBehavior) [with T = js::HashMapEntry<JS::Value, unsigned int>; HashPolicy = js::HashMap<JS::Value, unsigned int, js::jit::LIRGraph::ValueHasher, js::jit::JitAllocPolicy>::MapHashPolicy; AllocPolicy = js::jit::JitAllocPolicy; js::detail::HashTable<T, HashPolicy, AllocPolicy>::Entry = js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >; uint32_t = unsigned int]’
/home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:1361:28:   required from ‘bool js::detail::HashTable<T, HashPolicy, AllocPolicy>::init(uint32_t) [with T = js::HashMapEntry<JS::Value, unsigned int>; HashPolicy = js::HashMap<JS::Value, unsigned int, js::jit::LIRGraph::ValueHasher, js::jit::JitAllocPolicy>::MapHashPolicy; AllocPolicy = js::jit::JitAllocPolicy; uint32_t = unsigned int]’
/home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:92:69:   required from ‘bool js::HashMap<Key, Value, HashPolicy, AllocPolicy>::init(uint32_t) [with Key = JS::Value; Value = unsigned int; HashPolicy = js::jit::LIRGraph::ValueHasher; AllocPolicy = js::jit::JitAllocPolicy; uint32_t = unsigned int]’
/home/jwalden/moz/after/js/src/jit/LIR.h:1901:38:   required from here
/home/jwalden/moz/after/js/src/jit/JitAllocPolicy.h:101:19: warning: ‘void* memset(void*, int, size_t)’ clearing an object of type ‘class js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >’ with no trivial copy-assignment [-Wclass-memaccess]
             memset(p, 0, numElems * sizeof(T));
             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/jwalden/moz/after/js/src/dbg/dist/include/js/TracingAPI.h:11,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/GCPolicyAPI.h:47,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/RootingAPI.h:22,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/CallArgs.h:73,
                 from /home/jwalden/moz/after/js/src/jsapi.h:29,
                 from /home/jwalden/moz/after/js/src/vm/DebuggerMemory.h:10,
                 from /home/jwalden/moz/after/js/src/vm/DebuggerMemory.cpp:7,
                 from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src32.cpp:2:
/home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:794:7: note: ‘class js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >’ declared here
 class HashTableEntry
       ^~~~~~~~~~~~~~
Unified_cpp_js_src36.o
In file included from /home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:19,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/TracingAPI.h:11,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/GCPolicyAPI.h:47,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/RootingAPI.h:22,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/CallArgs.h:73,
                 from /home/jwalden/moz/after/js/src/dbg/dist/include/js/CallNonGenericMethod.h:12,
                 from /home/jwalden/moz/after/js/src/NamespaceImports.h:15,
                 from /home/jwalden/moz/after/js/src/gc/Barrier.h:10,
                 from /home/jwalden/moz/after/js/src/vm/ArgumentsObject.h:12,
                 from /home/jwalden/moz/after/js/src/vm/GeneratorObject.h:10,
                 from /home/jwalden/moz/after/js/src/vm/GeneratorObject.cpp:7,
                 from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src33.cpp:2:
/home/jwalden/moz/after/js/src/dbg/dist/include/mozilla/PodOperations.h: In instantiation of ‘void mozilla::PodZero(T*) [with T = js::NativeIterator]’:
/home/jwalden/moz/after/js/src/vm/Iteration.cpp:578:15:   required from here
/home/jwalden/moz/after/js/src/dbg/dist/include/mozilla/PodOperations.h:32:9: warning: ‘void* memset(void*, int, size_t)’ clearing an object of type ‘struct js::NativeIterator’ with no trivial copy-assignment; use assignment or value-initialization instead [-Wclass-memaccess]
   memset(aT, 0, sizeof(T));
   ~~~~~~^~~~~~~~~~~~~~~~~~
In file included from /home/jwalden/moz/after/js/src/vm/JSCompartment-inl.h:14,
                 from /home/jwalden/moz/after/js/src/vm/JSObject-inl.h:32,
                 from /home/jwalden/moz/after/js/src/vm/ArrayObject-inl.h:15,
                 from /home/jwalden/moz/after/js/src/vm/GeneratorObject.cpp:11,
                 from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src33.cpp:2:
/home/jwalden/moz/after/js/src/vm/Iteration.h:32:8: note: ‘struct js::NativeIterator’ declared here
 struct NativeIterator
        ^~~~~~~~~~~~~~

Fixing the problem by not using `mozilla::PodZero`

Historically you’d have to add every single member-initialization to your constructor, duplicating names and risking missing one, but C+11’s in-class initializers allow an elegant fix:

// Add " = nullptr" to initialize these function pointers.
struct AsmJSCacheOps
{
    OpenAsmJSCacheEntryForReadOp openEntryForRead = nullptr;
    CloseAsmJSCacheEntryForReadOp closeEntryForRead = nullptr;
    OpenAsmJSCacheEntryForWriteOp openEntryForWrite = nullptr;
    CloseAsmJSCacheEntryForWriteOp closeEntryForWrite = nullptr;
};

As long as you invoke a constructor, the members will be initialized. (Constructors can initialize a member to override in-class initializers.)

List-initialization using {} is also frequently helpful: you can use it to zero trailing (or all) members of an array or struct without naming/providing them:

class PreliminaryObjectArray
{
  public:
    static const uint32_t COUNT = 20;

  private:
    // All objects with the type which have been allocated. The pointers in
    // this array are weak.
    JSObject* objects[COUNT] = {}; // zeroes

  public:
    PreliminaryObjectArray() = default;

    // ...
};

Finally, C++ offers iterative-mutation functions to fill a container:

#include <algorithm>

// mozilla::Array's default constructor doesn't initialize array
// contents unless the element type is a class with a default
// constructor, and no Array overload exists to zero every
// element.  (You could pass 1024 zeroes, but....)
mozilla::Array<uint32_t, 1024> page; // array contents undefined

std::fill(page.begin(), page.end(), 0); // now contains zeroes
std::fill_n(page.begin(), page.end() - page.begin(), 0); // alternatively

After a long run of fixes to sundry bits of SpiderMonkey code to fix every last one of these issues last week, I’ve returned SpiderMonkey to warning-free with gcc (excluding imported ICU code). The only serious trickiness I ran into was a function of very unusual SpiderMonkey needs that shouldn’t affect code generally.

Fixing these issues is generally very doable. As people update to newer and newer gcc to build, the new -Wclass-memaccess warning that told me about these issues will bug more and more people, and I’m confident all these problems triggered by PodZero can be fixed.

`mozilla::PodZero` and `mozilla::PodArrayZero` are deprecated

PodZero and its array-zeroing variant PodArrayZero are ill-fitted to modern C++ and modern compilers. C++ now offers clean, type-safe ways to initialize memory to zeroes. You should avoid using PodZero and PodArrayZero in new code, replacing it with the initializer syntaxes mentioned above or with standard C++ algorithms to fill in zeroes.

As PodZero is used in a ton of places right now, it’ll likely stick around for some time. But there’s a good chance I’ll rename it to DeprecatedPodZero to highlight its badness and the desire to remove it. You should replace existing uses of it wherever and whenever you can.

Comments (0)

07.05.17 Back in a bit

Tags: leon trotsky, mozilla, pct, snow, vacation — Jeff @ 05:00

Woman with an anxious look on her face, sitting in a seat on an airplane -- from that great classic, Airplane! — I’ve gotta get out of here

I tend to take either relatively brief vacations (a day or two at a time) or very long ones. Brief vacations serve specific purposes, so they only mentally recharge me a little. Only long vacations let me set my head straight. Recent long vacations have been:

2008: thru-hiking the Appalachian Trail, over nearly five months;
2010: thru-hiking the John Muir Trail, over two and a half weeks;
2011: hiking/walking/pub-crawling the Coast to Coast Walk across northern England with my dad, over a few weeks;
2012: biking across the United States, over thirty-seven days;
2014: biking from the Bay Area to the eastern end of the Phoenix metro area for my grandma’s 90th birthday, over eight days.

I haven’t taken any long trips to unwind since 2014. Injuries (a persistent high ankle sprain ultimately requiring arthroscopic surgery, a stress fracture to the same foot) are partly to blame. Regardless, I haven’t fully decompressed in a very long time.

During the last weeks of the A.T. thru-hike, I stayed at a hostel with a Pacific Crest Trail thru-hiker guidebook. When I finished reading it, I knew I would hike the PCT. I wasn’t sure when, but I knew it would happen.

This year’s the year.

Conventional wisdom holds that if you can hike the ~2175mi A.T. in N months, the ~2650mi PCT will take N – 1 months. (The A.T. is a rugged trail of rocks and roots; the PCT is a well-graded horse trail.) Obviously this breaks down eventually, and I suspect a relatively-fast 137-day A.T. pace passes that point. I’m guessing I’ll need four months: 240 hours of PTO (Nᴇᴡ Hɪɢʜ Sᴄᴏʀᴇ!), then three months’ unpaid leave, including a cushion. I’m guessing I’ll be done by mid-September.

The A.T. is generally non-technical. No special equipment is required (except during snow extremes at the north end). Civilization is almost always nearby. The PCT is more technical, with a trail unmarked by omnipresent white blazes or signs regularly identifying the PCT. Lingering snow may completely obscure the trail. Swollen, icy creek crossings present dangers I only approached a single day on the A.T. Resupply locations are less frequent and comprehensive. It’s necessary to mail food to oneself at certain points, to resupply at all at them.

New equipment I've had to pick up for the PCT: crampons; wraparound, UV-blocking, polarized sunglasses; crampons — With trekking pole/ice pick in hand, I’m ready for my next political assassination

The learning curve for the PCT is steeper than for the A.T. Some people might worry about this, but I won’t be one of them. Worrying isn’t helpful: why allow it take root?

Caution and preparedness are different matters. For example, I know ~zero about safely hiking through alpine snow. And this year was a roughly every-six-year snow year, maybe worse in localized areas. But I can address that with training. There should be room to learn other PCT peculiarities in the first few hundred miles.

A graph of California north/central/south snowpack over each season, for various winter seasons; 1982-1983 establishes a high mark, 2016-2017/2010-2011/2005-2006 are high but not historically so, and 2013-2014 and 2014-2015 are around the recorded minimums — California north/central/south snowpack measurements for various recent seasons

My only uncontrollable concern is that foot stress fracture. It happened two and a half years ago; the fracture has healed; and I’ve walked, run, and hiked on it for a year. A sports medicine doctor cleared me to walk to Canada on it. (I was very specific about doing exactly that.) But my right big toe still isn’t 100% flexible, its ligaments semi-regularly ache during or after exercise, and it sometimes bruises. I’ll do what I can about this through cushioned socks, ongoing flexibility exercises, and moderating pace if needed. But I can’t eliminate the risk that it might significantly slow me down or even stop me.

If all goes well, I’ll return in September, mentally refreshed, with a peculiarly developed endurance for walking and very few fast-twitch muscle fibers. 🙂 I’ve turned off Bugzilla request capabilities, so don’t try asking me to review patches. If you must contact me, email might work. I’ll rely heavily on server-side filters to keep the firehoses hidden, but that doesn’t mean I’ll necessarily see an email. If possible send reviews and questions to the usual suspects. Moreover, the PCT is much more remote than the A.T., so I’ll likely go longer between email access than I did on the A.T. In places, a two-week delay in responding would not be unusual. (But I’ll try to keep people updated on where I am whenever possible, specifically to reduce the risks and dangers in a moronically-avoidable 127 Hours-style rescue snafu. I put the best odds on Twitter updates because they’re quickest. But I’ll post some pictures here as well to one-up roc. “My country’s scenery beat up your country’s scenery”)

I’m currently on my way to San Diego. Some helpful souls who love the trail (“trail angels”, in the vernacular) offer aspiring thru-hikers a place to stay just before, and a ride to the start the day of, their thru-hikes. I’ll stay tonight with them. Tomorrow the rubber hits the trail. It should be good.

Comments (7)

27.02.17 A pitfall in C++ low-level object creation and storage, and how to avoid it

Tags: alignment, c++, c++11, compilers, memcpy, mozilla, programming language, std::aligned_storage, undefined behavior — Jeff @ 14:17

While doing a couple recent reviews, I’ve read lots of code trying to solve the same general problem. Some code wants to store an object of some type (more rarely, an object from among several types), but it can’t do so yet. Some series of operations must occur first: a data structure needing to be put into the right state, or a state machine needing to transition just so. Think of this, perhaps, as the problem of having a conditionally initialized object.

Possibly-unsatisfactory approaches

One solution is to new the object at the proper time, storing nullptr prior to that instant. But this imposes the complexity of a heap allocation and handling its potential failure.

Another solution is to use mozilla::Maybe<T>. But Maybe must track whether the T object has been constructed, even if that status is already tracked elsewhere. And Maybe is potentially twice T‘s size.

The power approach

Tthe most common solution is to placement-new the object into raw storage. (This solution is particularly common in code written by especially-knowledgeable developers.) mfbt has historically provided mozilla::AlignedStorage and mozilla::AlignedStorage2 to implement such raw storage. Each class has a Type member typedef implementing suitable aligned, adequately-sized storage. Bog-standard C++11 offers similar functionality in std::aligned_storage.

Unfortunately, this approach is extremely easy to subtly misuse. And misuse occurs regularly in Mozilla code, and it can and has triggered crashes.

A detour into the C++ object model

C++ offers very fine-grained, low-level control over memory. It’s absurdly easy to access memory representations with the right casts.

But just because it’s easy to access memory representations, doesn’t mean it’s easy to do it safely. The C++ object model generally restricts you to accessing memory according to the actual type stored there at that instant. You can cast a true float* to uint32_t*, but C++ says undefined behavior — literally any behavior at all — may occur if you read from the uint32_t*. These so-called strict aliasing violations are pernicious not just because anything could happen, but because often exactly what you wanted to happen, happens. Broken code often still “works” — until it unpredictably misbehaves, in C++’s view with absolute justification.

A dragon lays waste to men attacking it — Here be dragons (CC-BY-SA, by bagogames)

There’s a big exception to the general by-actual-type rule: the memcpy exception. (Technically it’s a handful of rules that, in concert, permit memcpy. And there are other, non-memcpy-related exceptions.) memcpy(char*, const char*, size_t) has always worked to copy C-compatible objects around without regard to types, and C++’s object model permits this by letting you safely interpret the memory of a T as chars or unsigned chars. If T is trivially copyable, then:

T t1; 
char buf[sizeof(T)];
memcpy(buf, &t1, sizeof(T)); // stash bytes away
// ...time elapses during execution...
memcpy(&t1, buf, sizeof(T)); // restore them
// t1 safely has its original value

You can safely copy a T by the aforementioned character types elsewhere, then back into a T, and things will work. And second:

T t1, t2;
memcpy(&t1, &t2, sizeof(T)); // t1 safely has t2's value

You can safely copy a T into another T, and the second T will have the first T‘s value.

A C++-compatible placement-`new` approach

The placement-new-into-storage approach looks like this. (Real code would almost always use something more interesting than double, but this is the gist of it.)

#include <new> // for placement new

struct ContainsLazyDouble
{
    // Careful: align the storage consistent with the type it'll store.
    alignas(double) char lazyData[sizeof(double)];
    bool hasDouble_;

    // Indirection through these functions, rather than directly casting
    // lazyData to double*, evades a buggy GCC -Wstrict-aliasing warning.
    void* data() { return lazyData; }
    const void* data() const { return lazyData; }

  public:
    ContainsLazyDouble() : hasDouble_(false) {}

    void init(double d) {
      new (data()) double(d);
      hasDouble_ = true;
    }

    bool hasDouble() const { return hasDouble_; }
    double d() const {
      return *reinterpret_cast<const double*>(data());
    }
};

ContainsLazyDouble c;
// c.d(); // BAD, not initialized as double
c.init(3.141592654);
c.d(); // OK

This is safe. c.lazyData was originally char data, but we allocated a new double there, so henceforth that memory contains a double (even though it wasn’t declared as double), not chars (even though it was declared that way). The actual type stored in lazyData at that instant is properly respected.

A C++-incompatible extension of the placement-`new` approach

It’s safe to copy a T by the aforementioned character types. But it’s only safe to do so if 1) T is trivially copyable, and 2) the copied bytes are interpreted as T only within an actual T object. Not into a location to be (re)interpreted as T, but into a T. It’s unsafe to copy a T into a location that doesn’t at that instant contain a T, then reinterpret it as T.

So what happens if we use ContainsLazyDouble‘s implicitly-defined default copy constructor?

ContainsLazyDouble c2 = c;

This default copy constructor copies ContainsLazyDouble member by member, according to their declared types. So c.lazyData is copied as a char array that contains the object representation of c.d(). c2.lazyData therefore contains the same char array. But it doesn’t contain an actual double. It doesn’t matter that those chars encode a double: according to C++, that location does not contain a double.

Dereferencing reinterpret_cast<const double*>(data()) therefore mis-accesses an array of chars by the wrong type, triggering undefined behavior. c2.d() might seem to work if you’re lucky, but C++ doesn’t say it must work.

This is extraordinarily subtle. SpiderMonkey hackers missed this issue in their code until bug 1269319 was debugged and a (partly invalid, on other grounds) GCC compiler bug was filed. Even (more or less) understanding the spec intricacy, I missed this issue in some of the patchwork purporting to fix that bug. (Bug 1341951 provides an actual fix for one of these remaining issues.) Another SpiderMonkey hacker almost introduced another instance of this bug; fortunately I was reviewing the patch and flagged this issue.

Using AlignedStorage<N>::Type, AlignedStorage2<T>::Type, or std::aligned_storage<N>::type doesn’t avoid this problem. We mitigated the problem by deleting AlignedStorage{,2}::Type‘s copy constructors and assignment operators that would always do actual-type-unaware initialization. (Of course we can’t modify std::aligned_storage.) But only careful scrutiny prevents other code from memcpying those types. And memcpy will copy without respecting the actual type stored there at that instant, too. And in practice, developers do try to use memcpy for this when copy construction and assignment are forbidden, and reviewers can miss it.

What’s the solution to this problem?

As long as memcpy and memmove exist, this very subtle issue can’t be eradicated. There is no silver bullet.

The best solution is don’t hand-roll raw storage. This problem doesn’t exist in Maybe, mozilla::Variant, mozilla::MaybeOneOf, mozilla::Vector, and other utility classes designed to possibly hold a value. (Sometimes because we just fixed them.)

But if you must hand-roll a solution, construct an object of the actual type into your raw storage. It isn’t enough to copy the bytes of an object of the actual type into raw storage, then treat the storage as that actual type. For example, in ContainsLazyDouble, a correct copy constructor that respects C++ strict aliasing rules would be:

#include <string.h> // for memcpy

// Add this to ContainsLazyDouble:
ContainsLazyDouble(const ContainsLazyDouble& other)
  : hasDouble_(other.hasDouble_)
{
  if (hasDouble_)
  {
    // The only way to allocate a free-floating T, is to
    // placement-new it, usually also invoking T's copy
    // constructor.
    new (data()) double(other.d());

    // This would also be valid, if almost pointlessly bizarre — but only
    // because double is trivially copyable.  (It wouldn't be safe
    // to do this with a type with a user-defined copy constructor, or
    // virtual functions, or that had to do anything at all to initialize
    // the new object.)
    new (data()) double; // creates an uninitialized double
    memcpy(lazyData, other.lazyData, sizeof(lazyData)); // sets to other.d()
  }
}

// ...and this to the using code:
ContainsLazyDouble c2 = c; // invokes the now-safe copy constructor

The implicitly-generated copy assignment operator will usually require similar changes — or it can be = deleted.

Final considerations

AlignedStorage seems like a good idea. But it’s extremely easy to run afoul of a copy operation that doesn’t preserve the actual object type, by the default copy constructor or assignment operator or by memcpy in entirely-separate code. We’re removing AlignedStorage{,2} so these classes can’t be misused this way. (The former has just been removed from the tree — the latter has many more users and will be harder to kill.) It’s possible to use them correctly, but misuse is too easy to leave these loaded guns in the tree.

If it’s truly necessary to hand-roll a solution, you should hand-roll all of it, all the way down to the buffer of unsigned char with an alignas() attribute. Writing this correctly this is expert-level C++. But it was expert-level C++ even with the aligned-storage helper. You should have to know what you’re doing — and you shouldn’t need the std::aligned_storage crutch to do it.

Moreover, I hope that the extra complexity of hand-rolling discourages non-expert reviewers from reviewing such code. I’d feel significantly more confident Mozilla won’t repeat this mistake if I knew that every use of (for example) alignas were reviewed by (say) froydnj or me. Perhaps we can get some Mercurial tooling in place to enforce a review requirement along those lines.

In the meantime, I hope I’ve made the C++ developers out there who read this, at least somewhat aware of this pitfall, and at best competent to avoid it.

Comments (15)

27.05.16 Using Yubikeys with Fedora 24, for example for Github two-factor authentication

Tags: fedora, fido, Linux, mozilla, security, skylake, u2f — Jeff @ 17:17

My old laptop’s wifi went on the fritz, so I got a new Lenovo P50. Fedora 23 wouldn’t work with the Skylake architecture, so I had to jump headfirst into the Fedora 24 beta.

I’ve since hit one new issue: Yubikeys wouldn’t work for FIDO U2F authentication. Logging into a site using a Yubikey (inserting a Yubikey USB device and tapping the button when prompted) wouldn’t work. Attempting this on Github would display the error message, “Something went really wrong.” Nor would registering Yubikeys with sites work. On Github, attempting to register Yubikeys would give the error message, “This device cannot be registered.”

Interwebs sleuthing suggests that Yubikeys require special udev configuration to work on Linux. The problem is that udev doesn’t grant access to the Yubikey, so when the browser tries to access the key, things go Bad. A handful of resources pointed me toward a solution: tell udev to grant access to the device.

As root, go to the directory /etc/udev/rules.d. It contains files with names of the form *.rules, specifying rules for how to treat devices added and removed from the system. In that directory create the file 70-u2f.rules. Its contents should be those of 70-u2f.rules, from Yubico‘s libu2f-host repository. (Most of this file is just selecting various Yubikey devices to apply rules against. The important part of this file is the TAG+="uaccess" ending the various lines. This adds the “uaccess” tag to those devices; systemd-logind recognizes this tag and will grant access to the device to the current logged-in user.) Finally, run these two commands to refresh udev state:

udevadm control --reload
udevadm trigger

Yubikeys should now work for authentication.

These steps work for me, and they appear to me a sensible way to solve the problem. But I can’t say for sure that they’re the best way to solve it. (Nor am I sure why Fedora doesn’t handle this for me.) If anyone knows a better way, that doesn’t involve modifying the root file system, I’d love to hear it in comments.

Comments (3)

20.06.15 New changes to make SpiderMonkey’s (and Firefox’s) parsing of destructuring patterns more spec-compliant

Tags: destructuring, destructuring assignment, ecma-262, ecmascript, es6, firefox, javascript, js, mozilla, spidermonkey — Jeff @ 03:17

Destructuring in JavaScript

One new feature in JavaScript, introduced in ECMAScript 6 (formally ECMAScript 2015, but it’ll always be ES6 in our hearts), is destructuring. Destructuring is syntactic sugar for assigning sub-values within a single value — nested properties, iteration results, &c., to arbitrary depths — to a set of locations (names, properties, &c.).

// Declarations
var [a, b] = [1, 2]; // a = 1, b = 2
var { x: c, y: d } = { x: 42, y: 17 }; // c = 42, d = 17

function f([z]) { return z; }
print(f([8675309])); // 8675309


// Assignments
[b, f.prop] = [3, 15]; // b = 3, f.prop = 15
({ p: d } = { p: 33 }); // d = 33

function Point(x, y) { this.x = x; this.y = y; }

// Nesting's copacetic, too.
// a = 2, b = 4, c = 8, d = 16
[{ x: a, y: b }, [c, d]] = [new Point(2, 4), [8, 16]];

Ambiguities in the parsing of destructuring

One wrinkle to destructuring is its ambiguity: reading start to finish, is a “destructuring pattern” instead a literal? Until any succeeding = is observed, it’s impossible to know. And for object destructuring patterns, could the “pattern” just be a block statement? (A block statement is a list of statements inside {}, e.g. many loop bodies.)

How ES6 handles the potential parser ambiguities in destructuring

ES6 says an apparent “pattern” could be any of these possibilities: the only way to know is to completely parse the expression/statement. There are more elegant and less elegant ways to do this, although in the end they amount to the same thing.

Object destructuring patterns present somewhat less ambiguity than array patterns. In expression context, { may begin an object literal or an object destructuring pattern (just as [ does for arrays, mutatis mutandis). But in statement context, { since the dawn of JavaScript only begins a block statement, never an object literal — and now, never an object destructuring pattern.

How then to write object destructuring pattern assignments not in expression context? For some time SpiderMonkey has allowed destructuring patterns to be parenthesized, incidentally eliminating this ambiguity. But ES6 chose another path. In ES6 destructuring patterns must not be parenthesized, at any level of nesting within the pattern. And in declarative destructuring patterns (but not in destructuring assignments), declaration names also must not be parenthesized.

SpiderMonkey now adheres to ES6 in requiring no parentheses around destructuring patterns

As of several hours ago on mozilla-inbound, SpiderMonkey conforms to ES6’s parsing requirements for destructuring, with respect to parenthesization. These examples are all now syntax errors:

// Declarations
var [(a)] = [1]; // BAD, a parenthesized
var { x: (c) } = {}; // BAD, c parenthesized
var { o: ({ p: p }) } = { o: { p: 2 } }; // BAD, nested pattern parenthesized

function f([(z)]) { return z; } // BAD, z parenthesized

// Top level
({ p: a }) = { p: 42 }; // BAD, pattern parenthesized
([a]) = [5]; // BAD, pattern parenthesized

// Nested
[({ p: a }), { x: c }] = [{}, {}]; // BAD, nested pattern parenthesized

Non-array/object patterns in destructuring assignments, outside of declarations, can still be parenthesized:

// Assignments
[(b)] = [3]; // OK: parentheses allowed around non-pattern in a non-declaration assignment
({ p: (d) } = {}); // OK: ditto
[(parseInt.prop)] = [3]; // OK: parseInt.prop not a pattern, assigns parseInt.prop = 3

Conclusion

These changes shouldn’t much disrupt anyone writing JS. Parentheses around array patterns are unnecessary and are easily removed. For object patterns, instead of parenthesizing the object pattern, parenthesize the whole assignment. No big deal!

// Assignments
([b]) = [3]; // BAD: parentheses around array pattern
[b] = [3]; // GOOD

({ p: d }) = { p: 2 }; // BAD: parentheses around object pattern
({ p: d } = { p: 2 }); // GOOD

One step forward for SpiderMonkey standards compliance!

Comments (1)

Where's Walden?

Mozilla, politics, economics, law, backpacking, cycling, and other random desiderata

21.05.18

PSA: stop using `mozilla::PodZero` and `mozilla::PodArrayZero`

The briefest recap of salient parts of the C++ object model

`memset`ting an object

Problem intensifies

Fixing the problem by not using `mozilla::PodZero`

`mozilla::PodZero` and `mozilla::PodArrayZero` are deprecated

07.05.17

Back in a bit

27.02.17

A pitfall in C++ low-level object creation and storage, and how to avoid it

Possibly-unsatisfactory approaches

The power approach

A detour into the C++ object model

A C++-compatible placement-`new` approach

A C++-incompatible extension of the placement-`new` approach

What’s the solution to this problem?

Final considerations

27.05.16

Using Yubikeys with Fedora 24, for example for Github two-factor authentication

20.06.15

New changes to make SpiderMonkey’s (and Firefox’s) parsing of destructuring patterns more spec-compliant

Destructuring in JavaScript

Ambiguities in the parsing of destructuring

How ES6 handles the potential parser ambiguities in destructuring

SpiderMonkey now adheres to ES6 in requiring no parentheses around destructuring patterns

Conclusion

Where's Walden?

Mozilla, politics, economics, law, backpacking, cycling, and other random desiderata

21.05.18

PSA: stop using mozilla::PodZero and mozilla::PodArrayZero

The briefest recap of salient parts of the C++ object model

memsetting an object

Problem intensifies

Fixing the problem by not using mozilla::PodZero

mozilla::PodZero and mozilla::PodArrayZero are deprecated

07.05.17

Back in a bit

27.02.17

A pitfall in C++ low-level object creation and storage, and how to avoid it

Possibly-unsatisfactory approaches

The power approach

A detour into the C++ object model

A C++-compatible placement-new approach

A C++-incompatible extension of the placement-new approach

What’s the solution to this problem?

Final considerations

27.05.16

Using Yubikeys with Fedora 24, for example for Github two-factor authentication

20.06.15

New changes to make SpiderMonkey’s (and Firefox’s) parsing of destructuring patterns more spec-compliant

Destructuring in JavaScript

Ambiguities in the parsing of destructuring

How ES6 handles the potential parser ambiguities in destructuring

SpiderMonkey now adheres to ES6 in requiring no parentheses around destructuring patterns

Conclusion

PSA: stop using `mozilla::PodZero` and `mozilla::PodArrayZero`

`memset`ting an object

Fixing the problem by not using `mozilla::PodZero`

`mozilla::PodZero` and `mozilla::PodArrayZero` are deprecated

A C++-compatible placement-`new` approach

A C++-incompatible extension of the placement-`new` approach