Using Yubikeys with Fedora 24, for example for Github two-factor authentication

Tags: , , , , , , — Jeff @ 17:17

My old laptop’s wifi went on the fritz, so I got a new Lenovo P50. Fedora 23 wouldn’t work with the Skylake architecture, so I had to jump headfirst into the Fedora 24 beta.

I’ve since hit one new issue: Yubikeys wouldn’t work for FIDO U2F authentication. Logging into a site using a Yubikey (inserting a Yubikey USB device and tapping the button when prompted) wouldn’t work. Attempting this on Github would display the error message, “Something went really wrong.” Nor would registering Yubikeys with sites work. On Github, attempting to register Yubikeys would give the error message, “This device cannot be registered.”

Interwebs sleuthing suggests that Yubikeys require special udev configuration to work on Linux. The problem is that udev doesn’t grant access to the Yubikey, so when the browser tries to access the key, things go Bad. A handful of resources pointed me toward a solution: tell udev to grant access to the device.

As root, go to the directory /etc/udev/rules.d. It contains files with names of the form *.rules, specifying rules for how to treat devices added and removed from the system. In that directory create the file 70-u2f.rules. Its contents should be those of 70-u2f.rules, from Yubico‘s libu2f-host repository. (Most of this file is just selecting various Yubikey devices to apply rules against. The important part of this file is the TAG+="uaccess" ending the various lines. This adds the “uaccess” tag to those devices; systemd-logind recognizes this tag and will grant access to the device to the current logged-in user.) Finally, run these two commands to refresh udev state:

udevadm control --reload
udevadm trigger

Yubikeys should now work for authentication.

These steps work for me, and they appear to me a sensible way to solve the problem. But I can’t say for sure that they’re the best way to solve it. (Nor am I sure why Fedora doesn’t handle this for me.) If anyone knows a better way, that doesn’t involve modifying the root file system, I’d love to hear it in comments.


Iterating a number sequence for lulz and jail time

Hello, readers! Today I bring you two posts about law: one Mozilla-related, one not. This is the Mozilla-related post. Mozillians may already know this background, but I’ll review for those who don’t.

The “hack”

In 2010 Goatse Security (don’t look them up) discovered a flaw in AT&T’s website. AT&T’s site detected accesses from iPads, extracted a unique account number sent by the iPad, then replied with a private account email address. Account numbers were guessable, so if someone “spoofed” their UA to look like the iPad browser, they could harvest private email addresses using their guesses.

The lulz

Andrew Auernheimer ("weev") wearing an old-school AT&T baseball cap
Andrew Auernheimer, i.e. weev, CC-BY-SA

The people who figured this out were classic Internet trolls interested (to a degree) in minor mayhem (“lulz”) because they could, and they scraped 114000+ email addresses. Eventually Andrew Auernheimer (known online as “weev”) sent the list to Gawker for an exclusive.

The sky is falling!

AT&T, Apple, the people whose addresses had been scraped, and/or the government panicked and freaked out. The government argued that Auernheimer violated the Computer Fraud and Abuse Act, “exceeding authorized access” by UA-spoofing and loading pages using guessed account numbers.

This is a broad interpretation of “authorized access”. Auernheimer evaded no security measures, only accessed public, non-login-protected pages using common techniques. Anyone who could guess the address could view those pages using common browser addons. People guess at the existence of web addresses all the time. This site’s addresses appear of the form “/year/month/day/post-title/”. The monthly archive links to the side on my site have the form “/year/month/”. It’s a good guess that changing these components does what you expect: no dastardly hacking skills required, just logical guesses and experimentation. And automation’s hardly nefarious.

So what’s Mozilla’s brief with this?

Developers UA-spoof all the time for a variety of innocuous reasons. Newspapers have UA-spoofed during online price discrimination investigations. If UA spoofing is a crime, many people not out for lulz are in trouble, subject to a federal attorney’s whims.

The same is true for constructing addresses by modifying embedded numbers. I’ve provided one example. Jesse once wrote a generic implementation of the technique. Wikipedia uses these tactics internally, for example in the Supreme Court infobox template to linkify docket numbers.

Mozilla thus signed onto an amicus brief in the case. The brief laid out the reasons why the actions the government considered criminal, were “commonplace, legitimate techniques”.

The cool part of the brief

I read the brief last summer through one of Auernheimer’s attorneys at the inestimable Volokh Conspiracy. I’ve been lightly meaning to blog about this discussion of number-changing ever since:

Changing the value of X in the AT&T webpage address is trivial to do. For example, to visit this Court’s homepage, one might type the address “http://www.ca3.uscourts.gov/” into the address bar of the browser window. The browser sends an HTTP request to the Court website, which will respond with this Court’s homepage. Changing the “3” to “4” by typing in the browser window address bar returns the Court of Appeals for the Fourth Circuit’s homepage. Changing the “3” to a “12” returns an error message.

Illustrating the number-guessing technique (and implying its limitations in the “12” part) via the circuit courts’ own websites? Brilliant.

Back to Auernheimer

The court recently threw out Auernheimer’s conviction. Not on CFAA grounds — on more esoteric matters of filing the case in the wrong court. But the opinion contains dicta implying that breaching a password gate or code-based barrier may be necessary to achieve a conviction. The government could bring the case in the right court, but with the implied warning here, it seems risky.


Auernheimer isn’t necessarily a sympathetic defendant. It’s arguably impolite and discourteous to publicly disclose a site vulnerability without giving the site notice and time to fix the issue. It may be “hard to feel sorry for them being handed federal criminal charges” as Ars Technica suggested.

But that doesn’t mean he committed a crime or shouldn’t be defended for doing things web developers often do. Justice means defending people who have broken no laws, when they are threatened with prosecution. It doesn’t mean failing to defend someone just because you don’t like his (legal) actions. Prosecution here was wrong.

One final note

I heard about the AT&T issue and the brief outside Mozilla. I’m unsure what Mozilla channel I should have followed, to observe or discuss the decision to sign onto this brief. Mozilla was right to sign on here. But our input processes for that decision could be better.


Why Firefox 3.0.11 gained a point on Acid3

Astute observers of the recent Firefox 3.0.11 release will note that this release gains a point on the Acid3 test, specifically fixing test 68, involving a difference in how Mozilla processes UTF-16 from what the Unicode specification requires. (Indeed, it seems Wikipedia’s Acid3 article was updated to reflect this on the very day of the 3.0.11 release, and that edit has even been reverted and un-reverted once. One does wonder at times whether these people don’t have better things to do with their time. 🙂 ) What did test 68 check, why was it fixed, and why was it fixed in a dot release of a stable version? If you’re curious about this and are willing to dive into technical details at some depth, keep reading. If you aren’t, I write here for the pleasure of a presumptively-interested audience whether or not any such thing exists for the topic at hand, so you’ll forgive me for not caring too much what you do. 🙂

Representing text through Unicode and UTF-16

Without descending into too much detail, modern computer programs represent text as a sequence of numbers, one number for each “character” in the text. Unicode defines each such number as a code point and assigns each number a particular meaning (e.g. 99 is “LATIN SMALL LETTER C”, that is, the letter ‘c’). Such an arbitrary sequence can’t simply be represented as itself partly because computers can only efficiently handle numbers whose values are within small, fixed ranges, and code points cover too large a range (0 to 1114111) to handle directly. Unicode therefore defines a small number of well-known, widely adopted processes to convert an idealized sequence of code points into computer-friendly sequences.

UTF-16 is one way to represent a sequence of code points, using a sequence of 16-bit numbers. Broadly speaking, code points in the range 0 to 65535 are represented using the identical number; code points 65536 and greater are represented by a pair of 16-bit numbers. There’s an obvious problem: how do you determine if a 16-bit number encodes a single code point or half of one? Basically, not every 16-bit number is a code point; there are intentionally no code points from hexadecimal 0xD800 through 0xDFFF (55296 through 57343). With a little care, we can use two (exactly two, no more, no fewer) numbers from this range to represent a code point. For example, the hexadecimal number 0xD863, 55395 in decimal, does not correspond to a valid Unicode code point, nor does 0xDDDD, 56797 in decimal. The sequence 0xDDDD 0xD863 also corresponds to no Unicode code point or sequence of code points. However, 0xD863 0xDDDD represents the Unicode code point with the hexadecimal value 0x28DDD, 167389 in decimal.

Acid3 test 68 and its fix

The failing test 68 examined how to interpret a purportedly UTF-16 value which actually was in error. Specifically, consider a sequence of 16-bit values like so, to be interpreted as UTF-16:

Index Value description Example
0 Value appearing to be the first in a valid pair (a high surrogate) 0xD863 (decimal 55395)
1 Value not appearing to be the second in a valid pair (instead appearing to be a value in range [0, 65536) validly represented as itself) 0x61 (decimal 97)
2 Value in range [0, 65536) validly represented as itself 0x61 (decimal 97)

How should such “UTF-16” data be interpreted? First, note that Unicode includes the concept of a “replacement character”, used to fill in for the malformed parts of ostensibly-UTF-16 data when interpreting it. With that concept in mind, we have three plausible ways to interpret this sequence of values:

  • complain that the data wasn’t UTF-16 and refuse to interpret it
  • interpret as:
    • replacement character, identical value at index 1, identical value at index 2
    • replacement character, identical value at index 2

In fact if you look at test 68 itself, you’ll see a variety of responses are (to put it as conservatively as possible, since the specs are currently insufficiently precise to admit only one correct behavior) “not prohibited” by the relevant specifications. Mozilla at the time chose to interpret such data as the last of the three possibilities, thinking that the pair of 16-bit values was invalid rather than merely the first of them (in which case the second of the “pair” would be interpreted as its own value or start of a pair). However, the Unicode standard didn’t permit this choice, and neither did Acid3; further considering that other browsers had correct (and just as important, different) behavior, it made sense to change our interpretation in this case. I pushed a fix to the mozilla-central repository very nearly a year ago, and I made no attempt to get it in 3.0, for two reasons. First, the bug didn’t matter in the real world (web developers are very unlikely to have relied on the previous behavior, and the fix enables no new, desirable functionality); second, I knew we didn’t have enough time before the release to sniff out all potential regressions from such a low-level change and be confident nothing had been broken.

Acid3 refldux

Fast-forward a year later, however, and suddenly the fix for test 68 is in 3.0.11. The test, while important to fix eventually for a number of reasons, is not especially important for real-world behavior (a flaw of many aspects of Acid3, notwithstanding its many useful tests of desirable functionality, but I digress), so why fix it now rather than in the 3.5 release? Surely such a non-essential bugfix is better left to 3.5 as it already had been, right? That’s certainly true enough if our assumptions are correct — but in this case, curiously, they’re not.

One of the things that makes changing how character encoding works so exciting is that it affects a lot of other code. A small bugfix to such code usually has no effect on properly formed input, because such input is the normal case that receives regular testing. Improperly formed input, however, may cause immediate problems (which are usually simple to smoke out through careful creation and use of automated tests) or problems in other code at miles of distance (which are far more difficult to discover). Suppose data is interpreted by two different decoding implementations, producing two different idealized representations — what are the consequences? Maybe the lengths will differ. If those implementations are used in code written by memory nazis, perhaps a string will be copied incorrectly and result in a buffer overflow. Algorithmic differences might also throw off hashing schemes that use string length in computing a hash key. Of course, if lengths differ the characters must differ as well. That might cause a CSS selector to not apply correctly, or it might introduce a vector for slipping forbidden characters through an anti-XSS filter. Character encoding and decoding is fundamental to any number of other systems which rely on precise, correct behavior in order to work properly. If you don’t have that behavior, all bets are off.

In this case, as it turns out, two separate bugs were uncovered which my proposed patch fixed: bug 490513 and bug 489041. If you read the details in each bug, you’ll note that there’s very little in either bug to suggest why my change might have any effect. To be sure, both testcases deal with strings containing problematic sequences as above, but nothing in the testcase explicitly suggests UTF-16 decoding is happening.

A useful first step in examining any bug in a bug database is to look at its ancestry. Curiously, bug 490513 is a Bugzilla clone of bug 439206, with the same steps to reproduce and the same testcase. From there we proceed to the fix for the bug. The patch is small, and if one understands the relevant code it’s similarly easy to understand, but

The comment in the patch looks strangely familiar.

In fact I seem to remember exactly that comment in my patch for bug 421576 that fixed test 68. (Lest I be misconstrued, this comment-cribbing is perfectly acceptable in open source code, indeed is even stylistically better for demonstrating consistent intent in the code. However, your mileage may vary in other code or if perchance you happen to work for Microsoft.)

Also suspicious: this bug was filed roughly a week after I pushed 421576 to mozilla-central. A little more investigation confirms the obvious conclusion: bug 439206 was a regression from bug 421576, vindicating my initial thoughts in bug 421576 that “we couldn’t reasonably take this now and expect to be able to sniff out all possible regressions”. It seems that I updated most of our decoding code to handle lone high surrogates correctly, but I missed the spot being fixed in this patch. The code being patched here handles string hashing within Mozilla’s hash tables, and if you read the other bug comments you can see the testcase is causing a string to be hashed using two different algorithms (a failure mode I mentioned earlier), and as the computed hashes differed things went awry.

Here’s where Denmark turned rotten: this change was deemed important to fix for 3.0 point releases, but because no one noticed it was a trunk-only regression and thus didn’t need to be fixed in a security updates, it was backported to 3.0. The problem on trunk was that I only updated half the decoding algorithms in bug 421576, so fixing bug 439206 fixed the other half and brought them into sync. What did fixing bug 439206 in 3.0 do? It updated the other half of the decoding algorithms and took that half out of sync with what the other half was before 421576! Our trunk problem — that we changed one decoding algorithm but not a second that needed to be synchronized with it — had been ported in mirror image to 3.0 point releases. This problem, then, triggered the filing of bug 490513 (incidentally regressing bug 489041 as well), and for precisely the same reasons bug 439206 was marked as security-sensitive until being fixed, bug 490513 was marked as security-sensitive.

Let’s recap:

  1. I fix bug 421576 (and Acid3 test 68) in mozilla-central.
  2. This causes the security-sensitive regression bug 439206.
  3. Bug 439206 is investigated and fixed in mozilla-central.
  4. This fixes the potential vulnerability I introduced.
  5. People recognize 439206 as potentially dangerous but not as a regression, so the same fix is added to 3.0 point release code.
  6. This causes the security-sensitive regression bug 490513 (and bug 489041 as well), because we’re missing the first half of the code (in 421576) that caused 439206.
  7. That bug is investigated by the reviewer of my fix for 421576, who correctly hypothesizes that my fix will fix that bug without determining exactly why.
  8. 421576 is fixed in 3.0.11.
  9. This fixes the potential security vulnerability in 3.0.11 (and less importantly, Acid3 test 68).
  10. Both halves (421576 and 439206) are now fixed in mozilla-central and in 3.0 point release code.

…and people wonder why we’re so hesitant to fix anything except security bugs in updates to stable releases, instead deferring to the subsequent major release. The potential for error, even in code that’s been written and reviewed by four different people across the two bugs, introduces a far higher cost than can be offset by the value in fixing nearly any non-security, non-stability bug.

Lessons for the future

All this mess is now behind us with the 3.0.11 release. However, a number of the causes for failure here are not peculiar to the precise bugs fixed here. What can we learn from this that can be applied in the future?

Make string encoding/decoding code simpler

First, and most pertinent to the case at hand, string encoding and decoding are complex, and we need to do everything we can to make this code simpler. A large part of the reason 439206 was missed when fixing 421576 was that the relevant code was not part of Mozilla’s string code — instead of residing in xpcom/string, it was in xpcom/ds. It’s difficult to argue that a data structure used to hold atomized strings (that is, strings which uniquely identify a sequence of characters, making a comparison of two atomized strings as fast as comparing two numbers) shouldn’t reside in a data structures (ds) directory. However, the code to compute the hash of the string (a single number that summarizes a string’s contents, hopefully uniquely but possibly non-uniquely) must simultaneously decode the string, and that code should be in xpcom/string with all other string decoding code.

In response to this state of relative disarray, I’ve filed bug 497204 to reorganize and consolidate Mozilla’s string code, with the primary goal of getting it all in one location (so that even if you’re unfamiliar with it, you at least know all the code you might need to read) and with the secondary goal of organizing it in a clearer and simpler fashion (to make the consolidated code easier to read). We may still end up making mistakes in how we handle string encoding and decoding even with that work complete, but those mistakes will be easier to find, diagnose, and completely fix.

Write more automated tests, and make it possible to include them in security fixes

Second, we need to move as much testing like this out of human hands and into “computer hands” as possible. One reason we introduced a regression was that we relied on fallible human testing to varying degrees throughout this whole process. I relied on some informal testing of my original patch in deeming it complete; the regression fix did likewise as manual QA testing verified the problem had been fixed on trunk and then later discovered the bug not to be fixed in 3.0.x releases (after the unnecessary backport). Yet I didn’t uncover my omission, and QA didn’t notice the backport caused the branch regression rather than fixed it. Some amount of manual testing, formalized through QA or otherwise through masses of nightly testers, is both desirable and unavoidable. Most of the time, however, we are much better off with automated tests that can be run with much less effort, on a much shorter time scale, with much greater rigor, and with much less chance of mistakes in their execution.

Now, to be fair, automated tests have less value here than in many other case. I included automated tests in my original patch, but they only addressed the bug at hand and not the regression (else that regression would never have occurred). It’s true that the followup would have been helped by an automated test, because when it was ported to branch it would have failed, and the backport would have been reverted pending further investigation. (Whether this investigation would have led to discovering bug 439206 to be a trunk-only regression is unclear, but it certainly seems plausible.) Here, however, we encounter the large problem that we currently can’t include automated tests in security bug fixes because we don’t want to tip our hands before a release with the fix is available. (In cases such as this one this worry is perhaps more paranoid than well-founded, but in many other cases where a testcase is half a step from a full exploit it’s vital; mrbkap can elaborate on this at length.) The testcase is often committed after the release, particularly when the problem is memory corruption in ways that appear difficult to control (the apparent case here), but by that time the damage has already been done.

Automated tests are not a panacea, as my original fix shows. Nevertheless, consistent use of them here would have at least eliminated the 3.0.11 regression if not the trunk regression. For this to happen we must have security bugs include automated tests that run, essentially, as soon as the fix lands. The solution that I believe best meets this need as I envision it consists of one separate, private Mecurial repository per actively maintained release to which security bug tests may be committed. Access to these repositories would be strictly limited to developers with access to security bugs plus accounts for use by tinderboxen. We would then add additional steps to the build process on those tinderboxen to pull and run the security tests, reporting either PASS or FAIL for the lot of them. Detailed information would not be publicly displayed but would be available in some unspecified manner to developers who cause security bug tests to fail. Then, as security bugs are fixed, we would require two-phase commits for security bugs: first the fix to the public repository, then the test to the restricted repository. As failures would turn the main tinderbox orange, this ensures regressions get attention between the initial commit to the security repository and their eventual migration to the main repository — early enough to make the difference in cases like this. Another plus: I believe this would dovetail nicely with work in bug 383136 to make it possible to run tests against prepackaged builds.

There may be a different solution which meets the needs I specify here; I’m less concerned about the process than about results. However, I haven’t heard another proposal that I believe would work well enough.

Manual testing: actually, I don’t know what the lesson is here

Third, we have manual testing, imperfect but still necessary to some degree. I don’t know what the lesson for QA and manual testing is. Sure, they could be more diligent about checking that a bug exists pre-commit and is fixed post-commit, but in isolation every such action is reasonable. Is the decrease in available time worth the benefit of potentially catching a problem like here every so often? I don’t know the exact processes they follow these days or how they otherwise spend their time, so I really can’t evaluate this. I’ll let QA consider this situation and decide how to adjust, because I’m certainly not qualified to do so.


The particular bugs at issue here are now fixed, so for the moment we’re back to steady state. As explained above, however, it’s possible that further errors might happen for the same basic reasons unless we make an effort to eliminate those reasons, and we still have work to do to fix the root problems. I have hope that this article may spur improvements in processes that will make mistakes like this much harder to make, but it will take more than just me to make these changes happen. In the meantime, to Firefox 3.0.11 users, enjoy the gift of an unintended Acid3 point; it came at a much higher cost than we’d have been willing to pay if we had known the full story and never made either 3.0.x backport.


Yet another post which planet.mozilla.org and web-tech readers may ignore

Tags: , , , — Jeff @ 17:16

I wrote another web-tech blog post:

Object and array initializers should not invoke setters when evaluated

This post is duplicative if you follow other Mozilla news sources, and if you don’t and aren’t a web developer you probably won’t have the background to understand much of the content at the link, but it’s still quite representative of some of what I’ve worked on recently if you happen to be interested and hadn’t otherwise encountered the article.