18.03.10

Whole-text DOM functionality and Acid3 redux

Jeff @ 11:17

(If you must know immediately why this post is happening now rather than a couple years ago, see the last paragraph of this post.)

In September 2008 I wrote a web tech blog post about Text.wholeText and Text.replaceWholeText. These are two DOM APIs which I implemented in Gecko before I graduated from MIT and took five months to thru-hike the Appalachian Trail. Implementing whole-text functionality was an interesting little bit of hacking, done in an attempt to pick up as many easy Acid3 points as possible for Firefox 3, with as little effort as possible. The functionality didn’t quite make 3.0, but aside from the missed point I think that mattered little.

The careful reader might think the post treats Text.wholeText and Text.replaceWholeText with slight derision — and he would be right to think so. As I note in the last paragraph of the post, Node.textContent (or in the real world of the web, innerHTML) is generally better-suited for what you might use Text.wholeText to implement. In those situations where it isn’t, direct DOM manipulation is usually much clearer.

The whole-text approach of Text.wholeText and Text.replaceWholeText is arcane. Its relative usefulness is an artifact of the weird way content is broken up into a DOM that can contain multiple adjacent text nodes, in which node references persist across mutations. It is an approach motivated by fundamental design flaws in the DOM: Text.wholeText and Text.replaceWholeText are a patch, not new functionality. Further, Text.replaceWholeText’s semantics are complicated, so it’s not particularly easy to use it to good effect. (Note the rather contorted example I gave in the post.)
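To make the contrast with Node.textContent concrete, here is a minimal sketch of the whole-text APIs as DOM Level 3 Core specifies them. (The element and strings are invented for illustration; this is not code from the original post.)

    // Two adjacent text nodes, as might result from parsing or earlier mutation.
    var p = document.createElement("p");
    p.appendChild(document.createTextNode("Hello, "));
    p.appendChild(document.createTextNode("world"));

    var first = p.firstChild;
    first.wholeText; // "Hello, world" (the data of all logically-adjacent text nodes)

    // replaceWholeText collapses the adjacent nodes into a single node:
    first.replaceWholeText("Goodbye"); // p now has one text child, "Goodbye"

    // ...but in most cases this is simpler and clearer:
    p.textContent = "Goodbye";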

Fundamentally, the only reason I implemented whole-text functionality is because it was in Acid3. I believe this is the only reason WebKit implemented it, and I believe it is quite probably the only reason other browser engines have implemented it. This is the wrong way to determine what features to implement. Features should be implemented on the basis of their usefulness, of their “aesthetics” (an example lacking such: shared-state threads with manual locks, rather than shared-nothing worker threads with message passing), of their ability to make web development easier, and of what they make possible that had previously been impossible (or practically so). I know of no browser engine that implemented whole-text functionality because web developers demanded it. Nevertheless, its being in a well-known test mandated its implementation; in an arms race, cost-benefit analysis must be discarded. (The one bright spot for Mozilla: in contrast to at least some of their competitors, they didn’t have to spend money, or divert an employee, contractor, or intern already more productively occupied, to implement this — beyond review time and marginal overhead, at least.)

The requirement of whole-text functionality, despite its non-importance, is one example of what I think makes Acid3 a flawed test. Acid3 went out of its way to test edge cases. Worse, it tested edge cases where differences posed little cost for web developers. Acid3 often didn’t test things web authors wanted; instead it tested things that were broken or not implemented, regardless of whether anyone truly cared.

The other Acid3 bugs I fixed were generally just as unimportant as whole-text functionality. (Due to time constraints of classes and graduation, this correlation shouldn’t be very surprising, of course, but each trivial test was a missed opportunity to include something developers would care about.) Those bugs were:

- The UTF-16 bug was exactly the sort of thing to test, especially for its potential security implications; disagreement here is frankly dangerous. (Still, I remain concerned that third-party specification inexactness caused Acid3 to permit several different semantics, listed beneath “it would be permitted to do any of the following” in Acid3’s source. This concern will be addressed in WebIDL, among other places, in the future.)
- cursor:none was an arguably reasonable test, but it probably wasn’t important to web developers because it had a trivial workaround: use a transparent image. (The same goes for other unrecognized keywords, if with less fidelity to the user’s browser conventions, therefore lending the testing of these keywords greater reasonableness.)

But the other tests were careful spec-lawyering rather than reflections of web author needs. (This is not to say that spec-lawyering is not worthwhile — I enjoy spec-lawyering immensely — but the real-world impact of some non-compliance, such as the toString example noted below, is vanishingly small.)

- Nitpicking the exact exceptions thrown when trying to create elements with patently malformed names doesn’t really matter, because in a world of HTML almost no one creates elements with novel names. (Even in the world of XML languages, element names are confined to the vocabulary of namespaces.)
- Effectively no one uses Element.attributes, and its removeNamedItemNS method even less, preferring instead {has,get,set}Attribute{,NS}. The bug in question — that null was returned rather than an exception being thrown for non-existent attributes — was basic spec compliance but ultimately not useful functionality for web developers.
- Similarly, the impact of an incorrect difference between (3.14).toString() and (3.14).toString(undefined) is nearly negligible (see the sketch after this list).
- The escape-parsing bug was an interesting quirk, but since other browsers produced a syntax error it had little relevance for developers.

All these issues were worth fixing, but should they have been in Acid3? How many developers salivated in anticipation of the time when eval("var v\\u0020 = 1;") would properly throw a syntax error?
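For concreteness, here is a rough sketch of those last two checks, in the spirit of (though not copied from) Acid3’s actual tests:

    // An undefined radix must behave like the default radix of 10, so these
    // two strings must be identical.
    var same = (3.14).toString() === (3.14).toString(undefined); // should be true

    // The eval'd source contains a literal \u0020 escape in an identifier.
    // U+0020 (a space) is not a valid identifier character, so parsing must
    // fail with a SyntaxError rather than silently accepting the escape.
    var threw = false;
    try {
      eval("var v\\u0020 = 1;");
    } catch (e) {
      threw = e instanceof SyntaxError; // should end up true
    }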

Other Acid3-tested features fixed by others often demonstrated similar unconcern for real-world web authoring needs. (NB: I do not mean to criticize the authors or suggesters of mentioned tests [I’m actually in the latter set, having failed to make these opinions clear at the time]; their tests are generally valid and worth fixing. I only suggest that their tests lacked sufficient real-world importance to merit inclusion in Acid3.) One test examined support for getSVGDocument(), a rather ill-advised method on frames and objects added by the SVG specification, whose return value, it was eventually determined (after Acid3-spawned discussion), would be identical to the sibling contentDocument property. Another examined the values of various properties of DocumentType nodes in the DOM, notwithstanding that web developers use document types — at source level only, not programmatically — almost exclusively for the purpose of placing browser engines in standards mode. Not all tested features were unimportant; one clear counterexample in Acid3, TTF downloadable font support, was well worth including. But if Acid3 gave web authors that, why test SVG font support? (Dynamically-modifiable fonts don’t count: they’re far beyond the bounds of what web authors might use regularly.) SVG font use through CSS was an after-the-fact rationalization: SVG fonts were only intended for use in SVG. (If one wanted to write an acid test specifically for SVG renderers, testing SVG font support at the same time might be sensible. Acid3, despite its inclusion of a few SVG tests, was certainly not such a test.)
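To illustrate the getSVGDocument() point, the eventual resolution amounts to the following sketch. (It assumes a same-origin <object> embedding an SVG document that has finished loading; the selector is invented for the example.)

    var obj = document.querySelector("object");
    // The SVG-specified method and the ordinary property were determined to
    // return the very same Document object:
    obj.getSVGDocument() === obj.contentDocument; // true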

But Acid tests don’t have to test trivialities! Indeed, past Acid tests usefully prodded browsers to implement functionality web developers craved. I can’t speak to the original as it was way before my time, but Acid2 did not have these shortcomings. The features Acid2 tested were in demand among web authors before Acid2 existed, and thus a fortiori desirable independent of their presence in it.

I have hope Acid4 will not have these shortcomings. This is partly because the test’s author recognizes past errors as such. With the advent of HTML5 and a barrel of new standards efforts (workers, WebGL, XMLHttpRequest, CSS animations and transitions, &c. to name a few that randomly come to mind), there should be plenty of useful functionality to test in future Acid tests without needing to draw from the dregs. Still, we’ll have to wait and see what the future brings.

(A note on the timing of this post: it was originally to be a part of my ongoing Appalachian Trail thru-hike posts, because I wrote the web tech blog post on whole-text functionality during the hike. However, at the request of a few people I’ve separated it out into this post to make it more readable and accessible. [This post would have been in the next trail update, to be posted within a week.] This post would indisputably have been far more timely a while ago, but I write only as I have time. [I wouldn’t even have bothered to post given the delay, but I have a certain amount of stubbornness about finishing up the A.T. post series. Since in my mind this belongs in that narrative, and as I’ve never omitted a memorable topic even if (if? —ed.) it interested no one but me, I feel obliged to address this even this far after the fact.] Now, if you skipped this post’s contents for this explanation, return to the start and read on.)

10.03.10

Dear Bugzillazyweb

Jeff @ 12:18

It would be helpful, in evaluating review-request responsiveness, to have a way to list all bugs in a particular period of time in which a review was requested of me and then either granted by someone else, transferred to someone else by the patch’s author, or removed because a newer attachment was posted with review directed at someone else. The basic idea is to measure whether people are switching to other reviewers due to review latency, when requesters are sufficiently knowledgeable and motivated to switch rather than let an old review sit in a request queue forever.

There’s no way to measure exactly this statistic. Someone else granting a review request might just mean that person was marginally more responsive on IRC to a quick request made after the initial flagging in Bugzilla. A review transfer may have been done with the consent of both parties for reasons unrelated to review delay. A newer attachment with a different reviewer might be an acknowledgment of a patch’s changing scope (whose most competent reviewer therefore changed). The point isn’t to get an exact number, just to produce a list of bugs that one could examine, manually filtering out false positives, to get some rough idea of how good or bad review responsiveness has been.

A sufficiently granular bugmail search could probably tell me this, but I suspect extracting that information from lightly-structured text is much harder than working on Bugzilla or its data directly, and I’m not sure if my email account could easily accommodate such a search (and that solution’s not generally applicable).
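Still, for the record, here is the rough shape of the query I imagine, sketched as URL construction against Bugzilla’s advanced search. Everything in it is an assumption rather than a known-working recipe: I’m guessing that the f1/o1/v1 custom-search parameters, the flagtypes.name field, and the changedfrom/changedafter/changedbefore operators behave the way the advanced-search UI suggests, and the requestee and dates are placeholders.

    // Hypothetical sketch: list bugs where a "review?(requestee)" flag
    // changed away from that value during a given period. Field and operator
    // names are assumptions based on Bugzilla's advanced-search UI.
    function reviewChurnQuery(requestee, after, before) {
      var base = "https://bugzilla.mozilla.org/buglist.cgi?query_format=advanced";
      var params = [
        ["f1", "flagtypes.name"], ["o1", "changedfrom"],
        ["v1", "review?(" + requestee + ")"],
        ["f2", "flagtypes.name"], ["o2", "changedafter"], ["v2", after],
        ["f3", "flagtypes.name"], ["o3", "changedbefore"], ["v3", before]
      ];
      return base + params.map(function (p) {
        return "&" + p[0] + "=" + encodeURIComponent(p[1]);
      }).join("");
    }

    // e.g. reviewChurnQuery("jwalden", "2009-10-01", "2010-03-10")

Even if the exact field and operator names are wrong, something of this shape, plus manual filtering of the results, is all I’m after.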

So, lazyweb…make my life easier for me. 🙂
