01.12.09

An exercise in XPCOM stream programming

If you’ve done any programming with XPCOM, at some time you’ve probably had to work with streams. A little background in case you haven’t, then a small thought exercise:

Streams

A stream is an object from which you read data or to which you write data. In XPCOM an input stream is a stream from which you read data; an output stream is a stream to which you write data. In an ideal world a stream is either open (indicating data may be read from or written to it) or closed (indicating that the stream is no longer readable, or that no more data can be written to it), and that’s all there is to it. File objects in Python function very much like ideal streams.

In the real world, truly useful streams have further limitations (or characteristics). How much data can be read from an input stream right now? Can a given amount of data be written to an output stream right now? Should reading or writing proceed until completion when right now isn’t possible but sometime later might be, or should it halt immediately with an error indicating that reading or writing would block program execution? One might ignore these concerns in simplistic scenarios such as those which short Python scripts might be used to address. In complex applications, particularly those which must remain responsive to user input, these concerns may be quite important. You can’t display a useful progress bar if the stream you’re reading from represents the download of a 3GB file over a slow network and reading from the stream blocks program execution.

Streams which immediately halt with an error when reading or writing would block execution are nonblocking streams. Efficient use of such streams requires a way to wait until the desired amount of data can be written to or read from a stream. XPCOM efficiently supports nonblocking streams through an asyncWait method which will notify at some later time when the desired amount of data can be written to or read from the stream, without blocking. At the moment there are two flavors of asynchronous waiting: waiting until the desired amount can be read or written, and waiting until the given stream has been closed. At the interface level, the former is indicated by flags = 0, while the latter is indicated by flags = WAIT_CLOSURE_ONLY.
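
As a rough illustration of the nonblocking behavior described above, here is a minimal Python sketch. It uses an operating-system pipe on a Unix-like system rather than XPCOM, so it is only an analogy: select.select stands in for the kind of waiting asyncWait provides (though it blocks with a timeout instead of notifying a callback), and the helper try_read is a hypothetical name introduced for this example.

```python
import os
import select

# Create a pipe and make its read end nonblocking (Unix-like systems assumed).
read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)


def try_read(fd, count):
    """Hypothetical helper: return the bytes read, or None if reading would block."""
    try:
        return os.read(fd, count)
    except BlockingIOError:
        # The stream is open but has no data right now.
        return None


# Nothing has been written yet, so an immediate read would block.
first_attempt = try_read(read_fd, 16)

os.write(write_fd, b"hello")

# Wait (up to one second) until the read end is readable instead of spinning.
ready, _, _ = select.select([read_fd], [], [], 1.0)
second_attempt = try_read(read_fd, 16) if ready else None

# Once the writer closes and the data is drained, a read reports end of stream.
os.close(write_fd)
select.select([read_fd], [], [], 1.0)
at_eof = try_read(read_fd, 16)
os.close(read_fd)

print(first_attempt, second_attempt, at_eof)
```

Note the three distinct outcomes a reader has to handle: "would block" (None here), actual data, and an empty read signalling that the other end has closed.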

The Exercise

Suppose you wish to complete one conceptually simple task in stream programming: copying a stream, i.e. reading all data from one stream and writing it all into another, where both streams are nonblocking. (Such a copier might buffer data read before it can be immediately written; assume this is a requirement for the purposes of this exercise.) Suppose for the moment that there is no readily available implementation of the nsIAsyncStreamCopier interface, so you have to roll your own stream copier. In what situation is it necessary to asynchronously wait with flags = WAIT_CLOSURE_ONLY to efficiently implement stream copying?

Hints

If you want a hint (arguably the answer, if you can interpret the code), take a look at the uses of WAIT_CLOSURE_ONLY in xpcom/io/nsStreamUtils.cpp. You may perhaps find further hints in bug 513854, the bug which brought this somewhat quirky need for flags = WAIT_CLOSURE_ONLY to my attention.

Questions?

I come to this problem with more experience and familiarity with streams than most people will have. If anything in the above description is unclear, ask questions in the comment section — I did the best I could to make the problem and its background understandable, but I may easily have done so less well than intended.

07.11.09

I know what you Googled this summer, last summer, and the summer before (but not much before then)

Jeff @ 17:09

Google collects a lot of information about its users. Or, more accurately, users give an awful lot of information to Google. (If you hadn’t guessed, I have little sympathy for people who complain about Google invading their privacy: if you don’t like the ways Google can use the information you give it, don’t use Google.) It’s therefore not surprising Google comes in for a good share of complaints about its “invasions of privacy” or some similar alarmism. Recently I stumbled across mention of one service Google now provides to give users insight into what information Google tracks about them: Google Dashboard, a one-stop shop directing you to modifiable views of much of the information Google has recorded about your interactions with it. It currently covers these Google services:

  • General account details (password, email address, &c.)
  • Alerts
  • Calendar
  • Contacts
  • Docs (& Spreadsheets)
  • Gmail
  • iGoogle
  • Orkut
  • Product Search
  • Profile (the link you see at the top of results if you search for a person who’s created one and made it publicly available)
  • Reader
  • Talk
  • Web History
  • YouTube

These services “are not yet available in this dashboard”:

  • Google App Engine
  • Google Groups
  • Google Book Search
  • Google Subscribed Links
  • Google WiFi

Skimming through the data yields this information about me, at a general level:

  • Searching (since May 12, 2007):
    • Total searches: 16026 (speculation on where that puts me overall by searches/day? I’m guessing top 5%, probably an even smaller percentage)
    • Total sponsored results viewed: 23
    • Total sponsored results viewed from searches with no intention of buying anything (i.e. I searched to learn information not meant for my potential use in making a purchase): 17
    • Total sponsored results which resulted in purchases: definitely 1, maybe 2 depending how broadly you define “purchase”, possibly 3 if you count one as minimally contributing to an eventual purchase that was ultimately made based on recommendations from friends
    • Total sponsored results clicked resulting in purchases not previously planned: 0
    • I’d always thought advertising basically doesn’t work on me; this seems like solid numerical evidence of that
  • I basically haven’t touched my calendar in over two years (not surprising, I’ve never had success keeping and regularly using a calendar)
  • I’ve created two docs/spreadsheets (one to track acid3 progress, one to track shared apartment/utility/etc. expenses with Jesse)
  • I have 12450 conversations in Gmail (most of it just archival storage of my college dorm’s mailing list, some other mail I’ve mostly ignored)
  • I have a tab and a theme in iGoogle, which I basically never use (prefer Ctrl+K in Firefox, or the non-customized home page)
  • I have one album in Orkut with nothing in it (probably auto-generated in the days when I was thinking of investigating Orkut’s JavaScript sandboxing implementation like I did Facebook’s; the account’s otherwise dormant)
  • I have four items in a Google shopping list, all dating back almost five years, all of which I still don’t have (“need” is far too strong a word for any of them)
  • I have 61 Reader subscriptions
  • No contacts in Talk, not even sure I’ve used it since it first came out
  • My YouTube account information until just now claimed I still live in Cambridge, MA

Of course, the search part is the most interesting bit, but there’s still a little gravy for me in the data on the other services. Does Dashboard reveal anything interesting to you about your interactions with Google?

Edit: Something else worth noting, after further exploration: their current UI for examining manual route changes in maps is clearly more prototyped than polished. It appears that every route change shows up as its own “search” in the map history UI, which results in dozens of “searches” showing up for viewing a single set of directions and modifying them to reflect some other choice of roads. (Except when I merely want to place a location on a map, I change the automatically-determined route nearly every time because I can’t bike on freeways like US-101, and nearly every generated route traveling up or down the peninsula uses it.)

23.10.09

pbcopy and pbpaste for Linux

Jeff @ 14:43

Mac OS X has the useful commands pbcopy and pbpaste. pbcopy reads the contents of standard input into the clipboard; pbpaste writes the contents of the clipboard to standard output. These commands aren’t part of the standard set of commands on Linux, but they’re easily added. Simply install the XSel program via a package management system or directly from source, then add these lines to ~/.bashrc. Voilà! Easy commandline access to the clipboard.

alias pbcopy='xsel --clipboard --input'
alias pbpaste='xsel --clipboard --output'

16.10.09

Working on the JS engine, redux

In a continuation of a topic started by mrbkap, I present you this gem of a gdb command I needed to use today:

cond 8 \
  (*$9 && \
   ((*$9)->id&7) == 4 && \
   (*(jschar**)((uintptr_t) (*$9)->id + 4))[0] == 'o' && \
   (*(jschar**)((uintptr_t) (*$9)->id + 4))[1] == 'f' && \
   (*(jschar**)((uintptr_t) (*$9)->id + 4))[2] == 'f')

For minimal background, breakpoint 8 was the result of watch *$9, and of course $9 = (JSScopeProperty **) 0x7ffff2f08038.

By the way, did you know the gdb command line supports line continuations? I didn’t, before I had to think about how the above command would display without any. 🙂 This is the above command as I originally wrote it:

cond 8 (*$9 && ((*$9)->id&7) == 4 && (*(jschar**)((uintptr_t) (*$9)->id + 4))[0] == 'o' && (*(jschar**)((uintptr_t) (*$9)->id + 4))[1] == 'f' && (*(jschar**)((uintptr_t) (*$9)->id + 4))[2] == 'f')

15.10.09

My distro can beat up your distro’s honor student. Or something like that. (Or: setting up ccache-powered Firefox builds in Fedora)

Jeff @ 22:23

dholbert makes a recent post (well, recent only in planet.mozilla.org’s little mind, no idea why a post from September 2008 is being displayed as new!) discussing how to build Firefox with ccache on Ubuntu, saving compilation time on close to null-program rebuilds. Cool beans. However:

If you’re on Fedora 11 (conceivably earlier too; I regrettably haven’t regularly used Fedora since Fedora 6, until recently), the basic developer tools package combo includes ccache, and caching Just Works in Firefox builds with no extra work needed at all.

[jwalden@the-great-waldo-search dbg]$ \
> ls -la `which g++` `which c++` `which gcc` /usr/bin/ccache
-rwxr-xr-x. 1 root root 43584 2009-02-23 23:42 /usr/bin/ccache
lrwxrwxrwx. 1 root root    16 2009-10-02 21:29 /usr/lib64/ccache/c++ -> ../../bin/ccache
lrwxrwxrwx. 1 root root    16 2009-10-02 21:29 /usr/lib64/ccache/g++ -> ../../bin/ccache
lrwxrwxrwx. 1 root root    16 2009-10-02 21:29 /usr/lib64/ccache/gcc -> ../../bin/ccache
[jwalden@the-great-waldo-search dbg]$ du -hs ~/.ccache
883M	/home/jwalden/.ccache

Anyway, use whichever distro you want, with ccache or without, whatever satisfies your preferences and utility curve. (The semi-troll title is completely gratuitous, but my sense of humor mandated I use it. 🙂 ) As for me: I am an absolute sucker for convenience. I’ve known of ccache for years and never used it before due to the activation energy needed to do so; had using ccache required equivalent effort in Fedora I strongly doubt I’d ever have used it. Score one for making the right choice for the user rather than requiring him to make it himself.
