<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Where&#039;s Walden? &#187; data mining</title>
	<atom:link href="http://whereswalden.com/tag/data-mining/feed/" rel="self" type="application/rss+xml" />
	<link>http://whereswalden.com</link>
	<description>Mozilla, politics, economics, law, backpacking, cycling, and other random desiderata</description>
	<lastBuildDate>Tue, 07 Sep 2010 20:10:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>I know what you Googled this summer, last summer, and the summer before (but not much before then)</title>
		<link>http://whereswalden.com/2009/11/07/i-know-what-you-googled-this-summer-last-summer-and-the-summer-before-but-not-much-before-then/</link>
		<comments>http://whereswalden.com/2009/11/07/i-know-what-you-googled-this-summer-last-summer-and-the-summer-before-but-not-much-before-then/#comments</comments>
		<pubDate>Sun, 08 Nov 2009 00:09:14 +0000</pubDate>
		<dc:creator>Jeff</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[history]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://whereswalden.com/?p=1056</guid>
		<description><![CDATA[Google collects a lot of information about its users. Or, more accurately, users give an awful lot of information to Google. (If you hadn&#8217;t guessed, I have little sympathy for people who complain about Google invading their privacy: if you don&#8217;t like the ways Google can use the information you give it, don&#8217;t use Google.) [...]]]></description>
			<content:encoded><![CDATA[<p>Google collects a lot of information about its users.  Or, more accurately, users give an awful lot of information to Google.  (If you hadn&#8217;t guessed, I have little sympathy for people who complain about Google invading their privacy: if you don&#8217;t like the ways Google can use the information you give it, don&#8217;t use Google.)  It&#8217;s therefore not surprising that Google comes in for a good share of complaints about its &#8220;invasions of privacy&#8221; or some similar alarmism.  Recently I stumbled across a mention of one service Google now provides to give users insight into what information Google tracks about them: <a href="https://www.google.com/dashboard/">Google Dashboard</a>, a one-stop shop directing you to modifiable views of much of the information Google has recorded about your interactions with it.  It currently covers these Google services:</p>
<ul>
<li>General account details (password, email address, <abbr lang="la" title="et cetera, and so on">&#038;c.</abbr>)</li>
<li>Alerts</li>
<li>Calendar</li>
<li>Contacts</li>
<li>Docs (&amp; Spreadsheets)</li>
<li>Gmail</li>
<li>iGoogle</li>
<li>Orkut</li>
<li>Product Search</li>
<li>Profile (the link you see at the top of results if you search for a person who&#8217;s created one and made it publicly available)</li>
<li>Reader</li>
<li>Talk</li>
<li>Web History</li>
<li>YouTube</li>
</ul>
<p>These services &#8220;are not yet available in this dashboard&#8221;:</p>
<ul>
<li>Google App Engine</li>
<li>Google Groups</li>
<li>Google Book Search</li>
<li>Google Subscribed Links</li>
<li>Google WiFi</li>
</ul>
<p>Skimming through the data yields this information about me, at a general level:</p>
<ul>
<li>Searching (since May 12, 2007):
<ul>
<li>Total searches: 16026 (speculation on where that puts me overall by searches/day?  I&#8217;m guessing top 5%, probably an even smaller percentage)</li>
<li>Total sponsored results viewed: 23</li>
<li>Total sponsored results viewed from searches made with no intention of buying anything (<abbr lang="la" title="id est, that is">i.e.</abbr> I searched for information I didn&#8217;t intend to use in making a purchase): 17</li>
<li>Total sponsored results which resulted in purchases: definitely 1, maybe 2 depending on how broadly you define &#8220;purchase&#8221;, possibly 3 if you count one as minimally contributing to an eventual purchase that was ultimately made based on recommendations from friends</li>
<li>Total sponsored results clicked resulting in purchases not previously planned: 0</li>
<li>I&#8217;d always thought advertising basically doesn&#8217;t work on me; this seems like solid numerical evidence of that</li>
</ul>
</li>
<li>I basically haven&#8217;t touched my calendar in over two years (not surprising: I&#8217;ve never had success keeping and regularly using a calendar)</li>
<li>I&#8217;ve created two docs/spreadsheets (one to track <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=acid3">acid3</a> progress, one to track shared apartment/utility/etc. expenses with <a href="https://www.squarefree.com/">Jesse</a>)</li>
<li>I have 12450 conversations in Gmail (most of them just archival storage of my college dorm&#8217;s mailing list, plus some other mail I&#8217;ve mostly ignored)</li>
<li>I have a tab and a theme in iGoogle, which I basically never use (prefer <kbd>Ctrl</kbd>+<kbd>K</kbd> in Firefox, or the non-customized home page)</li>
<li>I have one album in Orkut with nothing in it (probably auto-generated in the days when I was thinking of investigating Orkut&#8217;s JavaScript sandboxing implementation <a href="http://stuff.mit.edu/iap/facebook/">like I did Facebook&#8217;s</a>; the account&#8217;s otherwise dormant)</li>
<li>I have four items in a Google shopping list, all dating back almost five years, all of which I still don&#8217;t have (&#8220;need&#8221; is far too strong a word for any of them)</li>
<li>I have 61 Reader subscriptions</li>
<li>I have no contacts in Talk, and I&#8217;m not even sure I&#8217;ve used it since it first came out</li>
<li>My YouTube account information until just now claimed I still live in Cambridge, <abbr title="Massachusetts">MA</abbr></li>
</ul>
<p>Of course, the search part is the most interesting bit, but there&#8217;s still a little gravy for me in the data on the other services.  Does Dashboard reveal anything interesting to you about your interactions with Google?</p>
<p><ins>Edit: Something else worth noting, after further exploration: their current <abbr title="user interface">UI</abbr> for examining manual route changes in maps is clearly more prototyped than polished.  It appears that every route change shows up as its own &#8220;search&#8221; in the map history UI, which results in dozens of &#8220;searches&#8221; showing up for viewing a single set of directions and modifying them to reflect some other choice of roads.  (Except when I merely want to place a location on a map, I change the automatically-determined route nearly every time because I can&#8217;t bike on freeways like <abbr title="United States">US</abbr>-101, and nearly every generated route traveling up or down the peninsula uses it.)</ins></p>
]]></content:encoded>
			<wfw:commentRss>http://whereswalden.com/2009/11/07/i-know-what-you-googled-this-summer-last-summer-and-the-summer-before-but-not-much-before-then/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>
