Outputting to CSV in PostgreSQL

I was inspired to write my own blog post on generating CSV output in PostgreSQL after researching the topic and finding numerous posts with wrong answers. I hope that this post will be useful to others, and also to myself the next time I want to create CSV output in PostgreSQL.

If you just want a CSV dump of an entire table and order is not important, then run:

psql -c "COPY TABLE_NAME TO STDOUT WITH CSV HEADER " > CSV_FILE.csv

where TABLE_NAME is the name of the table you want to dump.

You can also dump the results of a more complicated query as follows:

psql -c "COPY ( QUERY ) TO STDOUT WITH CSV HEADER " > CSV_FILE.csv

where QUERY is the query you wish to dump. E.g.

psql -c "COPY ( SELECT * FROM TABLE ORDER BY id limit 10 ) TO STDOUT WITH CSV HEADER " > CSV_FILE.csv

What Not To Do

The typical naive suggestions involve attempting to generate CSV output using basic SQL. These approaches will generate broken CSV files in many cases such as when fields contain quotes and commas.  E.g.

psql -A -F ',' -c 'SELECT * from TABLE limit 10' > CSV_FILE.csv

A CSV file (despite the name) is not simply a bunch of values separated by commas. Generating proper CSV output requires handling a number of complicated corner cases. For example, you must handle values that contain commas, typically by quoting those values. We might try quoting every field, but then what about fields that contain quotes? The point is that generating proper CSV output is not something you’re going to be able to do with standard SQL without incredible complexity (if at all). Built-in CSV output functionality exists for a reason. Use it!
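To make these corner cases concrete, here is a small Python sketch (purely illustrative, separate from the psql workflow) showing what a correct CSV writer does with a value containing both a comma and embedded quotes:

```python
import csv
import io

# A field containing both a comma and embedded quotes
rows = [
    ["id", "comment"],
    [1, 'She said, "hello, world"'],
]

buf = io.StringIO()
csv.writer(buf).writerows(rows)

# A proper CSV writer wraps the tricky field in quotes and doubles
# the embedded quote characters:
#   1,"She said, ""hello, world"""
print(buf.getvalue())
```

A naive comma-join of the raw values would produce an unparseable line here, which is exactly the failure mode of the psql -A -F ',' approach above.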


Credits:

Many of the blog posts I found that suggested incorrect PostgreSQL CSV generation techniques contained comments describing the correct approach.

One such comment is listed below:

http://pookey.co.uk/wordpress/archives/51-outputting-from-postgres-to-csv#comment-206

Essential Android Apps

A friend of mine recently got an Android phone and asked me which apps I recommended. After putting together a list of essential Android apps for my friend, I realized that the list might be of interest to the wider Internet community, so I’ve decided to share it here. For each app, I’ve included a short description followed by a link to the Android Market which will allow you to install it directly onto your phone. I decided not to include apps like Google Maps since most phones come with it already installed, and I wanted this list to help people find new apps to install.

MyTracks is one app I highly recommend. It uses the GPS to track your location and then presents you with both a sharable map showing where you’ve travelled and statistics such as total distance, average speed, max speed, max elevation, etc. It’s especially nice to use during a hike or walk so that you know how far you’ve walked and where you’ve been.
https://market.android.com/details?id=com.google.android.maps.mytracks&feature=search_result#?t=W251bGwsMSwxLDEsImNvbS5nb29nbGUuYW5kcm9pZC5tYXBzLm15dHJhY2tzIl0.

Another good app is Google Goggles which, among other things, will allow your phone to read those increasingly ubiquitous QR codes.
https://market.android.com/details?id=com.google.android.apps.unveil&feature=search_result#?t=W251bGwsMSwxLDEsImNvbS5nb29nbGUuYW5kcm9pZC5hcHBzLnVudmVpbCJd

If you listen to podcasts or want to start getting into them, Google Listen is worth trying. It does a fairly good job of managing and downloading podcasts.
https://market.android.com/details?id=com.google.android.apps.listen&feature=search_result#?t=W251bGwsMSwxLDEsImNvbS5nb29nbGUuYW5kcm9pZC5hcHBzLmxpc3RlbiJd

OI Shopping List — This is a simple shopping list application but one I use quite frequently. Whenever I’m low or out of a particular item such as paper towels, I start OI Shopping List and add the item to my list. That way, the next time I’m at the store I just pull out my phone and get reminded about the item. Occasionally, I will also use it to make sure that I obtain all the ingredients for a complex meal.
https://market.android.com/details?id=org.openintents.shopping&feature=search_result#?t=W251bGwsMSwxLDEsIm9yZy5vcGVuaW50ZW50cy5zaG9wcGluZyJd

Kitchen Timer is a simple but useful app. As the name implies it’s basically a kitchen timer. One nice feature is that it lets you have 3 independent timers on one screen.
https://market.android.com/details?id=com.leinardi.kitchentimer&feature=search_result#?t=W251bGwsMSwxLDEsImNvbS5sZWluYXJkaS5raXRjaGVudGltZXIiXQ..

Amazon MP3 — This app is often installed by default on phones so you may already have it. If you choose to purchase music over the Internet as opposed to on CD, Amazon is a reasonably good choice. Their MP3 store is fairly complete and their prices are competitive. An added feature of Amazon is that your music is available in the Amazon Cloud player. Additionally their MP3s are DRM free.
https://market.android.com/details?id=com.amazon.mp3&feature=search_result#?t=W251bGwsMSwxLDEsImNvbS5hbWF6b24ubXAzIl0.

Finally, the regular Amazon app can be useful. Among other features, you can scan the bar code of an item to get the Amazon product page. The ethics of walking into a book store and price comparing with Amazon are very debatable and not something I’ll discuss here. But a very legitimate use of the Amazon app is to read product reviews. Reading product reviews on Amazon will often give you more information than simply reading a book’s dust jacket and flipping through it. Similarly, reading Amazon reviews is likely to be much more useful than simply reading the box of an electronic item.
https://market.android.com/details?id=com.amazon.mShop.android&feature=search_result#?t=W251bGwsMSwxLDEsImNvbS5hbWF6b24ubVNob3AuYW5kcm9pZCJd

Selectively Sharing Facebook vs. Google+ – Part 2 Selectively Sharing on Google+

Selectively Sharing on Google+

This is the second part of a 3 part blog post in which I compare selectively sharing content on Facebook and Google+. In this post, I show how to selectively share content on Google+.  This example assumes that you have created a circle called ‘Close Friends’.

Step 1. Log on to Google+ and type your message.

 

Step 2. Remove the default circles by clicking the ‘x’ on each one.


Step 3. Click the box that says ‘+Add circles or people to share with…’


Step 4. Since ‘Close friends’ doesn’t initially show up in the drop down menu, we need to click ‘10 more…’. (If ‘Close friends’ had appeared, we could have skipped to the next step.)


Step 5. Click ‘Close friends’ in the drop down menu.


Step 6. Verify that the message is being shared with the people that you expected by glancing at the circles in the box under the message text. After you’ve confirmed that the message is being shared with the right circle(s), click ‘Share’ to publish it.

 

Step 7. After the message has been published, you can check who it’s being shared with by clicking the text that says ‘Limited’.

 

Step 8. Google+ pops up a box showing small profile pictures of everyone with whom the post is shared. Mousing over these small pictures causes Google+ to show the person’s name and a larger picture.

Note: As far as I was able to determine, once a post is shared, there is no way to change which users are able to see it. Thus if you shared with the wrong set of users, the only thing to do is delete and repost.

 

Selectively Sharing Facebook vs. Google+ – Part 1 Selectively Sharing on Facebook

Selectively Sharing on Facebook

This is the first part of a 3 part blog post in which I compare selectively sharing updates on Facebook and Google+. In this post, I show how to selectively share content on Facebook.

Here are the steps for selectively sharing status updates on Facebook. This example assumes that you know how to create a friends list in Facebook. We have previously created a friends list called ‘close friends’ and we will create a post that is only visible to members of that list.

 

1. Load Facebook in your browser

 

2. Type the status message that you want to selectively share


3. Click the lock icon under the message

 

4. Select ‘Customize’ from the drop down menu

 

5. A box titled ‘Custom Privacy’ will pop up. There will be a drop down menu under ‘Make this visible to’, next to ‘These people’. Select ‘Specific People…’ from the drop down menu.


6. In the text box that appears, begin typing the name of the friends list that you want to share the post with. In this example, we have a friends list called ‘close friends’. Facebook will attempt to auto-complete the list name.


7. Click save settings

 

8. The visibility settings for the post will be updated so that it’s only visible to the members of the list you specified. However, it’s a good idea to mouse over the lock icon to verify that the settings are what you expected before sharing the post.


9. After you’ve verified the privacy settings, click share. The post is now published to your profile. If you want to verify its privacy settings later, you can hover over the lock icon.

 

Be aware that once something is shared to your profile, it’s no longer possible to change its privacy settings. If you discover that you’ve made a mistake — for example you shared a posting with the wrong group of friends — you’ll need to delete what you shared and repost it with different visibility settings.

(Note: You can, however, change the members of the friends list itself as a way of changing who has access to an already posted item. But it’s usually easier to delete the post and re-post with different privacy settings.)

 

Using PostgreSQL for an Ajax Autocomplete Field


I recently implemented an Ajax based autocomplete field for a web application using a PostgreSQL database on the back-end. I had trouble finding appropriate PostgreSQL queries, so I’ve decided to share my approach in the hope that it will be useful to others.

Data Format

The application uses the following PostgreSQL table to store the author data:

CREATE TABLE authors (
    authors_id serial PRIMARY KEY,
    author_name character varying UNIQUE NOT NULL
);

The author_name field represents the full name of an author. E.g. ‘jane austen’, ‘john smith’, ‘john q. public’, ‘william shakespeare’, etc.

Client Side JavaScript

On the client side, the application uses the jQuery UI Autocomplete widget (http://jqueryui.com/demos/autocomplete/). jQuery is a great library, and if you’re creating dynamic web pages and not using it, you really should be. However, jQuery is well documented, so I’m not going to discuss the details of the client side implementation. Additionally, the back-end code is general enough that it should work with other JavaScript libraries with minimal modifications. Essentially, when the user begins typing, the jQuery UI Autocomplete widget makes a GET request to the server with the text the user entered and expects the server to respond with a JSON object containing the results to show the user.
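For illustration, the exchange might look like the following (the authors_id values here are made up, and the exact URL depends on how the controller is mounted). A request such as GET /json_author_search?term=sm would return a JSON array like:

```json
[
  { "authors_id": 7,  "label": "john smith" },
  { "authors_id": 42, "label": "jane smithson" }
]
```

jQuery UI’s Autocomplete displays the label field of each object in the drop down list.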

Initial Implementation

Creating the initial version of the autocomplete field was relatively straightforward. To handle the Ajax we simply added the following method to our controller:

use JSON;

sub json_author_search : Local
{
    my ( $self, $c, $dashboards_id ) = @_;

    my $term = $c->req->param( 'term' ) || 0;
    $term = '%' . $term . '%';

    my $terms =
        $c->dbis->query( "select authors_id, author_name as label from authors where author_name ilike ? ", $term )->hashes;

    $c->res->body( encode_json( $terms ) );

    return;
}

This code was fully functional and worked with small data sets on a development machine. However, once we started testing on our production machine with a large dataset, it became clear that there were performance problems.

Running EXPLAIN showed that PostgreSQL was doing a sequential scan, i.e. it was reading through the entire database table for every query:

mediacloud=# explain select authors_id, author_name as label from authors where author_name ilike '%bla%';
QUERY PLAN
---------------------------------------------------------
Seq Scan on authors (cost=0.00..37.33 rows=1 width=22)
Filter: ((author_name)::text ~~* '%bla%'::text)
(2 rows)

Possible Solutions

I did some web searches looking for a solution and found Nikolay Samokhvalov’s slides on Using PostgreSQL In Web 2.0 Applications (http://www.scribd.com/doc/4844027/Using-PostgreSQL-In-Web-20-Applications-). These slides have useful information and are worth reading. However, their suggested approaches focus on keyword search rather than autocomplete. The first approach that Samokhvalov gives is to use prefix search. This solution uses like 'bla%' queries and relies on text_pattern_ops indexes for quick performance. Samokhvalov’s other tip is to use lower instead of using ilike. So in our case we would add the following index:

CREATE INDEX author_name_prefix ON authors USING btree ( lower(author_name) text_pattern_ops );

and change our search query to:

$term = $term . '%';
my $terms =
    $c->dbis->query( "select authors_id, author_name as label from authors where lower(author_name) like lower(?) ", $term )->hashes;

This approach is fast but it doesn’t give us the behavior that we wanted. For example, if the user typed ‘sm’, we would want ‘john smith’ to be included in the search results. However, ‘john smith’ would only be matched if the user instead typed ‘jo’.

Another solution that Samokhvalov suggests is using GIN indexes with full text search. This is an interesting approach but it didn’t seem like the right fit for an autocomplete service. As far as I was able to determine, PostgreSQL’s full text search is only designed to match entire words, so ‘sm’ would not match ‘smith’. Finally, Samokhvalov suggests an interesting hybrid approach that uses both prefix search and full text search with GIN indexes. This approach relies on having a separate table that contains all the distinct words in every string in the table that’s being searched. So in our case we would have an author words table that contained words such as ‘alice’, ‘bob’, ‘brown’, ‘david’, ‘smith’, etc. This approach might work well for relatively static data — creating the initial table of distinct words is easy. However, in our data set we are constantly adding new author strings, and I wanted to avoid the additional complexity of managing and updating an author words table.

My Solution

I added the following indexes to the authors table:

create index authors_name_varchar_pattern on authors(lower(author_name) varchar_pattern_ops);
create index authors_name_varchar_pattern_1 on authors(lower(split_part(author_name, ' ', 1)) varchar_pattern_ops);
create index authors_name_varchar_pattern_2 on authors(lower(split_part(author_name, ' ', 2)) varchar_pattern_ops);
create index authors_name_varchar_pattern_3 on authors(lower(split_part(author_name, ' ', 3)) varchar_pattern_ops);

Then I changed the query to:

$term = $term . '%';

my $terms =
    $c->dbis->query( "select authors_id, author_name as label from authors where lower(author_name) like lower(?) OR " .
            " lower(split_part(author_name, ' ', 1)) like lower(?) OR " .
            " lower(split_part(author_name, ' ', 2)) like lower(?) OR " .
            " lower(split_part(author_name, ' ', 3)) like lower(?) LIMIT 100 ",
        $term, $term, $term, $term )->hashes;

The above query performs prefix searches on each of the first three words within the author name string. Additionally, it performs a prefix search on the entire author name string. We OR these prefix searches so that we return the results that match any of the prefix searches. The nice thing about this search is that the user can start typing either the first name or the last name (or the middle name). For example, when the user starts typing ‘joh’, they’ll get a list that includes the ‘john’ names. When the user starts typing ‘smi’, they’ll get a list that includes the ‘smith’ names. Because we also do a prefix search on the entire string, the user could start typing ‘john’ to get a list of names starting with ‘john’ and then expand that to ‘john sm’ to get a list of ‘john smith’s and similar names.
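The matching semantics can be sketched in Python. The function below is a hypothetical stand-in for the SQL above, just to illustrate the effect of OR-ing the prefix tests against the whole name and its first three words; it is not the production code path:

```python
def matches(author_name: str, term: str) -> bool:
    """Mimic the SQL: prefix-match the lowered term against the whole
    name and against each of the first three space-separated words
    (analogous to split_part(author_name, ' ', 1..3))."""
    t = term.lower()
    name = author_name.lower()
    words = name.split(' ')[:3]
    return name.startswith(t) or any(w.startswith(t) for w in words)

names = ['john smith', 'jane austen', 'john q. public']
print([n for n in names if matches(n, 'smi')])      # last-name prefix -> ['john smith']
print([n for n in names if matches(n, 'joh')])      # first-name prefix -> both johns
print([n for n in names if matches(n, 'john sm')])  # full-string prefix -> ['john smith']
```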

(Note: it might have been safe to omit lower(split_part(author_name, ' ', 1)) like lower(?) from the query since it will often be redundant with lower(author_name) like lower(?). We decided to leave it in place to enhance readability and out of concern that it may be necessary to handle improperly formatted author strings.)

Conclusion

This approach does have its weaknesses. It won’t work for names involving more than 3 words; fortunately, those names are rare. It would also have been nice to be able to do full wild card searches such as like '%bla%'. Unfortunately, PostgreSQL’s standard indexes cannot optimize those types of searches.

In an ideal world, the database would have a set of indexes and operators that would perfectly handle the search queries necessary for autocomplete and abstract them away from the programmer. However, I’ve presented a solution that has performed well for me and has added only a minimal amount of complexity to the code and the database schema. I’d love to hear comments or suggestions for alternate approaches.

How The Social Network Got It Wrong: Facemash and the Dangerous Propagation of the Myth of Network Scarcity

How The Social Network Got It Wrong:

Facemash and the Dangerous Propagation of the Myth of Network Scarcity

There have been many reviews of “The Social Network” decrying the movie for missing or distorting some essential aspect of the Facebook story. For example, there are negative reviews from Lessig and Jarvis, both of which are worth reading. To some extent criticism is to be expected. Love it or hate it, Facebook is now too significant to ignore. Facebook represents an essential element of the zeitgeist. It’s the most successful business to emerge in the new millennium and hundreds of millions of people use it daily.

It is with some hesitation then that I have decided to add myself to the many grumbling reviewers. But what makes this review different is that instead of focusing on things that the film left out or themes that it should have included, I intend to focus on a specific and clearly documented and verifiable instance in which the film differs from real events: the technical effects of Facemash.com’s popularity on Harvard’s network. I will also explain why this deviation leads to the propagation of dangerous myths of scarcity.

The film relatively accurately depicts how, late one night before he started Facebook, Mark Zuckerberg decided to start Facemash. For those who haven’t seen the movie, Facemash presented the user with two randomly selected pictures of Harvard students and then let the user vote on which one was hotter. Zuckerberg really did start Facemash on a whim in his dorm room late at night, and he obtained the content by programmatically scraping Harvard house websites. Yes, the real site was just as tasteless as what was shown in the film and had all sorts of copyright and privacy problems. Both in reality and in the film, word of the site spread more quickly than Zuckerberg intended or expected. However, in The Social Network, the popularity of the site and the traffic it generates crashes the Harvard network. As both a computer scientist and a user of the Harvard network, it just didn’t seem plausible to me that the Harvard network would have crashed. (Even non-technical viewers might wonder about an apparent logical inconsistency: in the movie, Facebook becomes many times more popular within Harvard than Facemash ever was, but Facebook has no adverse effects on Harvard’s network.) I researched the issue extensively and found no evidence that the Facemash traffic had any significant effect on the University’s network. I therefore conclude that the crash of the Harvard network shown in the film is purely a dramatic fabrication.

The Crimson – the newspaper of record for Harvard College – is perhaps the best source of documentation on Facemash. A search through the Crimson archives shows that Facemash received much coverage through articles and editorials, most of which were negative. The social and moral implications of the site are extensively discussed. But in the many articles in the Crimson, I found no mention whatsoever of any adverse harm to the Harvard network infrastructure from the Facemash fiasco. What I did find is one article in which Zuckerberg said that the unexpected traffic made his personal computer, on which Facemash was hosted, so slow that he had trouble logging onto it. Facemash was actually online for a little over two days before Zuckerberg semi-voluntarily shut it down in response to widespread criticism (see the following article).

Why this Matters

I know that this difference may seem nitpicky. Showing a network crash is much more entertaining than showing a slow server, so why shouldn’t the director be able to take some dramatic license? The problem is that the Internet changes costs in ways that are difficult to conceptualize, and many people are misinformed. We as a society are only going to make sensible policy about the Internet if we have some intuitive understanding of how it works. I don’t expect movies to inform an audience about policy discussions, but I expect them not to misinform people. The policy implications of the two scenarios are dramatically different. If the sudden popularity of a single web site is enough to cripple an entire network, the only logical response is to come up with some technical or legal mechanism to regulate or restrict access to the network. For example, students might be restricted from running web sites on their computers, or there might be a mechanism in place to artificially limit traffic to student web sites. If, on the other hand, the unexpected success of a web site merely affects the site owner’s computer, there is no cause for concern. It is the site owners’ responsibility to optimize their sites or obtain more powerful servers, and if they don’t, no one else suffers. (It is still possible to argue for website restrictions for social rather than technical reasons – an approach followed by countries such as China and Iran.)

The above discussion is not purely hypothetical. Network owners often push the paradigm that network bandwidth is a scarce resource that must be rationed. This view was most famously advocated by Ted Stevens in his ‘series of tubes’ speech and is frequently reiterated in one form or another. Examples include the telecoms’ insistence that they need to be able to employ discriminatory network management policies, and the recent data usage limitations on AT&T’s iPhone plans and Comcast’s residential Internet. The individual issues have been debated elsewhere and a thorough discussion of them is outside the scope of this blog. However, the fundamental point is that the paradigm of scarcity creates dangerous incentives for network operators. If network operators can profit from rationing their capacity, there is no incentive to increase capacity. For example, cell phone companies have historically charged ridiculous rates for SMS data and voice overages. As mentioned above, AT&T recently stopped offering unlimited data plans to new customers. The company apparently decided that it was more profitable to simply ration network access and impose usage charges than to upgrade its network.

Competition can help, but once social expectations of scarcity are in place they tend to remain long after they are warranted. For example, when Hotmail started in 1996, it offered 2MB of storage per account, which may have been a reasonable amount of storage given the cost of disk space. But by 2004, disk space was hundreds of times cheaper, yet Hotmail still only offered 2MB of free storage. Hotmail was hardly unique; other email services such as Yahoo didn’t offer much more storage. Once a social expectation of scarcity was established, it remained in place long after the technical justification. The email market was competitive during this time: although there were high switching costs, there were also dozens of providers to choose from. However, this competition did not result in increased availability of email storage. Part of the problem may have been that Hotmail and other free services sold premium plans that included additional storage. Just as airlines are reluctant to improve coach for fear that 1st class will be comparatively less desirable, web mail providers may have been wary of improving their free offering for fear of losing sales of the premium plans. However, even the premium plans had surprisingly low storage. Finally, in early 2004, Google launched Gmail and this myth of scarcity was shattered overnight. Gmail initially offered 1GB of free storage – 500 times more than Hotmail’s free offering and many times more than Hotmail’s premium plan. Suddenly everyone’s expectations changed, and within weeks other email services increased their storage offerings.

Free market purists might argue that this story is a vindication of the market since the problem was eventually corrected. But it is important to remember that competition among the existing firms wasn’t enough; a change in social expectations only came about because a new company was able to enter the market. Imagine a market in which there was less competition and higher barriers to entry. The cell phone market is marginally competitive while the broadband market is essentially a duopoly, and both markets have high barriers to entry. Furthermore, imagine how much the world would have been improved if we hadn’t had to wait 8 years for this change to come.

It’s easy to laugh at the Ted Stevenses of the world. Stevens can easily be seen as a corrupt Luddite who was either too dumb, too lazy, or too crotchety to understand the Internet but was too arrogant to know how clueless he was. (Though I disagree with Stevens’s ‘series of tubes’ metaphor and some of his policy conclusions, I acknowledge that it is possible to defend them; however, his reasoning, examples, and misuse of terminology are not defensible and showed that he lacked any real understanding of the technology he felt confident enough to make policy on.) Criticism of Stevens is certainly warranted. He had a responsibility to make an informed decision about the issue and clearly failed in his duty as a public official. However, Stevens also serves as another cautionary tale: if someone whose job is to make and understand policy and who has a staff to assist him is nevertheless misinformed, how much chance do the rest of us have?

It is important that we as a society understand the Internet and its basic economics. I have attempted to push back against the paradigm of false scarcity that has been perpetuated by “The Social Network”. The wonderful thing about the Internet is that the cost of trying new ideas is low and consequently the risks of failure are minimal. I’m not defending the creation of Facemash – the site was clearly an invasion of privacy, hurtful, and generally a bad idea. However, the conditions that gave rise to Facemash also enabled the creation of Facebook. The important thing was that after the Facemash debacle, Zuckerberg could easily start again with something new. On the whole, experimentation and innovation make the world better. I hope that those who saw The Social Network will be inspired to build their own websites rather than fear that their unexpected success will cause some type of network meltdown.

Why I left Windows Mobile: A Personal Perspective on Microsoft’s Mobile Phone Failure

Why I left Windows Mobile

A Personal Perspective on Microsoft’s Mobile Phone Failure

Microsoft’s Windows Mobile (WM) was an early leader in smartphones and mobile devices but has largely been eclipsed by other players. For the last two years, it has been all but irrelevant in the mobile phone space. The iPhone, Android, RIM, and Palm have all generated significantly more excitement in the press and user communities. Largely because of legacy installs, WM does more respectably when you look at the percentage of phones running each OS. However, it is clearly being overtaken by others, as shown by new phone sales. In terms of mind share, the press has been more focused on the iPhone, Android, and to a lesser extent RIM and Palm’s webOS. Among my friends and colleagues, I see a similar trend: the technoliterati and technophobes alike are ignoring WM as they happily flash their new iPhones and Androids. In October of this year, Microsoft released Windows Phone 7 in an attempt to recapture its lead.

I have not used or studied Windows Phone 7, so I won’t try to evaluate it. (Note: If anyone wants to give me a Windows Phone 7 device, I’ll happily test it and review it in this blog.) However, products as complicated as operating systems are often difficult to accurately evaluate before sufficient time has passed from their initial release. For example, although Windows Vista is now nearly universally considered junk, it initially received many positive reviews. So rather than talk about Windows Phone 7, I hope to provide a personal historical perspective on why WM has been a failure so far. I’m going to describe my experiences as someone who used earlier versions of Windows Mobile for nearly 5 years and finally became disillusioned enough to leave the platform.

Note: This post is based on a series of notes that I took in late 2009 after I switched away from Windows Mobile. I intended to blog my experience then, but various events intervened. I feel that the recent release of Windows Phone 7 provides an interesting opportunity for a look back on prior versions, so I decided to update and finish this post. While fleshing out and expanding my original notes, I’ve tried to maintain the flavor and freshness of the original where possible.

About myself

I’m a software professional and largely a Linux user. I used Windows Mobile for nearly 5 years and owned two separate phones. I still believe that at the time I bought the phones they were the best choices for me, but they were clearly flawed devices. Those who defend Windows Mobile often point out that there are tweaks and workarounds for many of its shortcomings. I spent days if not weeks of my life fiddling with these phones. I read howardforums.com and xda-developers. I tweaked registry settings and installed custom ROMs. This certainly helped but, at the end of the day, these phones were still deeply unsatisfactory.

Motorola MPx220: My first Windows Mobile Device

I bought the Motorola MPx220 in January of 2005. I paid $300+ before rebate.  Prior to that phone I had a simple Samsung monochrome phone.

Why I chose the phone

I wanted a phone that still felt like a phone, and the MPx220 had a small form factor as a flip phone. Price was also a consideration: touch screen PDA phones were around $600 in those days. The phone ran Windows Mobile 5 for Smart Phones (WM5).

Advantages

Compared to standard cell phones of the time, the MPx220 was relatively open. It had a mini SD slot for storage expansion and could be connected to a computer through USB. This meant that I didn’t have to pay what I would call “sucker” fees. For example, I still can’t believe people used to pay to get pictures off their phones. With the MPx220, it was possible to extract pictures simply by hooking it up to a computer through USB. It was also possible to add ring tones simply by copying an MP3 or MIDI file into the appropriate directory.

Overall Opinion

The MPx220 showed me the promise of an open mobile computing device but utterly failed to achieve it.  The software included with the phone was frustratingly limited. For example, the voice recorder did not have a pause function – something that would have been trivial to add. The UI felt unnecessarily complex.  It seemed designed to mimic Windows XP rather than to provide the best user experience. Indeed the UI on my old Samsung felt much more intuitive. There were few third party applications available and they were pricey. Finding and installing apps was a hassle. Much to my annoyance, I found out that Windows Mobile and Windows Mobile for Smart Phone were different and incompatible OSs. Though I’m sure there was an acceptable technical justification for having two separate OSs, there is no justification for such confusingly similar names.

Other limitations resulted from the hardware. Since the phone only had a number pad, entering text was a hassle. This effectively made it a read-only phone. Also, the camera wasn’t great (around 1.3 MP). The MPx220 could play music, and there was even a third-party application that could play OGG files. However, because the phone didn’t have a standard headphone jack, an adaptor was required. I occasionally used the MPx220 for podcasts, but much to my disappointment it did not become an MP3 player replacement for me.

My next phone: the HTC Tilt

I used the MPx220 until 2008. My contract had expired, but the phone worked and I hadn’t found anything else that was super exciting, so I hung onto the MPx220. Then the phone started having problems. The hinge for the flip top had come loose. I jury-rigged the phone back together using a plastic pen cap and kept using it. Then the screen stopped working. It was finally time to get a new phone. There had been news stories about the Openmoko phone. I was intrigued by it, but a usable version still wasn’t available. A release date had been set for fall of 2007, then pushed back to December 2007, then pushed back again. It was spring of 2008, I needed a phone, and Openmoko was starting to seem like vaporware. The original iPhone had come out the previous summer, but I decided against it. At the time there were no third-party apps on the iPhone. Since I had music in OGG format, this meant there was no way to play it. Also, the 1st-gen iPhone hardware seemed limited — it was only 2G and it didn’t have a keyboard. The iPhone was also pricey at $399 (http://blogs.law.harvard.edu/dlarochelle/2010/01/14/iphone-price-history/). So I went with the HTC Tilt, also called the HTC Kaiser. Hardware-wise the Tilt was superior to the iPhone. Among other things, it had 3G and GPS. Several of my coworkers had the phone and were happy with it. I was able to find a refurbished Tilt that was free after rebate with a two-year contract.

My experience with the Tilt.

The Tilt ran Windows Mobile 6 (WM6) and had a touch screen and slide-out keyboard. The phone initially came pre-loaded with a bunch of crapware, but early on I reflashed the phone with a custom ROM from XDA Developers. This got rid of the pre-installed junk and greatly improved the interface. The Tilt did a great job syncing with the Exchange server at work.

One annoyance was that interfacing with the phone required Microsoft products. Rather than allowing the phone to appear as a mass storage device, you needed to use ActiveSync to access it. Not only did this mean that you needed to have ActiveSync installed, but copying files to the device was somewhat convoluted. Though the phone’s file system showed up in My Computer and Explorer, it wasn’t a real directory. There was no drive letter, and copying to and from the phone was slower than with a real file system. Also, even when the phone was attached to a computer, you could only copy files back and forth. There was no way to use the computer to directly edit a file stored on the phone. For example, suppose you had a text file on your phone that you wanted to edit, and you wanted to use a computer because, well, editing on a computer is easier. Instead of just connecting the phone to the computer, finding the file, and then opening and editing it, you would need to copy the file from the phone to the computer, make your edits, and then copy the file back.

Additionally, you needed Outlook to sync your contacts and calendar. I happened to have Outlook because it was included with the MPx220, but the Tilt did not include it. Without Outlook the phone would have been nearly useless. Existing contacts would have had to be re-entered into the phone manually — a tedious process even on a phone with a QWERTY keyboard. Furthermore, Outlook was necessary to back up calendar and contact data. With a smart phone, backups are vital, and not just because they will protect you in a disaster. If you don’t have backups, you won’t trust the phone. If you don’t trust the phone, you won’t put your data on it. And the whole point of a smart phone is to have access to all your important data at any time.

Email

Although the phone worked beautifully with Exchange, it was less useful for POP3 and IMAP. POP3 wasn’t designed for mobile devices, so I don’t fault the phone. However, the IMAP support was inexcusably buggy. In the fall of 2008, I began using IMAP for both home and work email. I noticed that while the phone could receive messages, it couldn’t send them. Sometimes I could get around the problem by deleting and re-adding email accounts, but the problem would soon return. As you might imagine, being able to receive messages but not send them was frustrating, to put it mildly. (There were a few times when I actually used the phone’s browser to send very urgent emails, but that was tedious.) Finally, Microsoft released a fix for the bug. I don’t know exactly how long the bug existed, but it was at least months. This issue would have been obvious to anyone who used IMAP for a significant period of time. IMAP seems to have been viewed by Microsoft as a checkbox feature rather than something they actually cared about enough to get right. It also speaks very poorly of Microsoft’s QA practices that the bug was not caught prior to release and existed so long in the wild.

Other Thoughts

The Tilt was my first Internet connected mobile device and it definitely enhanced my life. In addition to email, it was nice being able to see the weather report and to Google something when I was out to lunch. Google Maps was a great feature and although it didn’t do turn-by-turn voice guided car navigation, it was great for getting basic directions or finding places on foot.

The Internet connection made installing apps easier since no computer was required, but it was still a hassle. There was no app store, so finding apps from the phone involved using the mobile version of Internet Explorer to Google for the app. The process was painful enough that you were probably better off just using a computer to at least find the application, if not to download it. A number of vendors would actually text you links that you could open on your phone to download the application.

Microsoft makes some improvements

In mid-2009, Windows Mobile 6.5 became available. The update helped somewhat but didn’t really address the limitations mentioned above. Then, in the fall of 2009, Microsoft finally announced its own app store. However, the app store, while long overdue, was fairly limited.

Google and Windows Mobile

Looking back, it is ironic that Google, of all companies, did the most to add value to the Windows Mobile experience. As mentioned above, Google Maps was one of the killer applications. (In fairness, the same could be said about the iPhone.) Another very useful app from Google was the YouTube player.

Windows Mobile’s email support for anything other than Microsoft Exchange was abysmal. Google eventually added sync support for Windows Mobile using the Exchange protocol, and suddenly Gmail users were able to get significantly more value from their phones. (However, WM only supported a single Exchange account, so if you needed to use Exchange for work email, you couldn’t also take advantage of Gmail’s Exchange support.)

Android

I’d been curious about Android for a while. The Tilt is very similar to the G1: both phones are made by HTC and have essentially the same form factor. Since Android is open source, people had been trying to port it to the Tilt since before Android was released. Their efforts reached some fruition in the fall of 2009, and I was able to get Android running. The result was usable but not stable; it required a hard reset about once a day. Still, it gave me a chance to try Android, and I concluded that it was better than Windows Mobile.

Why I liked Android Better

The Android interface and UI just felt cleaner and more polished. Among other things, the browser on Android was better than IE on WM, and the Gmail integration on Android was much better — in particular, archive worked. The app store on Android had more and better apps than WM’s. Adding an app store to WM was a huge improvement; however, the WM store was new while Android’s had been around for over a year, and the WM store just didn’t have the breadth of apps that Android’s did. Additionally, there were apps, such as Google Voice, that weren’t available on Windows Mobile.

Android also didn’t require anything like ActiveSync. You simply connected the phone to the computer, and then you could mount the file system like a flash drive. Indeed, I’ve used the phone as a flash drive to exchange files. The nice thing is that this works on any computer, not just the few that have ActiveSync installed. Imagine the difference between saying to someone “Just hook my phone up to your computer and then we’ll copy the file onto it” vs. “I need you to download and install ActiveSync. Then we’ll hook up the phone, but we need to make sure the phone doesn’t sync with your Outlook. Then we can copy the files to the phone. After that you’ll probably want to uninstall ActiveSync since you won’t need it anymore.”

Dropping WM

After tasting Android, I didn’t want to go back to Windows Mobile. At the same time, the hacked port of Android running on the Tilt was not stable enough to rely on. I could have tried to tweak Android so that it would run better on the Tilt, but I decided that I didn’t have the time. Instead, I sold the Tilt on Craigslist and put the money toward a used G1. The G1 cost more than I sold the Tilt for, and I also had to pay to unlock the Tilt, so my out-of-pocket cost was around $50, but it was well worth it.

Since switching to Android, I haven’t looked back. Shortly after switching, I wrote that although I had used Windows Mobile for nearly 5 years, “Windows Mobile is dead to me and is essentially dead in the market place.” I have no idea whether the recent release of Windows Phone 7 will somehow enable the OS to magically rise again like a phoenix (or a zombie). However, I have attempted to convey my experience as a user who gave Windows Mobile a fair try but found it wanting. In so doing, I hope I have provided some insight into what Microsoft could have done differently to maintain its lead in the mobile space.

GTDinbox Review

Update: 18 June 2010 – GTDinbox was recently renamed ActiveInbox for legal reasons, and new features have been added. I’m not going to rewrite this review, but check the GTDinbox/ActiveInbox home page for details.

For the past two months, I’ve been using a task management tool called GTDinbox, which I installed based on the recommendation of a colleague. Overall, I’ve been pleased with GTDinbox and plan to continue using it, but I would not say that it’s a perfect tool.

GTDinbox is implemented as a Firefox add-on. Its rather ambitious goal is to turn “Gmail into a unique task manager to effectively manage your inbox, reduce email overload and maintain inbox zero.” Unlike services such as RememberTheMilk, which has its own cloud backend, GTDinbox builds on top of Gmail. This is a somewhat unusual approach, but Gmail has a number of features that make it a good basis for a task manager. Gmail offers free, network-accessible, secure cloud storage that can be accessed through any web browser or, less elegantly, through any IMAP client. Additionally, many tasks begin life as emails, so integration with Gmail means that these tasks don’t have to be re-entered into a new system.

Once you download and install the GTDinbox Firefox add-on, GTDinbox modifies the Gmail UI. When you view an email, four buttons appear at the top of it, allowing you to assign the email a status by marking it as either Action, Waiting On, Some Day, or Finished. You can also assign emails to contexts (e.g. “home”, “work”), projects, or references. GTDinbox adds a box to the sidebar that allows you to view tasks by project, context, or status. Once you have categorized an email, you can safely archive it to clear up your inbox and then use GTDinbox to track it.

Internally, GTDinbox works by assigning labels to emails. Context labels start with C/ (e.g. C/Home or C/Work), project labels start with P/, and status labels start with S/. While these labels have special meanings to GTDinbox, they are also regular Gmail labels; when viewing a message, the GTDinbox labels show up just like any others. This approach is simple, yet it works surprisingly well. I also really like the transparency that comes from using Gmail labels as the basis of the system. When you view active tasks within a context or project, GTDinbox simply displays a Gmail search page using the labels as search criteria. For example, the following search string is used for the home context: “label:c-home (label:s-action OR label:s-some-day OR label:s-waiting-on) -label:s-finished”. This means that you have less reason to worry about losing your task data. Even if GTDinbox stopped working, you could still access your tasks directly by viewing labels within Gmail. It also means that you can manipulate your tasks from any mail client. Obviously, accessing and manipulating your tasks this way is a lot less fun, but at least it’s possible. For example, I will often manually move order-shipment emails to the S/Waiting-on label using my smart phone.
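To sketch how these label-based searches fit together, here is a small hypothetical helper (not part of GTDinbox itself) that reproduces the search-string convention described above:

```python
def gtd_context_query(context: str) -> str:
    """Build a Gmail search string for the active tasks in one context,
    following the GTDinbox label convention (c- for contexts, s- for statuses)."""
    active_statuses = ["s-action", "s-some-day", "s-waiting-on"]
    status_clause = " OR ".join(f"label:{s}" for s in active_statuses)
    # Match the context label, require any active status, exclude finished tasks.
    return f"label:c-{context} ({status_clause}) -label:s-finished"

print(gtd_context_query("home"))
# label:c-home (label:s-action OR label:s-some-day OR label:s-waiting-on) -label:s-finished
```

Pasting the resulting string into Gmail’s own search box shows the same task list GTDinbox displays, which is exactly the transparency described above.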

Other features

Want to enter a task that doesn’t come from an email? GTDinbox adds a “Compose Personal” link. This is like the standard “Compose Mail” except that it adds the email directly to your inbox.  You can then label the email as a task just like an outside email. “Compose Personal” is also useful for sending notes to your smart phone.  Now instead of writing down an address or room number, I’ll often just use “Compose Personal” so I can access it by viewing email on my smart phone.

Downsides

Software Environment

GTDinbox requires that you use Gmail. Most people use Gmail for at least one of their personal email accounts, so this is not a huge burden. However, when tasks originate as emails to other accounts, such as a work address, some of the convenience of GTDinbox is lost: you must forward the email to Gmail or enter it manually using “Compose Personal” — much less elegant approaches.

GTDinbox is a Firefox add-on, so you must use Firefox to access Gmail in order to use it. This may not seem like a big deal — everyone uses Firefox — but it means that you can’t use tools such as Prism with GTDinbox. I’ve found Prism to be a more streamlined way to access Gmail than Firefox; however, I stopped using it in order to use GTDinbox. Additionally, you must have permission to install Firefox add-ons on the computer you’re using. This may be a problem if you frequently access your email from public computers in places such as school computer labs or Internet cafes. There, it is likely that you will either be unable to install the GTDinbox add-on because Firefox is locked down, or you will have to reinstall it every time you access Gmail. For security reasons, I no longer access Gmail from untrusted computers, so this has not been an issue for me.

Functional Limitations

Often, the same task involves multiple emails. For example, a single online order may involve three or more emails: order confirmation, payment confirmation, and shipment confirmation. I usually mark all emails related to an order as Waiting On until I receive the item. Unfortunately, GTDinbox provides no way to combine multiple emails into a single meta-task, so these three emails will show up as three unrelated tasks. This limitation is compounded by the fact that you can’t manage tasks within the message list view; you must click through to the full message to mark a task as finished or assign it to a project or context.

No Tickler

Using a tickler file to defer tasks until a certain date and then process them in your inbox is one of the core pieces of GTD. My biggest complaint about GTDinbox is that there is no way to tell it that you want to defer a task until a certain day and have it hide the task until then. The closest thing that GTDinbox offers is marking a task as “Some Day”, but “Some Day” tasks still show up in context and project views.

Emails Not Assigned a Status Are Untracked

This is one of the things that tripped me up when I first started using GTDinbox. GTDinbox will not track an email unless you assign it a status — by marking it as either Action, Waiting On, Some Day, or Finished. Assigning the email to a project or context is not sufficient. Once I became aware of this limitation, it wasn’t a big deal. Still, it seems like something that could be fixed.

Can’t edit Email tasks.

GTDinbox is based on Gmail and thus inherits its limitations. Gmail does not let you edit emails, and for normal email this makes sense. However, if you enter a task using “Compose Personal” and later want to add a note or correct something, there is no way to do so. The best you can do is to reply to the email, or mark the email task as finished and then create a new task with the changes.

Not maintained or endorsed by Google.

GTDinbox relies on editing Gmail’s HTML to add its enhancements. Although Google does not do anything to overtly block GTDinbox, it does occasionally change Gmail, and sometimes these changes are enough to break GTDinbox. When this happens, the creators of GTDinbox are usually quick to release a version that fixes things; I’ve only been without a working GTDinbox briefly. To minimize downtime, install GTDinbox from the Firefox add-on page rather than the GTDinbox site so that you get auto-updates.

Conclusions

I’ve tried various methods of task management, and so far GTDinbox is the best I’ve found. But it is not a perfect tool; piggybacking off of another service means that it’s less polished. I get the sense that GTDinbox could be truly amazing if it were acquired by Google and baked into the fabric of Gmail. Still, there is much to be said for the simplicity and transparency of its design. It’s a great tool for taming your inbox and managing tasks. Some people swear by paper task lists or local text files, but if you’re looking for a cloud-based task management system, GTDinbox is well worth trying.

iPhone Price History

While preparing for a panel at Arisia entitled “Early Adopters and Not” on why people choose to be early adopters, I was trying to find the price history of the iPhone. Since I couldn’t find a web site that presented historical price data on the iPhone, I decided to put something together myself. I’ve decided to share the table I created so other people won’t have to repeat my efforts.

One of my reasons for generally not being a super early adopter is that you pay a huge premium for having the latest and greatest gadget when it’s first released; if you wait a little bit, you can get the same or a better gadget for a lot less. Because the iPhone’s price is entirely set by Apple, it makes an interesting case study in how much the price of technology drops over time. The official price of the iPhone periodically drops, as shown in the table below, but there are no sales, and a new iPhone is never sold for less than the official price. (There are occasionally sales on refurbished iPhones; for example, on Black Friday the refurbished 3GS was sold for $50.00 instead of the usual $150.00.) Finding historical street price data is harder than finding historical MSRP data; for the iPhone, both prices are the same.

The table showing the historical prices is included below. Note: I’m sharing this table because I couldn’t find anything like it online, so there may be errors or omissions. For example, it’s possible that I missed a price change, that there is an iPhone model I don’t know about, or that some of the prices are wrong. The table only includes new iPhones within the United States. In case you’re wondering, I don’t own an iPhone (I went with Android instead).

Date         | 1st Gen 4GB  | 1st Gen 8GB | 3G   | 3GS 16GB | 3GS 32GB
29 June 2007 | $499.00*     | $599*       | N/A  | N/A      | N/A
5 Sept 2007  | Discontinued | $399        | N/A  | N/A      | N/A
June 2008    | N/A          | N/A         | $199 | N/A      | N/A
June 2009    | N/A          | N/A         | $99  | $199     | $299

* Customers who purchased in the 14-day period before September 5, 2007 were eligible for a $200.00 “price protection” rebate. Other customers were eventually given a $100.00 credit toward Apple purchases.
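To put a number on the early-adopter premium discussed above, here is the drop for the 8 GB first-generation model, worked out from the dates and prices in the table:

```python
# Price drop of the 8 GB first-generation iPhone (dates and prices from the table above).
launch_price = 599  # 29 June 2007
cut_price = 399     # 5 Sept 2007

drop = launch_price - cut_price
percent = 100 * drop / launch_price
print(f"${drop} drop, {percent:.0f}% off, about ten weeks after launch")
# $200 drop, 33% off, about ten weeks after launch
```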

Sources:
http://en.wikipedia.org/wiki/IPhone
http://en.wikipedia.org/wiki/History_of_the_iPhone
http://online.wsj.com/article/SB124448050292994727.html
http://gizmodo.com/5015540/iphone-3gs-true-price-compared
http://www.intomobile.com/2009/06/08/apple-announces-iphone-3g-price-cut-to-99.html

Comments or corrections appreciated.

Using SSL to Prove Document Authenticity

This blog post is an idea that I’ve been kicking around for a while but haven’t had the time to research or implement. I’ve finally decided just to post it speculatively. I’m really hoping to get feedback from those in the community more knowledgeable about SSL than I am. Note: this is a relatively geeky topic; if you don’t understand what https:// and SSL are, this post won’t make much sense.

Introduction

Does anyone know anything about the internals of https? I was wondering if there is any way to prove that a document downloaded over https really came from the site you claim it came from. In other words, if you download a document over https, is there any way for you to prove to a third party that it actually came from that web site? For example, let’s say that Alice downloads doc.pdf from https://foobar.com/doc.pdf. https provides Alice assurance that doc.pdf really came from foobar.com (assuming that the certificate is legitimate). But assuming doc.pdf does not carry a digital signature, if Alice simply sends the downloaded file to Bob, he has no proof that the file actually came from foobar.com. (Obviously, the ideal solution would be for the maintainer of foobar.com to digitally sign the PDF file. But few websites digitally sign the files they distribute, and individual users often have no means of convincing a web site to do so.) My question is whether there is any way for Alice to prove to Bob that she really obtained the file from foobar.com. I thought that it might be possible for Alice to prove the file’s origin by sending some of the raw network traffic establishing the SSL connection along with the file. (I’m using a PDF file to simplify the example, but presumably the same issues would apply to a web page.)

Use Cases

PACER is an online service used by the United States federal courts to provide online access to court records and documents. The documents on PACER are generally thought to be in the public domain but remain behind a paywall. Efforts such as the PACER Recycling Project and RECAP allow users to upload PDF documents obtained from PACER to a central server, where the documents can then be freely downloaded by others. However, while PACER uses SSL, it does not provide digitally signed PDF files. Thus users currently have no way to prove that the documents really came from PACER.

Another use case is as a replacement for web screen shots. Because web pages can be easily altered or taken down, screen shots are often offered as “proof” that a web page used to exist even if it has since been altered or removed. For example, this CNET news story describes how pranksters from 4chan retaliated against AT&T for blocking their site by posting a fake report saying that AT&T’s CEO had died. The story includes this screen shot of the pranked web page prior to its removal. Of course, screen shots can be easily faked or altered using tools such as Photoshop, or just by saving and editing the HTML. Presumably web screen shots posted by CNET are relatively trustworthy, but what about screen shots posted by unknown users?

Ideal Solution

I envision a Firefox extension that would allow a user to easily create an archive bundle for an https web page, containing the page along with SSL information proving its legitimacy. (Obviously this would need to work for single files as well as web pages.) This bundle would allow other users to view the web page or file as it existed and provide easily verifiable proof that it really came from the site in question.

My Questions for the SSL Knowledgeable

Is this doable at all?

Screen shots are trivial to fake. If this approach can’t provide perfect proof of a document’s origin, how much more assurance would it give you than just a screen shot?

Would releasing the raw https traffic also mean that Alice would be releasing her user name and password?

A minor concern is that the fact that a web site hosted or displayed a particular page is slightly different from the web site signing a file. Furthermore, there may be issues with XSS vulnerabilities that allow attackers to make an https web site display arbitrary content. However, XSS attacks are a problem now with screen shots being passed around, and XSS-altered pages could probably be detected by viewing the HTML source.

But Not All Web Sites Use SSL

It has been repeatedly shown that web 2.0 applications such as Gmail and Facebook cannot be used securely over an unencrypted connection. For example, hijacking the account of a Facebook user on the same network is trivial. Perhaps I’m being overly optimistic, but I believe that once these vulnerabilities become more widely known and attack scripts/exploits become widely available, web applications will move to SSL as the default or at least offer https as an option. (Gmail already has an option to enable https, though it is buried deep within the settings.)

Please Comment

There you have it: my first real blog post.  Please let me know what you think.

Update December 13, 2009

Unfortunately, it appears that this won’t work. The basic problem is that SSL uses a shared key, so the client could easily forge messages. (Initially, technically unsophisticated users might not be able to forge messages and sign them with the key, but someone would probably develop an automated tool to do it.) I still hope that at some point a standardized way to show what a web page previously displayed will emerge that’s harder to forge than screen shots. Many thanks to Paco Hope and his colleagues at Cigital for providing feedback on this.
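The shared-key problem can be made concrete with a minimal sketch (the key and messages below are made-up illustrative values, and real TLS record protection is more elaborate than a bare HMAC): because both endpoints of an SSL connection derive the same symmetric keys, either one can produce a valid MAC over arbitrary data, so a saved transcript cannot convince a third party which side authored it.

```python
import hashlib
import hmac

# Illustrative shared key; in SSL/TLS, client and server both derive
# the same symmetric session keys during the handshake.
session_key = b"illustrative-shared-session-key"

def mac_record(key: bytes, record: bytes) -> bytes:
    """MAC a record the way either endpoint of the connection could."""
    return hmac.new(key, record, hashlib.sha256).digest()

# A record genuinely produced by the server...
genuine = mac_record(session_key, b"doc.pdf bytes as served")

# ...and a record forged by the client using the very same key.
forged = mac_record(session_key, b"doc.pdf bytes, tampered")

# Both verify against the shared key, so a third party examining Alice's
# saved traffic cannot tell server-authored records from client forgeries.
assert hmac.compare_digest(genuine, mac_record(session_key, b"doc.pdf bytes as served"))
assert hmac.compare_digest(forged, mac_record(session_key, b"doc.pdf bytes, tampered"))
```

This is exactly why a detached signature made with the server’s private (asymmetric) key, which the client never holds, would solve the problem where a recorded SSL transcript cannot.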