Why ‘gestures’ suck

I’ve not blogged in a while, and though I’ve said I’d try to make my blog less of a platform for public bitching and whining, I figure it’s Christmas, so I should get to do what I want. So this is a blog post on why all ‘gestures’ in applications suck, why ‘gestures’ are always a bad idea, and why, if you’re implementing ‘gestures’ in your application, you’re doing it wrong. Of course, this is all my personal opinion and I’ve done only the most cursory amount of HCI study, so take it with a pitcher of salt.

Great user interfaces are made great by building on a user’s familiarities. This makes a lot of sense. If someone designs an icon to represent an action, they find the nearest every-day analogy that has a clear and identifiable visual, and base it on that. Mail icons involve envelopes, print icons involve printers, search icons involve magnifying glasses (ok, that last one relies pretty heavily on cultural knowledge which is probably questionable nowadays, but bear with me). This should carry through to all aspects of HCI. People will find things easier if they can apply a skill they already have, or relate it to something they’re already familiar with.

Touch-screens are becoming a much more common input device these days, and they’re one I’ve been interested in for a very, very long time. Now that they’re becoming more common, more people are trying to retro-fit their applications to work better with this new interface. And this seems to be where ‘gestures’ come in. People see pinch-to-zoom, or dragging on the iPhone/Pad/Pod (and I’m just going to reference those, because, as far as I’m concerned, they’re the only devices that have gotten touch-interaction close to being right), and they seem to think “Hey, that’s cool, I should put those actions in my application!” STOP.

I have a newsflash – and I’m sure this is just pointless ranting for a lot of people, but I’ll say it anyway – pinch-to-zoom and dragging are not ‘gestures’. They are physical manipulations that have a logical result. You don’t ‘execute a pinch-to-zoom gesture’ when you zoom in on a web-page or photo on an iPad. You put two fingers on the screen and you move them closer together or further apart, because it makes physical sense. When you put your finger on the surface, it responds with minimal latency – it immediately establishes that placing your finger on this surface attaches your finger to that point on the surface. From there, pinch-to-zoom makes perfect sense and follows logically. These aren’t ‘gestures’, these are direct and logical manipulations of a surface. And that works. Having an instant and reliable response to an action is a very powerful tool.
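To make that distinction concrete, here’s a minimal sketch of the logic behind pinch-to-zoom (illustrative Python with made-up function names, not any real touch API): the zoom level is fully determined by where the two fingers are, so there is nothing to ‘recognise’ and nothing to wait for.

```python
# Illustrative sketch: pinch-to-zoom as a direct manipulation.
# The scale factor falls straight out of the finger positions.
import math

def distance(p1, p2):
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def pinch_scale(start_f1, start_f2, cur_f1, cur_f2):
    """Scale factor implied by two fingers moving on the surface."""
    return distance(cur_f1, cur_f2) / distance(start_f1, start_f2)

# Fingers start 100px apart and end 200px apart: the content is now
# exactly twice as large. No interpretation, no recognition step.
scale = pinch_scale((0, 0), (100, 0), (0, 0), (200, 0))  # 2.0
```

Because the output is a pure function of the current finger positions, it can be applied on every frame while the fingers are still down – which is exactly what makes it feel like you’re holding the content.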

If you’re a gestures fan, you may now be thinking “Well, the difference is academic, surely?” and I would disagree very strongly with that. A gesture, by definition, is a movement made to express an idea. With a gesture, it’s ok that you do one thing and then, afterwards, something happens. With a gesture, it’s ok that whatever follows may not be directly linked with the gesture you made. And this is often the feeling you get when you use an application that has ‘gestures’. You make a gesture, and then, after the application has considered things, it does something. There is no guarantee that what you do will have an instant and well-defined reaction. And as long as we continue to call these actions ‘gestures’, this will always be ok, because this is the definition of a gesture. A gesture does not imply any kind of reaction, or make any promises about latency or reliability.

I bring this up now, as my Android phone (see, I’m not an Apple fanboy!) recently updated to the latest Android market, and this is a damn good example of bad HCI (and bad several other things too, but I want to focus my bitching). For those that have the application, open it up and check this out – there’s a carousel at the top of the application. You can drag this to scroll it, and when you release, it sort-of maintains your momentum and sets it spinning. Except there’s a problem (which is why I said sort-of) – when I drag it, there’s no relation between where my finger is and what’s under my finger. I’m not physically dragging the carousel, I’m performing a ‘drag gesture’. Similarly, when I perform a quick drag gesture and let go, there’s a small pause, and then the carousel starts spinning with the momentum I gave it – except it isn’t the momentum I gave it, it’s a similar, but not quite right, momentum. The list at the bottom of the application is better (due to it being a stock scrolling widget, I imagine), though not much, because they seem to do blocking I/O while you’re dragging, breaking the direct relation between your physical interaction and the on-screen response.
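For contrast, here’s a hedged sketch of what the direct-manipulation version of that carousel looks like (illustrative Python, not Android’s actual widget code): the offset is tied 1:1 to the finger, and the fling momentum is just the velocity you actually measured at the moment of release.

```python
# Illustrative sketch, not Android's implementation: direct manipulation
# means the point under your finger stays under your finger, and the
# fling keeps exactly the momentum you gave it.

def drag_offset(start_offset, touch_down_x, touch_move_x):
    # 1:1 mapping: the content moves exactly as far as the finger does.
    return start_offset + (touch_move_x - touch_down_x)

def release_velocity(samples):
    """Fling velocity (px/s) from the last two (time_s, x_px) samples."""
    (t0, x0), (t1, x1) = samples[-2], samples[-1]
    return (x1 - x0) / (t1 - t0)

# Finger lands at x=10 and moves to x=60: the content has moved 50px,
# no more, no less.
offset = drag_offset(0, 10, 60)
# The last two touch samples before release are the momentum the fling
# inherits -- not a similar-but-not-quite-right number computed later.
velocity = release_velocity([(0.0, 0), (0.5, 150)])
```

The blocking-I/O half of the complaint is the same principle: anything that stalls this per-frame calculation breaks the 1:1 mapping, so the slow work has to happen somewhere other than the input/render path.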

I don’t mean to pick on Android Market especially, as it’s something you can see in touch-based interfaces all over the place (Android is bad, but feature-phones are often far worse). But in my eyes, this sort of thing shouldn’t be acceptable. Apple proved that it isn’t that hard several years ago now – it’s not an innovation anymore, someone’s gone and done it – we can just copy them!

So, if you have an application that you expect to work on a touch-screen, or you’re planning on writing one, think first: “What physical analogy am I making here?” What common familiarity are you taking advantage of? And if your application involves taking advantage of the fact that most people are used to manipulating things with their hands, then do try to realise just how important making the feedback instant, reliable and logical is. Then realise that you must NOT call these physical interactions ‘gestures’.