Lately, everyone’s talking about “conversational UI.” It’s the next big thing. But the more articles I read on the topic, the more annoyed I get. It’s taken me so long to figure out why!
Conversations, writes WIRED, can do things traditional GUIs can’t. Matt Hartman describes the surge in text-driven apps as a kind of “hidden homescreen”. TechCrunch says “forget apps, now bots take over”. The creator of Fin thinks it’s a new paradigm all apps will move to. Dharmesh Shah wonders whether the rise of conversational UI will be the downfall of designers. Design, says Emmet Connolly at Intercom, is a conversation.
Benedict Evans prophesied that the new lay of the land is “all messaging expands until it includes software.”
“People don’t want apps for every single business that you interact with,” says David Marcus, head of Facebook Messenger, “…just have a message within a nicely designed bubble … [that’s a] much nicer experience than an app.” Under his charge, Facebook Messenger has tested this approach, building integrations with high-profile partners as well as opening up a bot API.
We’ve even seen avant-garde attempts at taking this idea to its extreme, like Quartz’s latest app, which presents the news as a conversation, or the game Lifeline. Apps like Mailtime even promise to save us from our emails by turning them into chats.
I guess I might be partially to blame for this, with a few pieces citing a section in a 2014 piece of mine that I literally titled “Chats as Universal UI.”
This recent “bot-mania” is at the confluence of two separate trends. One is agent AIs steadily getting better, as evidenced by Siri and Alexa being things people actually use rather than gimmicks. The other is that the US somehow still hasn’t got a dominant messaging app and Silicon Valley is trying to learn from the success of Asian messenger apps. This involves a peculiar fixation on how these apps, particularly WeChat, incorporate all sorts of functionality seemingly unrelated to messaging. They come away surprised by just how many differently-shaped pegs fit into this seemingly oddly-shaped hole. The thesis, then, is that users will engage more frequently, deeply, and efficiently with third-party services if they’re presented in a conversational UI instead of a separate native app.
It’s that part which, having spent the past two years in my current job eating and breathing messaging, strikes me as a major misattribution of what makes chat apps work and what problems they’re best at solving.
As I’ll explain, messenger apps’ apparent success in fulfilling such a surprising array of tasks does not owe to the triumph of “conversational UI.” What they’ve achieved can be much more instructively framed as an adept exploitation of Silicon Valley phone OS makers’ growing failure to fully serve users’ needs, particularly in other parts of the world. Chat apps have responded by evolving into “meta-platforms.” Many of the platform-like aspects they’ve taken on to plaster over gaps in the OS actually have little to do with the core chat functionality. Not only is “conversational UI” a red herring, but as we look more closely, we’ll even see places where conversational UI has breached its limits and broken down.
But first, let’s retrace how this state of affairs really came about in the first place.
Note: The opinions expressed here are purely my own and do not reflect those of my employer.
A BRIEF HISTORY OF THE CHAT BUBBLE
We’ll begin by taking a closer look at the apparent atomic unit of the “conversational UI”, our friend the message bubble. To do that, we’re going to go back in time a bit. Let’s take a stroll to, oh, about 2003.
In those days, sending a quick text meant dealing with a UI that looked like this:
In many phones’ UIs, SMSes were treated like mini-emails, often complete with an inbox, outbox, and drafts. So fussy!
Later, some time in the last decade, perhaps owing to a prototype by Jens Alfke, our IMs began taking on their familiar appearance as cartoon dialog bubbles. When smartphones took off later, it was a natural fit for the system SMS apps on the first versions of iOS and Android.
Soon after smartphones launched, those default SMS apps were eclipsed instantly by third-party messaging apps emerging in Europe and Asia (in the US, we have somehow still clung to SMS). They had started as direct clones of the system SMS apps — the only difference being that messages were counted against one’s data quota instead of the stingy and arbitrary SMS allotment given by carriers.
These apps that came along initially to replace SMS have styled the message bubble every way imaginable: round and square, flat and puffy, green and blue. Free from the constraints of a 20-year-old protocol, these apps evolved, taking on more features. The bubbles displayed in these apps developed a number of affordances for new features like read receipts, names in group chats, and more. New kinds of bubbles emerged to accommodate new types of content these apps supported:
The app I’ve been working on really takes the cake for this. WeChat’s got bubbles for text, voice messages, big videos, l’il “Sight” videos, full-width cards with hero shots for news headlines, bubbles for payments, files, links, locations, and contact cards. Mucking through some code once, I saw definitions for nearly 100 types of supported messages, most I’d never seen in actual use.
Aside from supporting so many different types of messages, another advance WeChat made was realizing a messaging app needed different types of accounts as well. They’d seen brands and celebrities registering personal accounts and making series of giant group chats to invite their fans into. There had to be a better way! Thus was born Official Accounts.
Here’s what one of the first accounts, China Southern Airlines, looked like when the feature launched in 2012:
Yeah…this bot ain’t exactly HAL 9000.
Here’s what the account for my city’s subway system looked like:
Why was the user asked to enter numbers, as if on an IVR system? Were the creators of these accounts so blind to the possibilities of a new medium that they simply replicated their old-school hotline?
Actually, no! In fact, keywords could be defined, and messages could even be routed through the third party’s server to formulate a response using whatever method it pleased. Yet in this case, entering keywords or more complex queries in Chinese (or god forbid, formulating a complete sentence) would be even worse. At the time, typing in numbers really was the best UI choice given the constraints.
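To make the mechanics concrete, here is a rough sketch of how a third party's server might answer such messages: numbered options first, keywords as a fallback, and the menu again for anything else. Everything in it is hypothetical. The names MENU, KEYWORDS, menu_prompt, and reply are mine for illustration, the menu entries are invented, and none of it reflects WeChat's actual API or any real account's logic.

```python
# Hypothetical sketch of a numeric-menu auto-reply.
# All names and menu entries are illustrative assumptions, not a real API.

# Each numbered option maps to a (label, response) pair.
MENU = {
    "1": ("Flight status", "Reply with your flight number."),
    "2": ("Check-in", "Reply with your booking reference."),
    "3": ("Baggage", "Reply with your baggage tag number."),
}

# Optional keywords that point at the same numbered options.
KEYWORDS = {
    "status": "1",
    "check-in": "2",
    "baggage": "3",
}


def menu_prompt() -> str:
    """Build the prompt that lists every numbered option."""
    lines = ["Reply with a number:"]
    lines += [f"{key}. {label}" for key, (label, _) in MENU.items()]
    return "\n".join(lines)


def reply(message: str) -> str:
    """Pick a response for one incoming text message."""
    text = message.strip().lower()
    # A bare number selects an option directly.
    if text in MENU:
        return MENU[text][1]
    # Otherwise look for a known keyword anywhere in the message.
    for keyword, key in KEYWORDS.items():
        if keyword in text:
            return MENU[key][1]
    # Anything unrecognized gets the menu again.
    return menu_prompt()
```

Calling reply("2") returns the check-in response, while an unrecognized message such as reply("hello") falls back to the numbered menu. Note that the numeric path needs no text parsing at all, which is roughly why it held up where free-form input did not.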
Critically, these experiences were still often preferable to downloading a separate app on a data plan or spotty WiFi connection, or having to call someone’s customer service hotline and wait on hold. The Official Account platform was a rousing success; there are over 8 million of these accounts today. As it took off, the APIs offered to third parties to build their accounts expanded to accommodate a growing array of use cases and demands.
Some of these new APIs deepened and enabled new possibilities within the “conversational” nature of these interactions. Voice messages were transcribed via speech recognition before being sent to the OA’s server. Objects could be recognized in pictures. Advanced natural language processing could even extract named entities and certain types of queries from text sent by users. Users could be patched in to agents at service centers to carry on a conversation exactly as they would with a friend in the app. There was even a special integration whereby I can select a message in a chat and forward it to Evernote’s Official Account (as I would to a friend) to save it to a note. Cute, right?
On the other hand, far greater and more successful were the enhancements made running counter or orthogonal to the idea of conversational UI.
One affordance added right off the bat was the three-tabbed fixed menu. Now accounts could offer fast access to all their features without having to send a prompt or depend on state information. Here’s what the menu looks like today on the Guangzhou Metro’s main official account: