Designing For The Google Assistant

As we move towards the age of conversational computing, basic daily tasks, whims and curiosities should become seamless to fulfill with natural language. The promise of conversational UI is that life can be simplified and friction removed from the interaction with technology. In this post, we’ll take a look at what designing for the Google Assistant (one of the most exciting conversational UIs on the market today) entails and the opportunities it brings.

The State of the Art

Hi, I’m Craig Pugsley – Creative Lead in the Product Research team here at Just Eat. As well as being a massive fan of everything beautiful, I’m also a huge tech freak and gadget fan. I love the innovation that’s been happening in the voice tech space in the last few years, so let’s start with a tiny introduction to the state of the art. Voice-based smart speakers have been around for years. Google brought the Google Home to UK shores back in April 2017, and its capabilities and hardware siblings have grown the Google Home family ever since. At CES 2018, Google also announced their new ‘smart display’ platform, powered by the Google Assistant, with hardware from a range of OEMs. Google wants users to experience the value of the Google Assistant, which powers devices such as Google Home, so that a quick ‘Hey Google’ is all you need to say to get something done.
Google call their platform ‘the Google Assistant’ and it’s showing up across almost every type of digital surface you can imagine – Android, iPhones, iPads, Nest Cams, Android Wear smart watches, Android Auto-powered in-car systems, wireless headphones, etc… Creating an ‘Action’ (an app for the Google Assistant) may unlock the door to some of these surfaces. Each offers a different degree of visual UI, but all share a strong core capability set you can build a compelling experience on.

Design for Differences

Designing for Conversational UIs can seem daunting at first, but many of the same principles and processes you’re used to using are still very appropriate. One of the biggest differences to consider is that the UI bandwidth you have to engage with your user is considerably smaller. If you’re used to the luxury of a sea of smartphone pixels, relying on copy (and maybe a set of pre-defined visual widgets) may seem like a frustratingly limited toolset to craft a user experience around.
This is, however, entirely the wrong way to look at designing for CUIs. Instead, what you have is a perfect opportunity to strip back superfluousness and focus on delivering real user value in a way that gets your users’ jobs done potentially more quickly and efficiently than any other platform.
Considering conversation UI bandwidth – and designing explicitly for it – is crucial. Voice-only surfaces like Google Home present a very different UI bandwidth channel than, for example, talking to the Google Assistant via the smartphone app. On a surface with a screen, the Google Assistant lets you augment context and decisions related to the conversation with elements shown on-screen. This makes conveying more complex information to your users so much easier, and you can introduce more sophisticated interactions when it comes to asking your users to make choices from a range of options.
Chatbots – and specifically apps for the Google Assistant – are the focus of this article.

Importance of Lean Experimentation

With the proliferation of new Assistant-connected surfaces and UI options, where do you begin to ensure a seamless user experience? The answer’s the same as for any platform! Design it user-centred, design it lean. Specifically, find where the user value lives. Here in the Just Eat Product Research team, we fundamentally believe in this lean, iterative, user-centred approach to product creation. For us, that means experimentation and a frank and honest approach to failure. Culturally, many of us struggle with experimentation, as it means we have to point at things we tried that didn’t work and justify the time we spent on them. How our managers, business owners, colleagues and peers react to our failures is deep-seated and cultural and will take years to adjust. But it’s only by having a progressive approach to failure and recognising the value of knowing where not to go that we can unlock true product innovation and discover real user value. The days of the HIPPO (Highest Paid Person in the Office) making the decisions have to end – building products based on our prejudice and assumptions isn’t great for anyone: it’s incoherent, wasteful and of low value to users, and it’s expensive for businesses. Sailing a massive tanker in the wrong direction, then trying to turn it when you realise you’re off-course, is slow, arduous, deeply unmotivating and unproductive.
The key to embracing failure and using experimentation to unlock innovation is asking the right questions early on: find the jobs that people are really trying to get done, then try as many solutions to these jobs as quickly as you can. And by ‘try’ I mean putting something in front of your real users that’s as close to the real thing as possible, in a setting that’s most natural to them and how they’re going to use your experience. When you find something that resonates with your users, that’s when you double-down.

Find The Key Use-Cases

Now, the reason I’m ranting about the art of product creation is because the ‘designing the right thing’ part of the classic Design Council ‘double diamond’ is absolutely crucial to building a great conversational UI. Finding the one or two core use cases from your native app that suit a conversation UI means your users are happier and better served, and you can focus on designing a few things really well. Voice and chat interfaces force you to re-think your flows. No longer can you display 20 pages of search results and ask your users to move through your 4-step funnel to goal completion. While you’re defining product scope, you need to continually ask yourself “why would they use this, rather than the native app?”. It’s only when you (and, more importantly, your users) can answer that question definitively that you know you’ve struck chatbot gold.

Getting Started With Design

OK, so you’ve used Design Thinking or some other lean method to identify the core use-cases you’re going to support. What next? Script-writing and rapid prototyping, that’s what! Designing a conversational flow should be done in three phases: 1) initial script & roleplaying to figure out the structure, 2) prototyping & fitting to interaction methods to make sure it works as a chatbot, and 3) final tweaking to add polish & delight.

Phase 1: Rough Script & Roleplaying

To get started with phase one, you need to quickly write the conversation out, like you were writing a screenplay with your conversational app and user as the actors. Don’t get too hung up on the exact words or options you’re giving to your users just yet. You won’t get it right first time, and you’ll be doing lots of iteration. Just make a start. Here are some tips for writing your first conversation script:

Become familiar with the seven core building blocks of the Google Assistant conversation. You can present simple unformatted text, cards with formatted text and images, animated gifs, buttons with onward action suggestions, lists, etc… Get to know these components, what they offer and their limitations. Then you’ll know the right one to use for each response. Don’t worry too much about this while writing your first script; just focus on the words, but be aware that copy isn’t the only way to communicate in the Google Assistant. Maybe jot down some notes in the margin as you’re writing the script.
Keep your responses short and conversational. Use contractions where possible (“won’t” rather than “will not”) and use slang and colloquialisms if it fits your brand.
Find your organisation’s tone of voice guidelines and read them. Keep their do’s and don’ts in mind while you’re choosing your words. In your copy, try to speak in the voice of your brand.
Make sure you end each response with a question. This gives the user a clear understanding of what they need to do next, and also ensures the user understands precisely when the microphone is listening.
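The last tip is easy to check mechanically once your script is written down. Here’s a minimal sketch in Python, assuming you keep your draft script as a simple mapping of turn name to response copy. The turn names, the copy and the helper functions are all made up for illustration; they aren’t part of any Google tooling.

```python
from typing import Dict, List

# Characters to ignore after the final question mark: whitespace and
# closing quotes/brackets. Emoji are not handled in this sketch.
_TRAILING = " \t\n\"'”’)]"


def ends_with_question(response: str) -> bool:
    """Return True if the response finishes on a question mark."""
    return response.rstrip(_TRAILING).endswith("?")


def flag_dead_ends(script: Dict[str, str]) -> List[str]:
    """Return the names of turns whose response does not end with a question."""
    return [name for name, response in script.items()
            if not ends_with_question(response)]


# Hypothetical draft script: turn name -> what the app says.
draft_script = {
    "welcome": "Hi! Fancy something to eat? What are you in the mood for?",
    "cuisine_chosen": "Great choice. I found three places nearby.",
    "confirm": "Shall I place the order?",
}

print(flag_dead_ends(draft_script))
```

Any turn this flags is worth a second look, although your closing turn will legitimately end without a question.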

Now you need to roleplay your script with a real human being! Find one, then get them to be your user, while you be your Google Assistant app. This will be weird. Don’t worry. Swallow that embarrassment, brace yourself and become your app! Speak the script as you’ve written it to hear what it sounds like. With this first pass, you’ll notice loads of words or phrases that could be tweaked, sound weird and robotic when you say them or just plain don’t work. Make adjustments and notes about what works as you test each turn of the scripted conversation. If you’re making lots of structural changes, find another human victim to test on so you get a fresh pair of ears 😉

Phase 2: Prototyping & User Testing

With your tweaked script you now need to move straight to phase two and create a prototype to allow you to test with real users. Creating a conversation in the Google Assistant is super-easy using Google’s own web-based WYSIWYG (what you say is what you get!) tooling called Dialogflow. I won’t explain how to use that here, as there are plenty of really great tutorials on the web. Basically, each state your app is in is called an ‘intent’. These intents are usually triggered by your user saying something that matches what your intent is listening for. When an intent is triggered, your app will respond with the copy you’ve written, you’ll ask a question, and the user will say something that triggers another intent. This goes on until they hit a ‘final’ intent and the chat closes. When you move to production and build your Google Assistant Action for real, Dialogflow will hand off to some code stored in the cloud that your engineers will write. The code will decide how to reply, and replace the copy you’ve put in your Dialogflow intent with whatever the real code says it should do.
Once you’ve got your prototype working, run through the flow using touch, typing and voice. Switch between modes (start with voice, move to touch, finish with typing) to test how that feels. Are you asking the right question at the end of a response? If the user taps to make a selection and interrupts your spoken response, is it still obvious what they should do next?
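If it helps to picture that intent loop, here’s a toy model of it in plain Python. To be clear, this is not how Dialogflow works under the hood and it doesn’t use any Dialogflow API: the real thing matches what users say with machine learning, whereas this sketch just looks for whole-word phrase matches. The `Intent` class, the intent names and the copy are all assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Intent:
    name: str
    training_phrases: List[str]  # things the user might say to trigger this intent
    response: str                # the copy the app replies with
    is_final: bool = False       # a final intent closes the conversation


# Hypothetical intents for a food-ordering prototype.
INTENTS = [
    Intent("welcome", ["hello", "hi"], "Hi! What do you fancy eating tonight?"),
    Intent("choose_cuisine", ["pizza", "curry"], "Nice. Shall I reorder your usual?"),
    Intent("confirm", ["yes", "go ahead"], "Done. Your order is on its way!", is_final=True),
]

FALLBACK = "Sorry, I didn't catch that. Could you say it again?"


def match_intent(utterance: str, intents: List[Intent]) -> Optional[Intent]:
    """Return the first intent with a training phrase found as whole words."""
    text = utterance.lower()
    for intent in intents:
        for phrase in intent.training_phrases:
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                return intent
    return None


def run_conversation(utterances: List[str]) -> List[str]:
    """Feed user utterances through the intents until a final intent is hit."""
    replies = []
    for utterance in utterances:
        intent = match_intent(utterance, INTENTS)
        if intent is None:
            replies.append(FALLBACK)
            continue
        replies.append(intent.response)
        if intent.is_final:
            break  # the chat closes here
    return replies


for reply in run_conversation(["hello", "I'd like a pizza", "yes please"]):
    print(reply)
```

Notice that every non-final response ends with a question, and that anything the intents don’t recognise gets a fallback that also asks a question.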
You can edit your written copy and spoken copy separately. So you could, for example, show more detail on screen than you actually say. You don’t know how your users will be interacting with your Action, but you need to design for voice-only as much as voice, touch and typing. Hide the screen and try your flow using only your voice. Does it make sense?
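One way to keep yourself honest about this is to write both versions of the copy side by side for every turn. The sketch below assumes a made-up `Turn` structure and made-up payload keys (`speech`, `display_text`); they aren’t the real Google Assistant field names, just a way to show the idea.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Turn:
    spoken: str     # what the Assistant says aloud; keep it short
    displayed: str  # what appears on screen; can carry more detail


def render(turn: Turn, has_screen: bool) -> Dict[str, str]:
    """Build a response payload for one surface.

    The key names are invented for this sketch, not a real API.
    """
    payload = {"speech": turn.spoken}
    if has_screen:
        payload["display_text"] = turn.displayed
    return payload


# Hypothetical turn from a food-ordering flow.
turn = Turn(
    spoken="I found three pizza places nearby. Want to hear them?",
    displayed=("3 pizza places within 1 mile, all delivering in under "
               "40 minutes. Want to hear them?"),
)

print(render(turn, has_screen=False))  # voice-only, e.g. a smart speaker
print(render(turn, has_screen=True))   # phone or smart display
```

The spoken copy has to stand on its own, because on a voice-only surface it’s all the user gets.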
Now test with real users! This should be done on the real Google Assistant platform. As you’re user testing, try to spend time probing your users’ mental models. User expectations of what your bot can do and the realities of your use-case might be very different. Pay close attention to how you’ve onboarded your users and set the scene, as this shapes what they believe your app can do. Make notes about the bits of your script that make people smile or frown – don’t forget, the bandwidth of this channel is way more limited than your apps or websites. You need to work really hard to squeeze a positive emotional response out of your users, but that’s what you should be aiming for! Find areas that feel emotionally flat as you’re testing, make notes of them and make sure you focus on those in the final phase.

Phase 3: Dialing It In

You made it! Your final phase now consists of fit & finish, final polish and amplifying delight. This means really spending time tweaking those words: does the response really need to be that long? Can I say the same thing in half the words? What’s a more exciting way of phrasing that? Could I use an emoji instead? Would an animated gif get the message across? It’s this phase where you can really exploit this new medium’s opportunities to create something delightful!
To finish, here are some top tips to remember when designing for the Google Assistant:

Be your brand. Express it in as few words as possible.
The Google Assistant supports animated gifs, sound effects and SSML (Speech Synthesis Markup Language, which controls how your copy is spoken). Exploit these features!
Know your Google Assistant widgets. Especially which ones can be used together. Always try to use suggestion chips & carousels, in case the user wants to tap rather than type or speak. You can only respond with a maximum of 2 items.
You can do a lot in Dialogflow, but it quickly becomes limited when you need logic / external data. However, this does create a nice division of labour between you and your engineers.
Make sure you end every response with a question.
People move between interaction modes (start with voice, may move to touch for selection). It’s very important you design for all modes at all points of your flow.
Be very clear about the use-cases that suit conversation UIs. Have honest conversations with your Product people.
Keep responses short – remember why people are using your chatbot over your app or website.
Try to user test in context (not in the lab, if you can help it). Even if you take the participant out for a walk!
Be sensitive to how the Google Assistant works on all platforms (Mobile, Home, Android Wear, Android Auto, etc…).
You can specify the first response in Dialogflow, then hand off to your engineers’ code for the next response. You can provide a load of first responses, and Dialogflow will pick one at random. This is great for making your bot feel more natural.
Where your responses are generated in code, try to get some variation into your responses. This will be difficult to argue for when you’re agreeing project scope, but it’ll be worth it for how much more natural your conversational app feels.
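To give your engineers something concrete when you argue for response variation, here’s a rough sketch of the idea in Python, combined with a little SSML. The variant copy and both helper functions are hypothetical. The `<speak>` and `<break>` elements are standard SSML, but check the platform docs for exactly what the Google Assistant supports.

```python
import random
import re
from typing import List, Optional
from xml.sax.saxutils import escape

# Hypothetical welcome variants; each one still ends with a question.
WELCOME_VARIANTS = [
    "Hey! What do you fancy tonight?",
    "Hello again. Hungry? What can I get you?",
    "Hi there! What are you in the mood for?",
]


def pick_variant(variants: List[str], rng: Optional[random.Random] = None) -> str:
    """Pick one response at random so repeat visits don't sound identical."""
    rng = rng or random.Random()
    return rng.choice(variants)


def to_ssml(text: str, pause_ms: int = 300) -> str:
    """Wrap copy in SSML, adding a short pause before the final sentence."""
    # Split off the last sentence: everything after the final ., ! or ?
    # that is followed by whitespace.
    match = re.match(r"^(.*[.!?])\s+(.+)$", text, re.S)
    if match is None:
        return "<speak>" + escape(text) + "</speak>"
    lead, last = match.groups()
    return ('<speak>' + escape(lead) + '<break time="' + str(pause_ms) + 'ms"/> '
            + escape(last) + '</speak>')


print(to_ssml(pick_variant(WELCOME_VARIANTS)))
```

Passing in a seeded `random.Random` makes the choice repeatable, which is handy when you want to demo or test a specific variant.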

Have a go at designing for the Google Assistant and let us know how you get on in the comments below!