Using Design Sprints for Data-Informed Design Improvements

Kat
Oct 23, 2018

Updated: Oct 30, 2018

home wireless troubleshooting phone app informs user they can connect to the faster 5 GHz network

Background Problem

A phone app to “troubleshoot your home wireless system” needs careful upfront definition, prioritization and scoping to handle the number and complexity of use cases. How will we handle the number and kinds of devices in the household? Do we address how the size of the home, and the materials in it - particularly metal and water - could interfere with reception? Do we address software and firmware issues? Did the home wireless user simply make an error - for example, mistyping a URL into the address bar - and then open our app, mistakenly attributing the “website not found” message to a home wireless issue?

It takes massive resources to design for all possible use cases, which is why usability experts often encounter content gaps in content-rich apps such as this one. Users expect content relevant to their precise situation, and when the app cannot provide it, the experience can be jarring. A common example from a different subject area is a search bar returning “no results found” after someone types a very specific query, because no one wrote content that directly relates to that query.

search engine wireframe with no results found - an explanation of content gaps in content-rich applications — I guess no one wrote an article about that problem

In our home wireless troubleshooting case, we knew we had gaps in content because we could not detect every issue a user might call a home wireless issue. Additionally, even if the issue was actually just user error, simply showing a generic message like “no wireless problems detected” when the app opens would be an unsatisfying flow for users who genuinely thought there was a problem with their wireless systems.

cell phone wireframes - left phone has a large number of wireless issues, right phone has none — Which side did we want to be on?

Yet, we knew we did not have the resources to provide extremely specific troubleshooting for all possible situations. How could we optimize the design for the main use cases, such as wireless interference, and still provide a safety net for edge cases, such as a phone app crashing or a forgotten email password?

Initial Proposal

We took inspiration from empty-state designs that add a call to action. For example, this empty news feed displays a relevant call-to-action in the empty-state:

cell phone wireframe with an empty newsfeed that suggests a call to action - find people to follow

Similarly, in the edge cases where the app does not detect wireless issues, we could still suggest default troubleshooting actions. We chose to create a user flow called “Report an Issue” that creates helpdesk tickets. Each ticket combined the results of automated connectivity tests, launched during the Report an Issue flow, with user input from a questionnaire. We could then (1) have a support agent provide timely help for that ticket and (2) later analyze aggregate user input and home wireless data to improve our product. (We could, for example, give the most commonly reported and important issues a more prominent space in the UI, or an earlier rollout date in our feature roadmap.)

We completed this project within an adapted Google Ventures design sprint. After jointly brainstorming and agreeing upon goals and requirements, the phone app designer laid out a preliminary flow and UI. I then conducted user research to iterate on the design.

wifi phone app wireframes - report an issue feature. Helps users input the wifi devices and activities that are experiencing issues

Research Questions

Our initial goals of making the flow worthwhile to users and well-designed for data collection gave me two related lines of inquiry. (1) Will users find it easy and worthwhile to answer the questions? and (2) Will they understand and provide useful answers to these questions?

To stay within my budget and timeline, I chose a hybrid qualitative and quantitative approach - a 3-person remote user test and a 200-person survey. I ran a think-aloud walk-through of the Report an Issue flow with three people to get a fuller picture of the “why” behind the survey answers. At the same time, I invited a few hundred people from Mechanical Turk to complete the Report an Issue questionnaire as a Google Forms survey.

(1) Will users find it easy and worthwhile to answer the questions? Do we have any basic gaps in usability and usefulness of the flow? For example, confusing wording and icons, and incorrect placement of buttons can interfere with otherwise well thought out user flows. Additionally, double-checking our work with fresh eyes lets us know if we forgot to include basic information or if we left out common answers to multiple choice questions. These oversights can frustrate users or even block them from completing their goal.

Results & Recommendations:

wireframes with multiple choice device selection before and after survey results showed the most frequently chosen wifi-enabled devices

I moved the devices and activities most commonly chosen in the Google Form survey to the top of their respective screens for easier access.

wireframes before and after survey results showed people selected multiple wireless enabled devices

I changed responses on “what activities are affected” and “what devices are affected” from single choice to multiple choice after seeing that a large number of people would choose multiple answers.

wireframes, user testing quotes, and survey results showing the decision to replace categorical emoticons with emoticons along a spectrum

I changed the emoticons from categorical choices (e.g. sad, tired, injured) to a spectrum from sad to happy. People chose only 3 of the original emoticons in the Google Form survey, and the think-aloud user testers said they loved the emoticon screen but didn’t understand what each face represented. So, some emoticons were wasting screen real estate, and responses along a single sad-to-happy scale are easier for both users and ourselves to interpret.

I added a phrase on the confirmation page with the estimated time to resolution, as requested by the think-aloud user testers.
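
The reordering and the switch to multiple choice both come down to simple tallies of the survey answers. Here is a minimal sketch of that kind of tally, assuming the survey export can be read as one list of selected options per respondent. The function names and the sample data are hypothetical and are not taken from the actual study.

```python
from collections import Counter

def rank_options(responses):
    """Return (option, count) pairs, most frequently selected first."""
    # Count each option at most once per respondent.
    counts = Counter(option for selected in responses for option in set(selected))
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def multi_select_share(responses):
    """Fraction of non-empty responses that picked more than one option."""
    answered = [r for r in responses if r]
    if not answered:
        return 0.0
    return sum(len(set(r)) > 1 for r in answered) / len(answered)

# Hypothetical answers to "what devices are affected".
sample_responses = [
    ["laptop", "phone"],
    ["phone"],
    ["smart TV", "phone", "laptop"],
    ["laptop"],
]

ranked = rank_options(sample_responses)       # candidates for the top of the screen
share = multi_select_share(sample_responses)  # how often one choice is not enough
```

A high multi-select share is the kind of large, obvious effect that argues for checkboxes over radio buttons without any further analysis.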

(2) Will users understand and provide useful answers to these questions? Before launching a full scale data collection feature, we needed to evaluate the usefulness and inclusiveness of pilot survey data. For example, if everyone misinterprets a certain question, or if the correct answer to a multiple choice question is not provided as an option, then our dataset will be flawed and no amount of post-processing can fix that.

diagram showing user input in the phone app builds a corpus for categorization learning of wifi issues, and routing of issues to the proper helpdesk — Ideal Use of Data Sources

A robust helpdesk ticket dataset - with descriptions of people’s issues and associated final resolutions - would help us tune our ticket categorization model, validate our machine learning problem detection algorithm, and associate natural-language labels such as “spotty wireless” with specific data signatures in our app’s connectivity test results. Ultimately, a rich enough dataset would also allow us to categorize issues and send tickets to the correct company’s most appropriate helpdesk division, drastically increasing efficiency and ease of support. (According to customer service representatives, this addresses a major pain point: customers calling in about issues outside the scope of support.)

A quick analysis of the pilot survey’s free responses showed that the most common 1-, 2- and 3-word phrases involved restarting the router. In example responses, wording and subject matter in sections that answered “what is the problem” varied far more than in “how did you solve the problem” sections. The long tail of less common phrases covered a large number of unique problems, while the bulk of the distribution covered a small number of troubleshooting actions.
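
Counting 1-, 2- and 3-word phrases like this takes only a few lines. The sketch below is one way to do it with the Python standard library. It is not the exact script used for the pilot survey, and the sample responses are made up for illustration.

```python
import re
from collections import Counter

def ngram_counts(texts, n):
    """Count every run of n consecutive words across a list of free-text responses."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep only word-like tokens so punctuation does not split counts.
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts

# Hypothetical free responses.
sample = [
    "I restarted the router and it worked",
    "Video kept buffering so I restarted the router",
    "Printer would not connect, restarted the router twice",
]

top_trigram = ngram_counts(sample, 3).most_common(1)
```

Calling `most_common` on the counts for n = 1, 2 and 3 surfaces the head of the distribution, and phrases that appear only once make up the long tail.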

Results & Recommendations:

diagram with wireframes and survey results shows that troubleshooting words were most frequent while problem descriptions words were less frequent. Splits wireframes from one into two screens.

Explicitly listing the few, predictable troubleshooting actions as multiple choice selections in the UI saves individual users time and effort typing, particularly on a phone. It also makes our analysis more easily interpretable. So, I suggested that we separate out a multiple choice “what common troubleshooting options have you already tried” checklist from the free response problem description. Specifically, two separate screens, titled “what seems to be the problem” and “which of the following steps have you taken so far,” would replace the original free response prompt. The multiple choice checklist would display the most common actions that home wireless users take before calling in - according to both support agents and home wireless users. We had this information from the survey data, combined with knowledge from previous interviews with support agents.

Aside from saving users time and effort, separating the troubleshooting from the free response also has an effect on our collected data. Ridding ourselves of the troubleshooting words reduces noise in the problem categorization data, because words about troubleshooting (e.g. “restart the router”) appear to be unrelated to those needed to categorize the problem (e.g. “firmware out of date,” “wireless interference,” or “not enough bandwidth”).
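
To illustrate the noise that troubleshooting words add, here is a rough sketch that pulls known troubleshooting phrases out of a mixed free response, leaving only the problem description. This is an illustration only: the design change described above separates the two in the UI rather than in post-processing, and the phrase list and function name are hypothetical.

```python
import re

# Hypothetical phrase list; a real one would come from survey data and agent interviews.
TROUBLESHOOTING_PHRASES = ["restarted the router", "restart the router", "unplugged the modem"]

def split_response(text, phrases=TROUBLESHOOTING_PHRASES):
    """Return (problem_text, steps_found) for one free-text response."""
    remaining = text.lower()
    steps_found = []
    for phrase in phrases:
        if phrase in remaining:
            steps_found.append(phrase)
            remaining = remaining.replace(phrase, " ")
    # Collapse the whitespace left behind by removed phrases.
    problem_text = re.sub(r"\s+", " ", remaining).strip()
    return problem_text, steps_found

problem, steps = split_response("Netflix keeps buffering even after I restarted the router")
```

Simple substring matching like this misses paraphrases and leaves stray connecting words behind, which is one more reason to collect the troubleshooting steps through a checklist instead.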

One last note: rather than add a new screen, another option would have been to only include a description of the problem and get rid of the troubleshooting section entirely. However, based upon my interviews with support agents and home wireless users, users only know of a few troubleshooting strategies to try, and they commonly get frustrated when they have to repeat their troubleshooting with an agent. We continue to collect troubleshooting information to address that pain point.

Conclusions

This quick analysis provided useful insights and easy-to-implement design changes that fit well into the time and monetary constraints of our design sprint. It also laid the foundation to later expand data collection and analysis capabilities. We added the changes to the flow in the demo build of the app, and the benefit provided by the feature resonated with our customers' experience of pain points in their own customer support call centers.

The design team often chose this design sprint rhythm to work through our new features efficiently. Qualitative analysis of the 'think-aloud' walk-throughs was generally straightforward, and we could easily source our potential users (people who use wifi at home). However, we did not always have access to our secondary users (customer service representatives) during our design sprints. Previous context-of-use interviews with customer service representatives therefore gave a fuller picture to primary users' comments and influenced my design recommendations as well. As a result, my research insights combined what I had learned during any given design sprint with previous work.

I was initially skeptical of attempting any quantitative analysis within such a short time period, because one could finish only the most basic of analyses - I was afraid I'd miss something important! However, that constraint strengthened my will to understand and search for only the largest of effects in the data. If the data doesn't hit you over the head with an effect, it's not important enough to worry about during the first iteration of a feature. Admittedly, this may not be the correct mindset for a larger company, but a scrappy start-up needs to go after the biggest uncertainties rather than getting lost in the weeds of the (beautiful, beautiful) data analysis.

Kat Snyder | UX Research | Palo Alto, CA
