CAROL CHENG

Voice Search Accessibility in VUI

Voice Assistants and conversational access to answers found in web tables

VUI RESEARCH

CASE STUDY

* This work was submitted to the CHI 2019 Late-Breaking Work track

ROLE: Researcher
DURATION: 3 Months | Oct 2018 - Dec 2018
TEAM: Maggie Chan, Shitian Wang, Zining Ye, Helen Hu
ADVISOR: Prof. John Zimmerman

CHALLENGE

It is difficult for blind users to access information placed in web tables

People who rely on a screen reader to browse the web often have great difficulty accessing information presented in tables. As a result, a great deal of information is effectively out of reach: everyday facts such as the price of a pair of shoes or a movie's showtime can take a long time to find.

Voice assistants provide a new way to ask questions whose answers can be found in tables on the web. This project set out to understand how well current voice assistants perform at this task and how conversational access might make the web more accessible to blind users.

RESEARCH - PHASE 1

How good are current voice assistants at answering questions?

Before diving into design or building a technical solution, we decided to first understand the current state of affairs.

We had previously collected approximately 2,500 questions from Amazon Mechanical Turk, asked by web users, whose answers can be found in web tables. We divided the dataset across five researchers and curated a set of 500 questions. Each of us then selected and tested 100 questions on Apple’s Siri, Microsoft’s Cortana, Amazon’s Alexa, and Google’s Google Home.
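We did the selection and split by hand, but the logic is easy to illustrate. Below is a minimal sketch in Python, not our actual pipeline: the file name questions.txt is hypothetical, and random sampling stands in for our manual curation.

    import random

    # Hypothetical input: one Turk-collected question per line (~2,500 total).
    with open("questions.txt", encoding="utf-8") as f:
        questions = [line.strip() for line in f if line.strip()]

    random.seed(42)                          # make the sample reproducible
    curated = random.sample(questions, 500)  # stand-in for manual curation

    # Split evenly across the five researchers: 100 questions each.
    researchers = ["R1", "R2", "R3", "R4", "R5"]
    assignments = {name: curated[i * 100:(i + 1) * 100]
                   for i, name in enumerate(researchers)}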

While testing questions on Google Home, I also ran each question through Google Search to check whether an answer template existed, whether the answer appeared in that template, and how many questions Google Home could not answer well even though the template included the answer.

WHAT WORKS

  • Google Home performed the best, answering 73% of the questions. Siri performed the worst, but still managed to answer almost 50% of the questions.
  • Only 10% of the 500 tested questions were ones that Google Home could not answer well even though a template answer existed.
  • We compared performance across each researcher’s set of 100 questions and confirmed that the systems performed about the same on every subset (the sketch below illustrates this tally).
(Figure: performance results of the current voice assistants)
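The tally itself is straightforward. Here is a minimal sketch, assuming each trial was logged as a (researcher, assistant, answered) record in a CSV file; the file name and column names are illustrative, not our actual analysis code.

    import pandas as pd

    # Hypothetical log: one row per (question, assistant) trial.
    # Columns: researcher, assistant, answered (1 if answered well, else 0).
    df = pd.read_csv("trials.csv")

    # Overall answer rate per assistant (e.g., Google Home ~73%, Siri ~50%).
    print(df.groupby("assistant")["answered"].mean().sort_values(ascending=False))

    # Answer rate per researcher's 100-question subset, to confirm the
    # systems performed about the same on every subset.
    print(df.pivot_table(index="researcher", columns="assistant",
                         values="answered", aggfunc="mean"))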

WHAT DOESN'T

Although we considered the results above fairly good, we identified several pitfalls in current voice assistants.

Beyond the problems we found in the voice interaction itself, we wanted to know more than just the numbers. At this stage we understood how well current voice assistants answered questions overall, but not which types of questions they handled well or poorly.

RESEARCH - PHASE 2

Which types of questions do voice assistants answer best and worst?

1. To gain insight into the kinds of questions the voice assistants could and could not answer, we first grouped the questions by topic category using an affinity diagram and evaluated scores for how well each group of questions was answered.

2. We then grouped the questions by question type, again using an affinity diagram and evaluating scores for how well each group was answered (a sketch of this aggregation follows below).
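The affinity diagramming itself was manual, but once every question carries a category label, a type label, and an answer-quality score, the evaluation reduces to grouped averages. A minimal sketch, with file and column names assumed for illustration:

    import pandas as pd

    # Hypothetical output of the affinity diagramming: one row per question.
    # Columns: category (e.g., "finance"), qtype (e.g., "what-single"),
    # and score (how well the question was answered).
    df = pd.read_csv("labeled_questions.csv")

    # Phase 2, step 1: average answer quality by topic category.
    print(df.groupby("category")["score"].mean().sort_values(ascending=False))

    # Phase 2, step 2: average answer quality by question type.
    print(df.groupby("qtype")["score"].mean().sort_values(ascending=False))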

FINDINGS

01 | By Categories

Based on the affinity diagram organized by topic category, we found that most voice assistants answered questions about finance, literature, and time/date well, while questions about policy/law and technology fared poorly.

02 | By Types

Based on the affinity diagram organized by question type, we observed that single-criterion “What” questions and “When” questions were answered well, while “How” questions and multi-criteria “What” questions were answered poorly.

FUTURE WORK

The ability of voice assistants to answer questions where the answer is placed in a web table was better than we expected. However, we do see several opportunities for voice assistants to improve.

Provide an overview of the information collected

Voice assistants do not solve the problem of web-table accessibility for people who use screen readers. They help a great deal when users know exactly what they want, but they do not help when people need to browse to gain an overview of the information available. There is still a need for innovation on this persistent challenge.

Enable answering multi-criteria questions

The ability to answer multi-criteria questions would greatly help visually impaired users. This could be supported by allowing users to ask complex questions (e.g., “What kinds of jeans does American Eagle have?” or “What is a good movie in theaters for Thanksgiving?”).

Understand comparison terms

Voice assistants could benefit screen reader users even more if they had a basic understanding of comparison terms such as “cheapest” or “fastest,” and of common entities found in tabular information, such as prices, sizes, and days and times (a toy sketch of this idea follows).
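As a toy illustration of what such an understanding could look like, the sketch below maps a comparison term onto an operation over a parsed table; the table contents, column names, and supported terms are all invented for the example.

    # Toy sketch: resolving comparison terms against a parsed web table.
    # The rows, columns, and supported terms are invented examples.
    rows = [
        {"name": "Slim jeans", "price": 39.95},
        {"name": "Bootcut jeans", "price": 49.95},
        {"name": "Relaxed jeans", "price": 29.95},
    ]

    COMPARISONS = {
        "cheapest": lambda rs: min(rs, key=lambda r: r["price"]),
        "most expensive": lambda rs: max(rs, key=lambda r: r["price"]),
    }

    def answer(term):
        row = COMPARISONS[term](rows)
        return "The %s option is %s at $%.2f." % (term, row["name"], row["price"])

    print(answer("cheapest"))  # The cheapest option is Relaxed jeans at $29.95.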

REFLECTION

Creating new technology is not always the solution

Before participating in this project, I had a misconception that current voice assistants do not function well enough, judging solely from my daily interactions with my own voice assistant. Whenever I was involved in a voice technology project, I always tried to develop something new and never thought of investigating and comparing the performance of the current voice assistants. One of the most important things I learned from this project is that developing new technology is not always the solution when the current state of the art is “good enough” to fulfill users’ needs. This does not mean we should be satisfied with where we are, but that we should focus on what exactly the problem is before jumping into any new technical solution.