DURATION: 3 Months | Oct 2018 - Dec 2018
TEAM: Maggie Chan, Shitian Wang, Zining Ye, Helen Hu
ADVISOR: Prof. John Zimmerman
People who rely on a screen reader to browse the web often have trouble accessing information placed in tables. A web table is a grid of structured data on a page, such as a product listing or a showtime schedule. Because screen readers move through a table cell by cell, even simple facts like the price of a pair of shoes or the time a movie starts can take a long time to reach.
Voice assistants provide a new way to ask questions whose answers can be found in tables on the web. This project set out to understand how well current voice assistants perform at this task, and how conversational access might make the web more accessible to blind users.
Before diving into design and building a technical solution, we decided to first understand the current state of the art.
Previously, we had collected approximately 2,500 questions from web users on Amazon Mechanical Turk, each with an answer that can be found in a web table. We divided the dataset across five researchers and curated a set of 500 questions. Each of us selected and tested 100 questions using Apple’s Siri, Microsoft’s Cortana, Amazon’s Alexa, and Google’s Google Home.
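The curation step above can be sketched roughly as follows. This is an illustrative sketch only: the function name, the fixed seed, and the stand-in question pool are hypothetical, not taken from the study.

```python
import random

# Illustrative sketch: curate a 500-question sample from a larger pool
# and split it into 100-question batches, one per researcher.
# The pool contents below are placeholders, not the actual MTurk data.

ASSISTANTS = ["Siri", "Cortana", "Alexa", "Google Home"]

def assign_batches(questions, n_researchers=5, batch_size=100, seed=0):
    """Sample n_researchers * batch_size questions and split them evenly."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    sample = rng.sample(questions, n_researchers * batch_size)
    return [sample[i * batch_size:(i + 1) * batch_size]
            for i in range(n_researchers)]

pool = [f"question {i}" for i in range(2500)]  # stand-in for the MTurk pool
batches = assign_batches(pool)
```

Each researcher would then test their batch against all four assistants, so every curated question is tried on every assistant.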
While testing questions on Google Home, I also ran each question through Google Search to see whether a result template existed, whether the answer appeared in that template, and how many questions Google Home answered poorly even though the template contained the answer.
Although we considered the results above fairly good, we identified several pitfalls in current voice assistants:
Beyond the problems we found with the voice interaction itself, we wanted to know more than just the numbers. At this stage we understood how well current voice assistants answered questions overall, but not which types of questions they handled well or poorly.
1. To gain insight into the kinds of questions the voice assistants could and could not answer, we first grouped questions by topic category using an affinity diagram and evaluated how well the questions in each category were answered.
2. We then grouped questions by question type using the same approach, again evaluating how well the questions of each type were answered.
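A minimal sketch of this scoring step, assuming each tested question carries a topic category, a question type, and a 0–1 answer-quality score. The records, field names, and scores below are illustrative placeholders, not the study's actual data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: one per tested question, with a 0-1 quality score.
records = [
    {"category": "finance",    "qtype": "what-single", "score": 0.9},
    {"category": "finance",    "qtype": "when",        "score": 0.8},
    {"category": "policy/law", "qtype": "how",         "score": 0.2},
    {"category": "tech",       "qtype": "what-multi",  "score": 0.3},
]

def mean_score_by(records, key):
    """Average answer-quality score per group (e.g. per category or type)."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r["score"])
    return {k: mean(v) for k, v in groups.items()}

by_category = mean_score_by(records, "category")  # scores per topic category
by_type = mean_score_by(records, "qtype")         # scores per question type
```

Running the same aggregation twice with a different key is what lets the two affinity-diagram views below (by category, by type) share one underlying dataset.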
01 | By Categories
Based on the affinity diagram grouped by topic category, we noticed that questions related to finance, literature, and time/date were answered well by most voice assistants, while questions related to policy/law and tech were answered poorly.
02 | By Types
Based on the affinity diagram grouped by question type, we observed that single-criterion “What” questions and “When” questions were answered well, while “How” questions and multi-criteria “What” questions were answered poorly.
The ability of voice assistants to answer questions where the answer is placed in a web table was better than we expected. However, we do see several opportunities for voice assistants to improve.
Voice assistants do not solve the problem of web table accessibility for people who use screen readers. They help when users already know what they want, but not when people need to browse to gain an overview of the information provided. There is still a need for innovation on this persistent challenge.
The ability to answer multi-criteria questions would greatly help visually impaired users. This would mean allowing users to ask complex questions (e.g. “What kinds of jeans does American Eagle have?” or “What is a good movie in theaters for Thanksgiving?”).
Voice assistants could benefit screen reader users more if they had a basic understanding of comparison terms such as “cheapest” or “fastest,” and of common entities found in tabular information, such as prices, sizes, and days and times.
Before participating in this project, I had the misconception that current voice assistants do not function well enough, judging simply from daily interaction with my own voice assistant. Whenever I was involved in a voice technology project, I always tried to develop something new and never thought of investigating and comparing the performance of the existing voice assistants. One of the most important things I learned from this project is that developing new technology is not always the solution when the current state of the art is “good enough” to fulfill users’ needs. This doesn’t mean we should be satisfied with where we are, but that we should pin down exactly what problem we are solving before jumping into any new technical solution.