Deloitte Insights delivers proprietary research designed to help organizations turn their aspirations into action.

DELOITTE INSIGHTS

A conversational journey

by Vatatmaja, Jyotirmay Gadewadikar, Sherry Comes, Timothy Murphy
8 minute read 22 November 2019

How the three Ts of conversational AI build better voice assistants

  • Vatatmaja United States
  • Jyotirmay Gadewadikar United States
  • Sherry Comes United States
  • Timothy Murphy United States
  • Training the voice assistant: Matching human need with AI capabilities
  • Testing the voice assistant: Uncovering the many dimensions of...
  • Tuning for humans: Making conversations flow naturally
  • The ever-improving assistant

The popularity of voice assistants is on the rise, but learning to design an intuitive and effective model is still a work in progress. How can organizations use the three Ts—training, testing, and tuning—to create more human-like voice assistants?

In the first article of our conversational AI series, we explored how the proliferation of voice assistants and messaging platforms is ushering in a new era of user interfaces (see the sidebar, “A five-part series on conversational AI”). Whether it’s in the car, on a phone, or through a smart home device, nearly 112 million US consumers rely on their voice assistants at least once a month, and that number continues to grow.1

Yet the popularity of voice assistants isn’t without its growing pains. These can range from the mundane, such as misinterpreting a request to order a roll of paper towels, to the more troubling error of providing a harmful health recommendation (or, conversely, providing an accurate but difficult-to-interpret recommendation).2 Despite the uptick in adoption of voice-enabled virtual assistants, designing effective products is a nontrivial endeavor. Virtual assistants often deal with multiple, sometimes complex scenarios that require understanding a range of queries to which users expect a quick, accurate, and easily interpretable response.

In our experience, designing an intuitive and effective voice assistant is not as straightforward as combining structured and unstructured data with powerful AI capabilities such as natural language processing (NLP) and machine learning. Instead, virtual voice assistants require designers to match their technical capabilities and resources with human intuition and oversight. Voice assistant design is both art and science. This means incorporating sociological and geographical factors (such as accounting for regional accents), and simultaneously ensuring these voice assistants are properly calibrated to deliver messages in a conversational manner (e.g., proper tone and tenor). In this article, we explore “three Ts” of designing dynamic and flexible voice assistants: training, testing, and tuning.

A five-part series on conversational AI

Over the next year, we will discuss the implications and use cases of conversational AI. In this chapter, we discuss the three Ts of developing effective voice assistants. In the other chapters, we leverage secondary research and case studies to explore the following topics:

Conversational AI makes its business case: The initial chapter of this series breaks down what constitutes conversational AI and the myriad ways companies can leverage its capabilities.

Acoustic authentication: Explains how conversational systems can enhance security protocols by integrating voice into the multiauthentication process.

Industry use cases: Highlights how virtual assistants appear to be changing the face of customer service in banking, technology, and health care.

The liability of conversational systems: Explores how the more we integrate conversational bots into our work and lives, the more we should take steps to understand their liability in terms of insurance, training, auditing, and the ethical implications.

Training the voice assistant: Matching human need with AI capabilities

There’s a paradox to designing voice assistants. While these assistants are underpinned by advanced AI and NLP capabilities, AI is only “smart” in a very narrow sense—that is, it is most effective at solving well-defined problems.3 But consider the nature of a conversation: It’s free-flowing, words and turns of phrase can take on multiple meanings based on context and tone, and at a moment’s notice, we can jump from one topic to another. So how do designers marry an expansive need, conversational interaction, with a traditionally narrow solution?

Human-assisted trainers. A common misperception is that voice assistants need to be everything to everyone. In practice, most are asked to perform relatively specific tasks, such as responding to routine call center issues or helping people select an artist from their music library. With this in mind, designers can benefit from working directly with stakeholders to identify requirements and goals. At its core, this means solving well-defined problems that are easily tied to productivity measures (e.g., an airport voice assistant can be measured by how quickly and accurately it resolves customer queries).

In some of our earlier research, we found some of the best systems are designed directly with the communities that will interact with the AI solutions.4 That is, they benefit from making the human the focal point of the design process (also referred to as keeping the “human in the middle”). In the call center example, this means working with and observing how call center employees interact with customers. What are the routine inquiries? Are there more complex asks that trip employees up? When does confusion arise between employees and customers?

Understanding these common challenges empowers designers to map a high-level process flow of the call fulfillment process. As demonstrated in figure 1, these mappings create the underlying foundation for recording and organizing calls into a manageable data set populated with keywords and phrases.

Figure 1. Transcribing call center conversations for model training

Admittedly, figure 1 simplifies the data structure. But once designers can properly categorize these conversations, millions of recorded conversations can be translated into text and processed through mappings similar to this example.
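To make the idea of a keyword mapping concrete, here is a minimal Python sketch. It is not the structure shown in figure 1: the process-flow steps, the keyword lists, and the `categorize_call` helper are all hypothetical, and a real mapping would come from observing how call center employees handle calls.

```python
from collections import Counter

# Hypothetical process-flow steps and keyword lists (assumptions for
# illustration only; not taken from figure 1).
PROCESS_FLOW = {
    "billing": ["invoice", "charge", "refund", "payment"],
    "account_access": ["password", "login", "locked out", "reset"],
    "shipping": ["delivery", "tracking", "package", "late"],
}


def categorize_call(transcript: str) -> str:
    """Return the process-flow step whose keywords appear most often.

    Matching is plain substring counting, so "charge" also matches
    "charged". Calls with no keyword hits are left uncategorized.
    """
    text = transcript.lower()
    hits = Counter()
    for step, keywords in PROCESS_FLOW.items():
        hits[step] = sum(text.count(keyword) for keyword in keywords)
    step, count = hits.most_common(1)[0]
    return step if count > 0 else "uncategorized"


calls = [
    "I was charged twice and need a refund on my invoice.",
    "I'm locked out and the password reset email never arrived.",
    "Can you tell me about your holiday hours?",
]
categories = [categorize_call(call) for call in calls]
```

Calls that land in the uncategorized bucket are a natural starting point for the human review described in the next section, since they may point to a step missing from the process flow.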

Training the right data for your AI solution. After designers map the high-level process flow, numerous data sources are processed to train the voice assistants. This starts with transcribing voice data to text and parsing it into “human utterances.” These utterances are segments of speech separated by pauses in conversation, and they range from single words to clauses to complete sentences. As seen in figure 1, utterances can be structured into business issues and resolutions.
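As a rough illustration of parsing a transcript into utterances, the Python sketch below splits a stream of words wherever the silence between two words exceeds a threshold. The `(text, start, end)` word format and the 0.7-second threshold are assumptions; actual speech-to-text output and a sensible pause length will vary by system and by speaker.

```python
from typing import List, Optional, Tuple

# Assumed input format: (text, start_seconds, end_seconds) for each word.
Word = Tuple[str, float, float]


def split_into_utterances(words: List[Word], max_gap: float = 0.7) -> List[str]:
    """Group words into utterances, starting a new utterance whenever the
    silence since the previous word exceeds max_gap seconds."""
    utterances: List[List[str]] = []
    previous_end: Optional[float] = None
    for text, start, end in words:
        if previous_end is None or start - previous_end > max_gap:
            utterances.append([])  # a long pause opens a new utterance
        utterances[-1].append(text)
        previous_end = end
    return [" ".join(chunk) for chunk in utterances]


words = [
    ("hello", 0.0, 0.4),
    ("my", 1.5, 1.6), ("card", 1.7, 2.0), ("was", 2.1, 2.2), ("declined", 2.3, 2.9),
    ("can", 4.0, 4.2), ("you", 4.3, 4.4), ("help", 4.5, 4.8),
]
utterances = split_into_utterances(words)
```

In this example the two long pauses produce three utterances: a single word, a statement of the issue, and a request for help.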

After transforming the unstructured text into structured utterances, machine learning techniques, such as clustering analysis, create fine-grained groupings within the data to uncover common patterns in the conversation. At this point, supervised algorithms provide confidence scores that subject matter experts can validate and, when appropriate, use to correct machine learning conclusions. Taken together, putting humans in the middle, coupled with machine learning, creates foundational insights that inform these prospective voice assistants.
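The Python sketch below illustrates both steps in miniature. It is a simplified stand-in, not a production method: a greedy word-overlap (Jaccard) grouping takes the place of a real clustering algorithm, and the `triage` function assumes a classifier has already produced `(utterance, label, confidence)` predictions. The thresholds and sample data are hypothetical.

```python
from typing import List, Set, Tuple


def tokens(utterance: str) -> Set[str]:
    return set(utterance.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of words two utterances have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_utterances(utterances: List[str], threshold: float = 0.3) -> List[List[str]]:
    """Greedy single-pass grouping: add each utterance to the first group
    whose first member is similar enough, otherwise start a new group."""
    groups: List[List[str]] = []
    for utterance in utterances:
        for group in groups:
            if jaccard(tokens(utterance), tokens(group[0])) >= threshold:
                group.append(utterance)
                break
        else:
            groups.append([utterance])
    return groups


Prediction = Tuple[str, str, float]  # (utterance, predicted label, confidence)


def triage(predictions: List[Prediction], min_confidence: float = 0.8):
    """Accept confident predictions; route the rest to subject matter experts."""
    accepted, needs_review = [], []
    for utterance, label, confidence in predictions:
        target = accepted if confidence >= min_confidence else needs_review
        target.append((utterance, label))
    return accepted, needs_review


groups = group_utterances([
    "reset my password",
    "please reset my password",
    "track my package",
    "where is my package",
])
accepted, needs_review = triage([
    ("reset my password", "account_access", 0.95),
    ("track my package", "shipping", 0.91),
    ("my thing is broken", "billing", 0.42),
])
```

The low-confidence prediction is the kind of case a subject matter expert would validate or correct before it feeds back into training.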

Testing the voice assistant: Uncovering the many dimensions of “accuracy”

Testing a conversational system, such as a voice assistant, involves more than ensuring that business issues are correctly mapped to resolutions. As many of us know from our own experiences, one-to-one conversations can easily be misinterpreted. If we aren’t familiar with an accent, we may misunderstand a question; if we are speaking to someone from a different geographical location, words can take on different meanings (for instance, “chaps” can refer to friends or to something a cowboy wears). Conversational systems are no different, except that, unlike us, they lack the ability to understand context.

For these reasons, designers should build quality assurance metrics that stress-test their models across a number of user personas, including:

  • Variations in geography. As in the examples above, this consists of validating that the system can accurately interpret accents and understand keyword meanings in context across groups. Taking this further, it may mean testing the model across multiple languages.
  • Historical contexts. The models typically work best when they incorporate past conversations. If a prior resolution didn’t properly address an issue, then it can come off as tone-deaf if the model recommends the same solution again.
  • Adaptable to real-life situations. Voice assistants benefit from testing in real-world situations. For instance, can the voice assistant cut through the background noise of the morning commute on the subway?
  • Behavioral modeling. How we say something affects how it is received, and conversational systems don’t naturally have good bedside manners (e.g., telling someone they have a low balance on their checking account probably benefits from a delicate delivery). It’s on the designer to ensure responses are delivered in a natural and pleasant manner that users will be open to accepting.

All four dimensions show the importance of uncovering and accounting for implicit bias. If the algorithm doesn’t understand a specific accent, it may have been trained on a biased data set. In this case, the designers should work back to the training data to create a more inclusive design. Fortunately, the testing process can help bring these issues to light.
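One simple way to surface such gaps is to break test accuracy out by persona and flag any group that trails the best-performing one. The Python sketch below assumes test results are available as `(persona, expected, predicted)` records. The persona names, sample results, and 10-point gap threshold are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed record format: (persona, expected intent, predicted intent).
Result = Tuple[str, str, str]


def accuracy_by_persona(results: List[Result]) -> Dict[str, float]:
    """Share of correctly interpreted requests for each persona."""
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for persona, expected, predicted in results:
        totals[persona] += 1
        correct[persona] += int(expected == predicted)
    return {persona: correct[persona] / totals[persona] for persona in totals}


def flag_underperforming(scores: Dict[str, float], max_gap: float = 0.1) -> List[str]:
    """Personas whose accuracy trails the best persona by more than max_gap."""
    best = max(scores.values())
    return sorted(p for p, score in scores.items() if best - score > max_gap)


results = [
    ("us_midwest", "balance", "balance"), ("us_midwest", "transfer", "transfer"),
    ("us_midwest", "hours", "hours"), ("us_midwest", "fraud", "fraud"),
    ("scottish_accent", "balance", "balance"), ("scottish_accent", "transfer", "hours"),
    ("scottish_accent", "hours", "hours"), ("scottish_accent", "fraud", "balance"),
    ("subway_noise", "balance", "balance"), ("subway_noise", "transfer", "transfer"),
    ("subway_noise", "hours", "fraud"), ("subway_noise", "fraud", "fraud"),
]
scores = accuracy_by_persona(results)
flagged = flag_underperforming(scores)
```

A flagged persona is a prompt to revisit the training data for that group rather than a verdict on its own, since small test sets can exaggerate differences.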

Tuning for humans: Making conversations flow naturally

Voice assistants do not have to pass as humans, but they should be able to communicate in a pleasant and interpretable manner. In this spirit, designers can improve upon their voice assistants by tuning their models with a more natural delivery. Tuning a voice assistant includes:

  • Pronunciation. Designers should build a pronunciation dictionary that standardizes the speech of recurring words.5 This reinforces the importance of focusing the goal of each voice assistant design to keep the universe of words manageable.
  • Pauses. How pauses are deployed, both in their placement and duration, influences how natural a conversation sounds.6
  • Pitch and pace. Since many languages, such as English, are non-tonal, the pitch and pace of words and sentences often convey a speaker’s feelings.7 For instance, a rising intonation in the middle of a sentence indicates a speaker isn’t done talking, even if it’s followed by a pause. Further, a fast speaking pace can convey excitement, while a slower pace may convey a more relaxed feel.

These natural changes in prosody work in concert to make conversations more natural and inviting. And with the help of virtual assistants, designers can deliver helpful conversations at scale.
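Many text-to-speech engines accept Speech Synthesis Markup Language (SSML), a W3C standard whose `sub`, `break`, and `prosody` elements map roughly onto pronunciation, pauses, and pitch and pace. The Python sketch below assembles a small SSML string. The pronunciation dictionary, default values, and `to_ssml` helper are hypothetical, and which elements and attribute values a given engine supports should be checked against its documentation.

```python
from typing import List
from xml.sax.saxutils import escape, quoteattr

# Hypothetical pronunciation dictionary: written form -> how to say it.
PRONUNCIATIONS = {"APR": "A P R", "IRA": "I R A"}


def to_ssml(sentences: List[str], rate: str = "medium",
            pitch: str = "medium", pause_ms: int = 300) -> str:
    """Build an SSML string with pronunciation substitutions, a fixed pause
    between sentences, and one pitch and pace setting for the whole reply."""
    parts = []
    for sentence in sentences:
        spoken = []
        for word in sentence.split():
            core = word.rstrip(".,!?")   # separate trailing punctuation
            tail = word[len(core):]
            if core in PRONUNCIATIONS:
                alias = quoteattr(PRONUNCIATIONS[core])
                spoken.append(f"<sub alias={alias}>{escape(core)}</sub>{tail}")
            else:
                spoken.append(escape(word))
        parts.append(" ".join(spoken))
    pause = f'<break time="{pause_ms}ms"/>'
    body = pause.join(parts)
    return f'<speak><prosody rate="{rate}" pitch="{pitch}">{body}</prosody></speak>'


ssml = to_ssml(["Your APR is 4 percent.", "Anything else?"], rate="slow")
```

Keeping these settings in one place makes it easier to adjust pauses or pace after user testing without touching the response text itself.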

The ever-improving assistant

Building an accurate and natural voice assistant is an iterative process. While it starts with training, it doesn’t simply move on to testing and end with tuning. Instead, each part of the process builds and iterates on the others. Implicit biases can occur during training, but testing can help designers uncover and address these biases; and if pauses are inappropriate, then the training data should be restructured to properly account for these natural breaks in conversation.

When designing your own voice assistants, remember:

  1. The business objective should dictate the design.
  2. Training, testing, and tuning form a dynamic process, with each step informing the others.
  3. The work is never done. This is an iterative process, where each version informs and improves upon the next release.

By establishing a well-articulated goal, designers can continually improve upon their voice assistants to sound a bit more human with each iteration.

Acknowledgments

The authors would like to thank Scott Pobiner of Deloitte Consulting LLP for his contributions to this series.

Cover image by: Neil Webb

Endnotes
    1. Victoria Petrock, “US voice assistants users 2019: Who, what, when, where and why,” eMarketer, July 15, 2019.

    2. Lauren Goode, “Your voice assistant may be getting smarter, but it’s still awkward,” Wired, December 27, 2018.

    3. Jim Guszcza, “Smarter together: Why artificial intelligence needs human-centered design,” Deloitte Review 22, January 22, 2018.

    4. Dr. Scott Pobiner and Timothy Murphy, From smart products to smart systems: The importance of participatory design in the age of artificial intelligence, Deloitte Insights, December 11, 2018.

    5. Pearl, 2016.

    6. Cohen, 2004.

    7. Reedy, 2015.


Vatatmaja


Vatatmaja is a specialist leader in Deloitte’s Applied AI group. He is a quintessential IT professional, focused on cognitive computing, AI, deep learning, and emerging technologies, applying the knowledge to find interesting business solutions that improve productivity measures. He has frequently synthesized and recognized abstract patterns, facts, theories, trends, inferences, relationships, key issues, and themes in complex and variable unrelated situations, while solving client business problems.

  • vatatmaja@deloitte.com
Jyotirmay Gadewadikar


Jyotirmay Gadewadikar is a manager at Deloitte in the Applied AI group. He helps enterprises make strategic decisions with AI and analytics and is a recipient of the Department of Homeland Security’s Scientific Leadership Award. Gadewadikar has led global teams of data scientists, business analysts, software developers, and client stakeholders to conceptualize, design, and implement AI-enabled customized solutions through the analysis of available technology platforms, evangelization of supervised and unsupervised machine learning algorithms, and natural language processing and understanding methods.

  • jgadewadikar@deloitte.com
Sherry Comes


Sherry Comes is a managing director at Deloitte, in the Applied AI group. She specializes in the areas of voice solutions, AI, natural language processing, sentiment analysis, analytics, data science, and machine learning. Her innovative approach has won her innovation awards, and has helped her lead, and be an integral part of, many ground-breaking advancements, such as being the first person to bring AI solutions to Africa as a Distinguished Engineer at IBM Watson. She has done extensive work around creating voice virtual assistants in financial services and has also received a number of patents.

  • scomes@deloitte.com
Timothy Murphy


Senior Manager | Enterprise Growth & Innovation

Tim Murphy is a senior manager in the Deloitte Center for Integrated Research where he leads research that helps build organizational resilience, overcome current business challenges, and be prepared for the disruptions of tomorrow. As a researcher and analytical scientist with Deloitte Global, he focuses on understanding how organizations are embedding resilience across the enterprise, including supply chains, talent models, and strategy.

  • timurphy@deloitte.com
  • +1 414 977 2252
