This report is no longer available. Please see our current reports or contact us to discuss a custom report.

Voice, Speech, Conversation-Based User Interfaces 2019-2029: Technologies, Players, Markets

Brand: IDTechEx

Smart Voice, Smart Speakers, Voice Assistant, Voice-Enabled User Interface

By Dr Xiaoxi He

Natural human-machine interface is shaping our life

From punch cards to keyboards, and from the mouse to touch screens, technology has shaped how humans interact with machines. "Human-machine interface" (HMI) began as "computer interface" because early computers were not interactive; gradually, "human-machine interface" became "human-machine interaction". "Interaction" was the first revolution in the development of the human-machine interface. We are now experiencing the transition to the "natural user interface", which is considered the second revolution of HMI.

Machines and computers can now interpret natural human communication, and they communicate more like humans.

Compared with keyboards and mice, touch is considered a natural form of interaction. Apart from touch, audio and vision modalities can also provide new ways of interacting.

Figure 1 Evolution of human-machine interactions

Source: IDTechEx

Speech/Voice-based interaction

Speech enables convenient interaction. It is hands-free, eyes-free and keyboard-free. Because talking is natural for most of us and does not require learning new skills, the learning curve is low. On average, people speak about 150 words per minute, compared with about 40 when typing. Speech interaction can be mastered quickly by young people, older people, people with disabilities and people who cannot read or write. It can also be applied in situations and devices where common forms of interaction are difficult, such as while driving, in the dark, or on extremely small wearables. These advantages make speech an increasingly popular medium for devices and applications.

Speech recognition (SR) is the "ear" of a machine. It is the basis of the speech user interface, because it provides the input that enables the whole interaction process. Speech recognition was first introduced in the 1920s, when a toy dog, "Radio Rex", would come when his name was called. Speech user interfaces were also applied in vehicles early on, but poor recognition accuracy and bad user experiences stopped them from going further. From 1993, the accuracy of speech recognition based on traditional models stagnated at around 70%, which led to poor user experiences, as users easily became frustrated and lost patience. It was machine learning, or more specifically deep learning, that significantly increased the accuracy of speech recognition in the 2010s. In 2016, Microsoft reported a speech recognition system that reached human parity with a word error rate of 5.9%, and in 2017 Google reported an accuracy of 95%. These improvements indicate that machines can be as good as human beings at "hearing", and speech recognition has now become a commodity.
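The word error rate quoted above is commonly computed as the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The following is a minimal sketch of that common definition, not the evaluation code used by any of the companies mentioned; the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name); uses plain whitespace
    tokenisation, with no text normalisation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match if cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# One deleted word ("the") and one substituted word ("light" -> "lights")
# against a five-word reference.
print(word_error_rate("turn on the kitchen light", "turn on kitchen lights"))
```

Under this definition, a word error rate of 5.9% corresponds to roughly six word errors per hundred reference words.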

Giants such as Apple, Amazon, IBM, Google and Microsoft are all investing in smart speech. Besides the "ear", machines also need a "brain", a "mouth" and other organs to achieve natural-language speech interaction. In this process, new technologies and business models are emerging.

Speech is language-dependent, which makes the global market more complicated and segmented. The global market focuses mainly on English-centric interaction, while systems for a few other popular languages are developed by local players, thanks to their strengths in language data.

Figure 2 Spoken dialogue system processes

Source: IDTechEx

This report introduces the relevant technologies from scratch, from both hardware and software points of view. They are as follows:

Voice-enabled smart speakers
Microphone arrays
MEMS microphones
Voice system on chip
Machine learning
Front-end signal processing
Key word spotting
Automatic speech recognition
Natural language understanding
Speech synthesis
Voice recognition
Machine translation

Figure 3 Value chain

Source: IDTechEx

The market landscape, business models and value chains are analysed in this report, with a ten-year market forecast by revenue model and by application for the following sectors:

Automotive
Banking, financial and insurance
Healthcare
Travel, hotels
Retail/commerce
Home automation
Education
Game & entertainment
Voice-enabled smart speakers

Analyst access from IDTechEx

All report purchases include up to 30 minutes of telephone time with an expert analyst who will help you link key findings in the report to the business issues you're addressing. This needs to be used within three months of purchasing the report.

Further information

If you have any questions about this report, please do not hesitate to contact our report team at research@IDTechEx.com or call one of our sales managers:

AMERICAS (USA): +1 617 577 7890

ASIA (Japan and Korea): +81 3 3216 7209

ASIA: +44 1223 810259

EUROPE (UK): +44 1223 812300

Table of Contents

1.	EXECUTIVE SUMMARY
1.1.	Transition of human-machine interface
1.2.	Is the time of natural language interaction coming?
1.3.	Why is natural language UI disruptive?
1.4.	Driving force
1.5.	Influence of speech UI
1.6.	Market demand of speech technologies
1.7.	Entry barriers
1.8.	SWOT analysis of speech UI industry: strengths
1.9.	SWOT analysis of speech UI industry: weaknesses
1.10.	SWOT analysis of speech UI industry: opportunities
1.11.	SWOT analysis of speech UI industry: threats
1.12.	Profit level
1.13.	Product life
1.14.	The cards in giants' hands—Google, Microsoft, Amazon, Facebook, Apple, IBM
1.15.	Giants' activities
1.16.	Popular development models in speech-related business
1.17.	Technology trend
1.18.	Hype or hope
1.19.	Value chain
1.20.	Changes in the value chain
1.21.	Open-loop system or not
1.22.	Revenue models of speech products
1.23.	Market forecasts - assumptions & methodology
1.24.	Market forecasts 2018-2029 by revenue channel
1.25.	2019 & 2029 market values by revenue channel
1.26.	Analysis of market forecast 2019-2029 by revenue channel
1.27.	Market forecasts 2018-2029 by application
1.28.	2019 & 2029 market values by application
1.29.	Analysis of market forecast 2018-2029 by application
2.	INTRODUCTION
2.1.	Evolution of human-machine interactions
2.2.	Natural user interface
2.3.	Questions about natural user interface
2.4.	Overview of speech UI
2.5.	Voice interaction products at a glance
2.6.	User interface and application programming interface
2.7.	Speech: alternative to keyboard
2.8.	Evolution of speech user interface
2.9.	Benefited from high speech recognition accuracy
2.10.	Timeline of speech recognition error rate
2.11.	Human parity has been achieved
2.12.	Voice search is taking an increasing share
2.13.	Reasons for using voice
3.	SMART SPEAKERS
3.1.	Timeline of smart speaker release
3.2.	Voice-activated smart speaker product list
3.3.	Amazon Echo
3.4.	Amazon Echo Dot
3.5.	Alexa devices
3.6.	From Google Now to Google Home
3.7.	Google Home teardown
3.8.	Comparison of Amazon Echo and Google Home
3.9.	Apple HomePod
3.10.	Little Fish powered by Baidu
3.11.	Lenovo
3.12.	Smart speakers as voice-activated home hubs
3.13.	The success of Amazon Echo
3.14.	Amazon Alexa
3.15.	Integration and centralization
3.16.	Amazon Web Services
3.17.	The numbers behind Amazon Echo
3.18.	Surveys around Amazon Echo
3.19.	Things that work with Amazon Alexa: smart home
3.20.	Things that work with Amazon Alexa: other devices and services
3.21.	What do developers and users want Amazon Alexa for
3.22.	Competition strategies
3.23.	Move away from hardware sales
3.24.	Interoperability between Amazon, Apple & Google ecosystems
3.25.	Smart speaker market status
3.26.	Estimated sales of major voice-activated smart speakers
3.27.	Smart speaker market forecast
4.	TECHNOLOGY
4.1.	Speech technologies
4.2.	Smart speaker core components
4.3.	Smart speaker hardware: speaker design
4.4.	Smart speaker hardware: circuit board, communication and battery
4.5.	Microphone Arrays
4.6.	Amazon Echo's 6+1 microphone array
4.7.	AISpeech's microphone array solutions
4.8.	Ding Dong R7+1 microphone array
4.9.	Microphone array trends
4.10.	MEMS microphones
4.11.	MEMS microphone leaders
4.12.	Voice System on Chip for Terminals
4.13.	Voice SoC features
4.14.	AI Voice SoC
4.15.	From voice to voice AI SoC
4.16.	Evolution of SoC for voice assistant technologies
4.17.	Voice SoC companies
4.18.	UniOne
4.19.	Hangzhou Guoxin Technology
4.20.	MIT's low-power chip for speech recognition
4.21.	Artificial Intelligence and Deep Learning
4.22.	From artificial intelligence, to machine learning and deep learning
4.23.	Artificial intelligence in the development of human-machine interactions
4.24.	Terminologies and scopes
4.25.	Things that improved deep learning
4.26.	Rising interest in Google Trends
4.27.	An artificial neuron in the training process
4.28.	Artificial neural network
4.29.	Deep learning
4.30.	The age of gradient descent
4.31.	Main varieties of machine learning approaches
4.32.	Evolution of deep learning
4.33.	Dialogue Systems
4.34.	Types of dialogue systems
4.35.	Spoken dialogue system processes
4.36.	Development stage of speech processing technologies
4.37.	Front-End Signal Processing
4.38.	Front-end processing for speech recognition
4.39.	Voice activity detection
4.40.	Acoustic echo cancellation
4.41.	Dereverberation
4.42.	Beamforming
4.43.	Sensors for voice biometrics: VocalZoom
4.44.	VocalZoom used in cars
4.45.	Humidity sensor with carbon nanotubes for biometric sensing
4.46.	Algorithm-based approach
4.47.	Keyword Spotting (KWS)
4.48.	Keyword spotting
4.49.	LVCSR KWS
4.50.	Acoustic KWS
4.51.	Phonetic search KWS
4.52.	Automatic Speech Recognition (ASR)
4.53.	Speech recognition
4.54.	Timeline of language technologies
4.55.	Approaches to and types of speech recognition
4.56.	Evolution of speech recognition
4.57.	Modern speech recognition processes
4.58.	Feature extraction methods
4.59.	Challenges in speech recognition
4.60.	Speech technology of Baidu: roadmap of speech recognition in Baidu
4.61.	Natural Language Processing (NLP) and Natural Language Understanding (NLU)
4.62.	Natural language processing and natural language understanding
4.63.	Levels of linguistic analyses
4.64.	Natural language understanding
4.65.	Natural language understanding system
4.66.	Knowledge sources for speech understanding
4.67.	Text-To-Speech (TTS)
4.68.	Text-to-speech system
4.69.	Amazon's "Polly" synthesiser
4.70.	DeepMind of Google
4.71.	VoicePrint Recognition (VPR)
4.72.	Different voice/sound prints
4.73.	Voiceprint recognition
4.74.	Speech recognition vs. voice recognition
4.75.	Challenges
4.76.	Voice recognition process
4.77.	VPR procedure
4.78.	Information security
4.79.	Biometrics in finance
4.80.	New Zealand government using voice biometrics for telephone system
4.81.	Siri of Apple
4.82.	Representative players
4.83.	Emotion detection
4.84.	Machine Translation
4.85.	Translation approaching human level performance
4.86.	Machine translation
4.87.	Speech translation
4.88.	Microsoft: deep learning for machine translation
5.	VERTICAL APPLICATION MARKETS AND RELEVANT PLAYERS
5.1.	Speech UI enables many applications
5.2.	Role of speech in different devices
5.3.	Applications
5.4.	Dictation
5.5.	Information security
5.6.	Interactive voice response
5.7.	IVR value propositions
5.8.	IVR case studies
5.9.	Automotive
5.10.	Speech-user-interface-enabled functions for automotive
5.11.	Development roadmap of speech UI in automotive
5.12.	Speech-based in-vehicle system case studies
5.13.	Speech recognition used in intoxication measurements
5.14.	Banking, Financial services and Insurance (BFSI)
5.15.	Healthcare and life sciences
5.16.	Speech translation device
5.17.	Healthcare apps using Amazon Alexa
5.18.	Health information at home through voice technology
5.19.	Hospitals look to Amazon Alexa
5.20.	Alexa-powered AI genomics platform
5.21.	Travel, hotels
5.22.	Retail/commerce
5.23.	Home automation
5.24.	Education
5.25.	iFlytek's product portfolio
5.26.	Game & entertainment
5.27.	TV solutions
5.28.	Robotics
5.29.	Virtual personal assistant
5.30.	Towards VPA
5.31.	Conversational interaction illustration for VPAs
5.32.	Exploring business models for virtual personal assistants
5.33.	Siri of Apple
5.34.	Evolution of iPhone's speech user interface
5.35.	VocalIQ
5.36.	Future Siri
5.37.	Microsoft Cortana
5.38.	Technologies involved with Cortana
5.39.	IBM Watson
5.40.	Preparation for Watson: partnerships and acquisitions
5.41.	A list of virtual assistants
5.42.	Comparison of intelligent virtual assistants
5.43.	Open access of Google SR API and AudioSet
5.44.	Viv
5.45.	Chatbot
5.46.	Messaging interfaces of chatbots
5.47.	Facebook's M
5.48.	Bot platforms with AI
5.49.	Virtual idol enabled by speech synthesis
5.50.	Revenue models of Vocaloid
5.51.	Wearables
5.52.	Intel: from Jarvis to Radar Pace
5.53.	Kopin's voice interface
5.54.	Whisper™ Chip
6.	PLAYERS
6.1.	The contestants
6.2.	Case study: The decline and repositioning of Nuance, the former leader in speech
6.3.	Lists of players in the value chain and technology offerings
7.	COMPANY PROFILES
7.1.	AISpeech
7.2.	Amazon (Alexa)
7.3.	Beijing Kexin Technology
7.4.	d-Ear Technologies
7.5.	iFlytek
7.6.	MindMeld
7.7.	Next IT Corporation
7.8.	Nuance Communications
7.9.	Unisound

Frequently Asked Questions

About IDTechEx reports

What are the qualifications of the people conducting IDTechEx research?

Content produced by IDTechEx is researched and written by our technical analysts, each with a PhD or master's degree in their specialist field, and all of whom are employees. All our analysts are well-connected in their fields, intensively covering their sectors, revealing hard-to-find information you can trust.

How does IDTechEx gather data for its reports?

By directly interviewing and profiling companies across the supply chain. IDTechEx analysts interview companies by engaging directly with senior management and technology development executives across the supply chain, leading to revealing insights that may otherwise be inaccessible.

Further, as a global team, we travel extensively to industry events and companies to conduct in-depth, face-to-face interviews. We also engage with industry associations and follow public company filings as secondary sources. We conduct patent analysis and track regulatory changes and incentives. We consistently build on our decades-long research of emerging technologies.

We assess emerging technologies against existing solutions, evaluate market demand and provide data-driven forecasts based on our models. This provides a clear, unbiased outlook on the future of each technology or industry that we cover.

What is your forecast methodology?

We take into account the following information and data points where relevant to create our forecasts:

Historic data, based on our own databases of products, companies' sales data, information from associations, company reports and validation of our prior market figures with companies in the industry.
Current and announced manufacturing capacities
Company production targets
Direct input from companies as we interview them as to their growth expectations, moderated by our analysts
Planned or active government incentives and regulations
Assessment of the capabilities and price of the technology based on our benchmarking over the forecast period, versus that of competitive solutions
Teardown data (e.g. to assess volume of materials used)
From a top-down view: the total addressable market
Forecasts can be based on an s-curve methodology where appropriate, taking into account the above factors
Key assumptions and discussion of what can impact the forecast are covered in the report.
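
As a rough illustration of what an s-curve forecast looks like, the sketch below uses a generic logistic function. It is not the IDTechEx model; the function name `logistic_forecast`, its parameters and all numbers are hypothetical placeholders chosen only to show the shape of the curve.

```python
import math


def logistic_forecast(year: float, saturation: float,
                      midpoint_year: float, growth_rate: float) -> float:
    """Generic logistic (s-curve) value for a given year.

    saturation    -- level the curve approaches in the long run (assumed)
    midpoint_year -- year at which half of saturation is reached (assumed)
    growth_rate   -- steepness of the curve around the midpoint (assumed)
    """
    return saturation / (1.0 + math.exp(-growth_rate * (year - midpoint_year)))


# Hypothetical parameters in arbitrary units, for illustration only.
for year in range(2019, 2030, 2):
    value = logistic_forecast(year, saturation=100.0,
                              midpoint_year=2024, growth_rate=0.5)
    print(year, round(value, 1))
```

The curve grows slowly at first, fastest around the midpoint year, and then flattens as it approaches the saturation level, which is the behaviour an s-curve methodology is meant to capture.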

How can I be confident about the quality of work in IDTechEx reports?

Based on our technical analysts and their research methodology, for over 25 years our work has regularly received superb feedback from our global clients. Our research business has grown year-on-year.

Recent customer feedback includes:

"It's my first go-to platform"

- Dr. Didi Xu, Head of Foresight - Future Technologies, Freudenberg Technology Innovation

"Their expertise allows us to make data-driven, strategic decisions and ensures we remain aligned with the latest trends and opportunities in the market."

- Ralf Hug, Global Head of Product Management & Marketing, Marquardt

What differentiates IDTechEx reports?

Our team of in-house technical analysts immerse themselves in industries over many years, building deep expertise and engaging directly with key industry players to uncover hard-to-find insights. We appraise technologies in the landscape of competitive solutions and then assess their market demand based on voice-of-the-customer feedback, all from an impartial point of view. This approach delivers exceptional value to our customers—providing high-quality independent content while saving customers time, resources, and money.

Why should we pick IDTechEx research over AI research?

A crucial value of IDTechEx research is that it provides information, assessments and forecasts based on interviews with key people in the industry, assessed by technical experts. AI is trained only on content publicly available on the web, which may not be reliable or in depth, and may not contain the latest insights based on the experience of those actively involved in a technology or industry, despite the confident prose.

How can I justify the ROI of this report?

Consider the cost of the IDTechEx report versus the time and resources required to gather the same quality of insights yourself. IDTechEx analysts have built up an extensive contact network over many years; we invest in attending key events and interviewing companies around the world; and our analysts are trained in appraising technologies and markets.

Each report provides an independent, expert-led technical and market appraisal, giving you access to actionable information immediately, rather than you having to spend months or years on your own market research.

Can I speak to analysts about the report content?

All report purchases include up to 30 minutes of telephone time with an expert analyst who will help you link key findings in the report to the business issues you're addressing. This needs to be used within three months of purchasing the report.

What is the difference between a report and subscription?

A subscription from IDTechEx can include more reports, access to an online information platform with continuously updated information from our analysts, and access to analysts directly.

Before purchasing, I have some questions about the report, can I speak to someone?

Please email research@idtechex.com stating your location and we will quickly respond.

About IDTechEx

Who are IDTechEx's customers?

IDTechEx has served over 35,000 customers globally. These range from large corporations to ambitious start-ups, and from Governments to research centers. Our customers use our work to make informed decisions and save time and resources.

Where is IDTechEx established?

IDTechEx was established in 1999, and is headquartered in Cambridge, UK. Since then, the company has significantly expanded and operates globally, having served customers in over 80 countries. Subsidiary companies are based in the USA, Germany and Japan.

Questions about purchasing a report

How do I pay?

In most locations reports can be purchased by credit card, or else by direct bank payment.

How and when do I receive access to IDTechEx reports?

When paying successfully by credit card, reports can be accessed immediately. For new customers, when paying by bank transfer, reports will usually be released when the payment is received. Report access will be notified by email.

How do I assign additional users to the report?

Users can be assigned in the report ordering process, or at a later time by email.

Can I speak to someone about purchasing a report?

Please email research@idtechex.com stating your location and we will quickly respond.

The market for smart speech/voice-based technology will reach $15.5 billion by 2029

Report Statistics

Slides	313
Forecasts to	2029

Customer Testimonial

"The resources produced by IDTechEx are a valuable tool... Their insights and analyses provide a strong foundation for making informed, evidence-based decisions. By using their expertise, we are better positioned to align our strategies with emerging opportunities."

Director of Market Strategy
Centre for Process Innovation (CPI)
