How voice technology is transforming computing

ANY sufficiently advanced technology, noted Arthur C. Clarke, a British science-fiction writer, is indistinguishable from magic. The fast-emerging technology of voice computing proves his point. Using it is just like casting a spell: say a few words into the air, and a nearby device can grant your wish.

The Amazon Echo, a voice-driven cylindrical computer that sits on a table top and answers to the name Alexa, can call up music tracks and radio stations, tell jokes, answer trivia questions and control smart appliances; even before Christmas it was already resident in about 4% of American households. Voice assistants are proliferating in smartphones, too: Apple’s Siri handles over 2bn commands a week, and 20% of Google searches on Android-powered handsets in America are input by voice. Dictating e-mails and text messages now works reliably enough to be useful. Why type when you can talk?

This is a huge shift. Simple though it may seem, voice has the power to transform computing, by providing a natural means of interaction. Windows, icons and menus, and then touchscreens, were welcomed as more intuitive ways to deal with computers than entering complex keyboard commands. But being able to talk to computers abolishes the need for the abstraction of a “user interface” at all. Just as mobile phones were more than existing phones without wires, and cars were more than carriages without horses, so computers without screens and keyboards have the potential to be more useful, powerful and ubiquitous than people can imagine today.

Voice will not wholly replace other forms of input and output. Sometimes it will remain more convenient to converse with a machine by typing rather than talking (Amazon is said to be working on an Echo device with a built-in screen). But voice is destined to account for a growing share of people’s interactions with the technology around them, from washing machines that tell you how much of the cycle they have left to virtual assistants in corporate call-centres. However, to reach its full potential, the technology requires further breakthroughs—and a resolution of the tricky questions it raises around the trade-off between convenience and privacy.

Alexa, what is deep learning?

Computer-dictation systems have been around for years. But they were unreliable and required lengthy training to learn a specific user’s voice. Computers’ new ability to recognise almost anyone’s speech dependably without training is the latest manifestation of the power of “deep learning”, an artificial-intelligence technique in which a software system is trained using millions of examples, usually culled from the internet. Thanks to deep learning, machines now nearly equal humans in transcription accuracy, computerised translation systems are improving rapidly and text-to-speech systems are becoming less robotic and more natural-sounding. Computers are, in short, getting much better at handling natural language in all its forms (see Technology Quarterly).

Although deep learning means that machines can recognise speech more reliably and talk in a less stilted manner, they still don’t understand the meaning of language. That is the most difficult aspect of the problem and, if voice-driven computing is truly to flourish, one that must be overcome. Computers must be able to understand context in order to maintain a coherent conversation about something, rather than just responding to simple, one-off voice commands, as they mostly do today (“Hey, Siri, set a timer for ten minutes”). Researchers in universities and at companies large and small are working on this very problem, building “bots” that can hold more elaborate conversations about more complex tasks, from retrieving information to advising on mortgages to making travel arrangements. (Amazon is offering a $1m prize for a bot that can converse “coherently and engagingly” for 20 minutes.)

When spells replace spelling

Consumers and regulators also have a role to play in determining how voice computing develops. Even in its current, relatively primitive form, the technology poses a dilemma: voice-driven systems are most useful when they are personalised, and are granted wide access to sources of data such as calendars, e-mails and other sensitive information. That raises privacy and security concerns.

To further complicate matters, many voice-driven devices are always listening, waiting to be activated. Some people are already concerned about the implications of internet-connected microphones listening in every room and from every smartphone. Not all audio is sent to the cloud—devices wait for a trigger phrase (“Alexa”, “OK, Google”, “Hey, Cortana”, or “Hey, Siri”) before they start relaying the user’s voice to the servers that actually handle the requests—but when it comes to storing audio, it is unclear who keeps what and when.

Police investigating a murder in Arkansas, which may have been overheard by an Amazon Echo, have asked the company for access to any audio that might have been captured. Amazon has refused to co-operate, arguing (with the backing of privacy advocates) that the legal status of such requests is unclear. The situation is analogous to Apple’s refusal in 2016 to help FBI investigators unlock a terrorist’s iPhone; both cases highlight the need for rules that specify when and what intrusions into personal privacy are justified in the interests of security.

Consumers will adopt voice computing even if such issues remain unresolved. In many situations voice is far more convenient and natural than any other means of communication. Uniquely, it can also be used while doing something else (driving, working out or walking down the street). It can extend the power of computing to people unable, for one reason or another, to use screens and keyboards. And it could have a dramatic impact not just on computing, but on the use of language itself. Computerised simultaneous translation could render the need to speak a foreign language irrelevant for many people; and in a world where machines can talk, minor languages may be more likely to survive. The arrival of the touchscreen was the last big shift in the way humans interact with computers. The leap to speech matters more. – The Economist
