[Nfbf-l] Google is preparing for screenless computers.
Alan Dicey
adicey at bellsouth.net
Sun Sep 15 17:12:41 UTC 2013
Hi All,
For your information. Appended is today's article in Quartz.
This could be interesting from an accessibility perspective as well.
Best wishes,
Peter Meijer
Seeing with Sound - The vOICe
http://www.seeingwithsound.com/winvoice.htm
Google is preparing for screenless computers.
By Christopher Mims.
The spread of computing to every corner of our physical world doesn't just
mean a proliferation of screens large and small - it also means we'll soon
come to rely on mobile computers with no screens at all. "It's now so
inexpensive to have a powerful computing device in my car or lapel, that if
you think about form factors, they won't all have keyboards or screens,"
says Scott Huffman, head of the Conversation Search group at Google.
Google is already moving rapidly to enable voice commands in all of its
products. On mobile phones, Google Now for Android and Google's search app
on the iPhone allow users to search the web via voice, or carry out other
basic functions like sending emails. Similarly, Google Glass would be almost
unusable without voice interaction. At Google's conference for developers,
it unveiled voice control for its Chrome web browser. And Motorola's new
Moto X phone has a specialized microchip that allows the phone to listen at
all times, even when it's asleep, for the magic word that begins every voice
conversation with a Google product: "OK."
There's nothing new about voice interaction with computers per se. What's
different about Google's work on the technology is that the company wants to
make it as fluid and easy as keyboards and touch screens are now. That's a
challenge big enough that, thus far, it has kept voice-based interfaces from
going mainstream in our personal computing devices. And in cases when they
are in use, such as interactive voice response systems designed to handle
customer service calls, they can be frustrating.
Interacting with a computer like it's a friend
"What we're really trying to do is enable a new kind of interaction with
Google where it's more like how you interact with a normal person," says
Huffman.
To illustrate, he picks up his smartphone and says "How far is it from here
to Hearst Castle?"
Normally, getting an answer to such a seemingly simple question would
require googling "Hearst Castle," clicking on a map, and typing in your own
address.
But Huffman's phone gets the answer right on the first try - a neat
illustration of how voice commands can save time and effort. In a way, it's
part of the natural progression of convenience in computer interfaces: 10
years ago writing an email required walking over to a computer, five years
ago we could whip out our phones, and in the near future we'll simply start
talking.
Leveraging what Google already knows about reality
To achieve this kind of apparent simplicity, the Conversation Search group
has to muster everything that Google already knows about the real world.
That's because the meaning behind language is always dependent on context,
as anyone knows who has discovered that half the battle of learning a
foreign language is absorbing the culture in which it's embedded.
"One thing that really helps us is the base of all the core relevance and
ranking work that the Google search engine is famous for," says Huffman.
Part of that "relevance" is the Google Knowledge Graph, a database of
people, places and things that allows Google to know, for example, that when
you ask it for "Cruise movies" you are probably asking for the films of Tom
Cruise, rather than "crews movies" or any of a number of other
possibilities.
Beating humans at understanding meaning
This context doesn't just make Google's voice interfaces usable - some day,
it could make them even better than humans. "Today, automatic speech
recognition is not as good as people, but our ambition is, we should be able
to be better than people," says Huffman. In order to achieve that, Google
will leverage the intimate knowledge it has of its users.
"In some sense Google has a lot of context that [a human transcriptionist]
doesn't have," says Huffman. "We know where you are based on your phone's
location and there is some context around what you've been talking about
lately. Therefore that should help us understand what kinds of things you
might be saying."
Computers that talk back
The future of Google's voice interfaces isn't just accurate interpretation
of commands, but real interaction - hence the "conversation" part of Huffman's
Conversation Search group. One trick Google's voice interface can already do
is understand pronouns like he, she and it. "You can ask yourself why in
language do things like pronouns exist - well, they exist because it lets us
communicate faster than we do without them," says Huffman.
To demonstrate, Huffman follows up his question about how far it is to
Hearst Castle with the sentence "give me directions," which doesn't even
include the pronoun "it," but his phone begins rattling off directions in
its tinny computerized voice, anyway.
All of this is, of course, a demonstration laid out in advance for my
benefit.
And like any other nascent technology it doesn't always work perfectly. At
other points in Huffman's demo, his smartphone fails to understand the
pronouns he's using. One reason for that, he notes, is that Google's voice
interface "forgets" the subject of any conversation with it after a certain
amount of time. Just as in natural conversation, it has a limited attention
span.
In conversation, a human being who has forgotten the referent for a pronoun
like "it" might ask his or her companion what he or she is talking about.
Google's conversation search can't do that yet, but his team is working on
it, says Huffman. Already, Google's regular search results perform a version
of this "can you clarify?" task by suggesting search terms and providing
other disambiguating links at the top of search results. Eventually, Google's
voice search will do the same: "Did you mean the movies of Tom Cruise?" or,
given your search history, "Were you referring to the movies of Penelope
Cruz?"
Fundamentally re-thinking the nature of computer interfaces
At this point, voice commands are a little-used feature of most people's
everyday interactions with computers, if we're using them at all. Between
the present and a future in which we are reliably interacting with computers
by voice alone, there are a number of challenges, some of them fundamental
to what we think of as a computer interface.
One challenge to voice control is simply reliability and error correction.
For example, as Google Glass transcribes your words for an email, text or
social media update, you can actually see the ghostly words hovering in your
field of view, but how does an interface that relies solely on our ears
accomplish the same? Does it read our messages back to us?
Another issue is that current visual computer interfaces limit our options
in ways that can make them easier to use. For example, in graphical user
interfaces we can find out what a program can do by clicking on all of its
buttons and looking under its menus. But commanding a computer by voice is
more like the old model of interaction with a computer - the command line.
It's a potentially powerful interface - Huffman imagines a future in which
we might even communicate with our computers via a verbal shorthand - but
it would require that humans learn a whole new way to control computers, and
learn anew the capabilities of all the software that might be used in this
way.
networks
Ultimately, none of these issues may prove as insurmountable as the ones
that Google has already overcome by virtue of its enormous search database,
knowledge of the real world, cloud computing infrastructure and army of
Ph.D.s who work on voice recognition and natural language processing.
Currently, the everyday magic of understanding voice commands is carried out
almost entirely in the cloud, because processing human speech is difficult
enough that even a sophisticated smartphone doesn't have the processing
power to do it at a high enough level of reliability.
That means voice commands issued to Google's hardware and software are
recorded, shot into the cloud and parsed into next steps, rather than being
handled by the device itself. "For speech recognition, it's a very data
intensive thing," says Huffman. "We use giant neural network things that
are spread across many servers." Which means that when we talk to our
phones, there really is someone listening to our every command - just not
an intelligence we'd recognize as human.
Source URL:
http://qz.com/115304/google-is-preparing-for-screenless-computers/