Google's AI lip-reading software is more than 3 times as accurate as humans

In March of this year, AlphaGo, the artificial intelligence system developed by Google DeepMind, defeated world champion Lee Sedol and dominated the headlines of major technology media. It was a moment of glory.

Now DeepMind, Google's artificial intelligence division, is working with Oxford University researchers to develop the world's most advanced lip-reading software, which may outperform human lip readers.

To achieve this goal, the researchers selected thousands of hours of BBC TV clips, fed them to a neural network, and trained the lip-reading software to identify what was being said based on the movement of the presenters' mouths.

As a result, the software reached a lip-reading accuracy of 46.8%. By contrast, human lip readers achieved an accuracy of only 12.4% on the same test content.

The study builds on "LipNet", an artificial intelligence lip-reading system from Oxford University. LipNet can match the mouth movements of speakers in a video to the words they say with an accuracy of 93.4%. That figure, however, was obtained mainly on relatively simple sentences.

DeepMind's lip-reading software is called "Watch, Listen, Attend, and Spell". Unlike LipNet, it is designed to handle harder sentences.

To this end, Google's neural network watched about 5,000 hours of popular BBC television programmes, including "Newsnight", "Question Time" and "The World Today", covering a total of 110,000 different sentences and 17,500 different words. By contrast, the sentences used to test LipNet contained only 51 different words.

Google said: "The purpose of this study is to recognize the phrases and sentences that people speak, with or without sound. Unlike previous studies, which were limited to a small number of words or phrases, our research is directed at unrestricted natural-language sentences."

The DeepMind team believes that, in addition to helping hearing-impaired people, the newly developed software could support a range of other applications, including annotating movies and communicating with digital assistants such as Siri and Alexa through lip movements.
