Gboard: Speech recognition improves and works offline, but not for everyone


Gboard's voice transcription is improving, thanks to a neural network that runs directly on the phone. Unfortunately, these improvements are reserved for Pixel users with an English keyboard.
While Gboard previously needed an Internet connection to transcribe speech to text, that is no longer the case. In a post on the Google AI blog, members of that division unveiled the new features coming to Google's keyboard. For now they are limited to Google Pixel users with an English keyboard, but they will presumably reach more devices in the future.
Voice recognition built into the phone
Google's engineers managed to compress a neural network that transcribes a physical signal, here the voice, into text, and to embed it on the device. The technology, called RNN-T (Recurrent Neural Network Transducer), shrinks the models used for transcription from 2 GB down to 80 MB.
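One common ingredient in this kind of compression is weight quantization. The following is a loose, hypothetical illustration (none of these numbers or tensor names come from Gboard itself): storing weights as 8-bit integers instead of 32-bit floats cuts their footprint by a factor of four.

```python
import numpy as np

# Hypothetical illustration of post-training weight quantization:
# map float32 weights to int8 using a single per-tensor scale factor.
rng = np.random.default_rng(0)
weights = rng.standard_normal(1000).astype(np.float32)

scale = np.abs(weights).max() / 127.0  # one scale for the whole tensor
quantized = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
restored = quantized.astype(np.float32) * scale  # approximate reconstruction

print(weights.nbytes, "->", quantized.nbytes)  # 4000 -> 1000 bytes (4x smaller)
```

The reconstruction error per weight is bounded by half the scale factor, which is why quantized models usually lose little accuracy while shrinking dramatically.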
A substantial reduction, which allows phones to ship with this feature directly. Until now, the sound recording had to be sent over the Internet to Google's servers, which transcribed it and sent the text back to the phone. That round trip takes time and increases the risk of data interception.
A more fluid transcription
The current transcription is rather choppy: even though it recognizes what we say, the words appear in groups of three or four, far from a fluid flow of speech. The culprit is the data bouncing back and forth between the phone and Google's servers.

With the model running directly on the phone, transcription becomes more fluid. Words appear one after another and follow the rhythm of our speech much more closely. That is, at least, what the GIF shared by Google's teams shows when comparing the old and new transcriptions.
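The word-by-word display can be mimicked with a tiny sketch (purely illustrative; `stream_transcript` is a made-up helper, not a Gboard or Android API): a streaming recognizer emits a growing partial hypothesis after every word, whereas the old server round trip only returned text in chunks.

```python
def stream_transcript(words):
    """Yield growing partial transcripts one word at a time,
    the way an on-device streaming recognizer updates its display."""
    partial = []
    for word in words:
        partial.append(word)
        yield " ".join(partial)

# Each partial result could be painted to the screen immediately.
for hypothesis in stream_transcript("the quick brown fox".split()):
    print(hypothesis)
# the
# the quick
# the quick brown
# the quick brown fox
```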
Features like these are effective in part because of all the data Google gathers from its users. The Mountain View company is now accustomed to building machine learning into many of its services, such as the recently launched augmented reality navigation in Google Maps, and it is paying off.
Read on FrAndroid: Google Maps: augmented reality navigation is finally available, we tested it
