This New AI Can Predict A Person’s Face Based Just On Their Voice

Jannat Un Nisa

4 years ago

Speech2Face is a new AI technology developed by AI researchers that can predict a person’s face only by listening to their voice.

The Massachusetts Institute of Technology (MIT) addressed the question of whether a person’s looks might be deduced from their speech. Therefore, they started a project to train an algorithm to be able to build a person’s most identifying physical traits only by listening to them speak.

“Our model is designed to reveal statistical correlations that exist between facial features and voices of speakers in the training data,” the creators of Speech2Face said.

“The training data we use is a collection of educational videos from YouTube and does not represent equally the entire world population. Therefore, the model—as is the case with any machine learning model—is affected by this uneven distribution of data.”

The AI they developed is called Speech2Face, and it can create synthetic facial characteristics that are quite similar to those of a human by recreating a few seconds of his speech.

But how does this technology actually work?

Speech2Face is based on a neural network system that recognizes human traits, including ethnicity, age, and gender. As a result, speech2Face gathered a range of references that allowed him to produce a face without using an image after training her to grasp the links between the speech and the face of thousands of people in YouTube videos.

The absolute staggering aspect of AI is its ability to produce virtual faces that are quite close to those of humans. They are, however, not as precise as those derived using artificial intelligence that compares synthetic faces to images of actual faces. The goal of Speech2Face is to build a picture that recovers the physical aspects associated with speech.

Speech2Face produces an avatar from a speech, which is different from what AI does. They usually create AI using a learning system with features similar to real people that humans can’t differentiate.

We may be able to benefit in the future from this new Intelligence. The use of basic audio to construct criminal profiles is one of the most effective uses.

Unfortunately, there are certain disadvantages as well. For example, the ease with which one may make a face has the potential to be used to impersonate someone. The introduction of this trained AI, on the other hand, is a tremendous technological advancement.

Related Articles