Scientists are edging closer to robots that are hard to tell apart from humans, after a team at Columbia University developed a robot that moves its mouth and produces facial expressions with an unprecedented degree of human likeness.
The research team built a robot called "EMO" that can synchronize its lip movements with speech to a high degree of accuracy, avoiding the "uncanny valley" effect, in which robots become unsettling when they approach human form without matching it perfectly.
The scientists trained the robot with an innovative method: it observed its own reflection in a mirror and learned the relationship between the commands sent to the 26 motors beneath its silicone face and the expressions they produce; according to the team, each motor offers ten degrees of freedom.
During the training phase, EMO made thousands of random facial expressions in front of the mirror, using an artificial intelligence system known as a vision-to-action (VLA) model, which translates what the robot sees into motor movements without relying on predefined rules.
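The article gives no implementation details, but the self-modeling loop it describes (issue random motor commands, watch the result in a mirror, learn a vision-to-action mapping) can be sketched roughly as follows. Everything here is an assumption for illustration: the simulated "mirror" is a random linear map, the landmark count is invented, and the real system uses a learned neural model rather than least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

N_MOTORS = 26        # motor commands driving the face (figure from the article)
N_LANDMARKS = 50     # tracked facial landmarks seen in the mirror (invented)

# Stand-in for the real physics: landmarks = A_true @ motors (+ camera noise).
A_true = rng.normal(size=(N_LANDMARKS, N_MOTORS))

def observe_in_mirror(motor_cmd):
    """Simulated camera observation of the face for a given motor command."""
    return A_true @ motor_cmd + 0.01 * rng.normal(size=N_LANDMARKS)

# "Babbling" phase: issue random expressions and record what they look like.
motors = rng.uniform(-1, 1, size=(2000, N_MOTORS))
landmarks = np.array([observe_in_mirror(m) for m in motors])

# Fit an inverse (vision -> action) model by least squares: given a target
# expression (landmarks), predict the motor commands that produce it.
W, *_ = np.linalg.lstsq(landmarks, motors, rcond=None)

def vision_to_action(target_landmarks):
    return target_landmarks @ W

# Check: the learned model should recover the command behind a new expression.
test_cmd = rng.uniform(-1, 1, size=N_MOTORS)
recovered = vision_to_action(observe_in_mirror(test_cmd))
```

The point of the sketch is the training signal, not the model class: the robot needs no labels, because the mirror pairs every random command with its visual outcome automatically.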
The scientists then exposed it to hours of video of people speaking and singing in different languages, which helped it associate its facial movements with the spoken sounds even though it did not understand their meaning. Eventually it could take in speech in ten languages and synchronize its lips with near-perfect accuracy.
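The paper's audio-driven model is not described in the article; as an illustrative stand-in only, a classic lip-sync baseline looks up the mouth shape whose audio frame most resembles the incoming one. All data below are synthetic, and the feature and pose dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend training corpus: per-frame audio features paired with lip poses,
# as might be extracted from the speaking/singing videos mentioned above.
audio_feats = rng.normal(size=(500, 13))       # e.g. 13 MFCC-like coefficients
lip_poses = rng.uniform(-1, 1, size=(500, 6))  # 6 mouth-related motor commands

def lipsync_frame(feat):
    """Return the stored lip pose whose audio feature is closest to `feat`."""
    idx = np.argmin(np.linalg.norm(audio_feats - feat, axis=1))
    return lip_poses[idx]

# Querying with a known frame retrieves the pose recorded for that frame.
pose = lipsync_frame(audio_feats[42])
```

A nearest-neighbour lookup like this needs no understanding of the language, which mirrors the article's point that the robot associates sounds with mouth shapes without grasping their meaning.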
"We had some difficulties with certain sounds that require pursed lips, but performance improves with time and practice," said Hod Lipson, professor of engineering and director of the Creative Machines Lab at Columbia University.
Before the robot was officially announced, the scientists ran tests with 1,300 volunteers, showing them video clips of EMO's mouth driven in three different ways (the VLA method and two conventional methods), alongside an ideal reference model.
Participants were asked to choose the clip closest to natural lip movement; 62.46% of them chose the VLA technique, compared with lower shares for the other two methods, confirming the advantage of the new model.
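The preference share is a simple proportion of votes. The vote counts below are invented for the example; only the 1,300 total and the 62.46% figure come from the article.

```python
# Hypothetical vote split among the 1,300 participants (counts are invented).
votes = {"VLA": 812, "baseline_1": 260, "baseline_2": 228}

total = sum(votes.values())  # 1300
shares = {method: round(100 * n / total, 2) for method, n in votes.items()}
```

With this split, `shares["VLA"]` works out to 62.46, matching the reported figure.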
The team emphasized that facial expressions play a pivotal role in human communication: recent studies show that people look at the face of the person they are talking to most of the time, with particular focus on mouth movements, which in turn shape how speech is understood.
The team believes that ignoring this aspect was a major reason for the failure of previous attempts to produce convincing robots.
Lipson explained that many developers focus on limb movement, while neglecting facial expressions, despite their importance in applications that require direct interaction with humans, such as education, healthcare, and elderly care.
For his part, Yuhang Hu, the study's lead author, said that robots with expressive faces will be better able to build effective communicative relationships with humans, because body language and facial expressions are an essential part of everyday interaction.
The details of this achievement were published in a scientific study in the journal Science Robotics.
