A team of researchers from the Department of Telecommunications and Media Informatics at BME has contributed to the Hungarian dubbing of the film The Theory of Everything.
“It was a great challenge to work on the production of the film about Stephen Hawking’s life,” said Géza Németh, associate professor and head of the Speech Technology and Smart Interactions Laboratory. Mafilm, the Hungarian film production company, asked BME’s specialists in artificial speech generation to help render in Hungarian the approximately forty English sentences produced by the scientist’s device, using the same voice generator effect.
Géza Németh added that he and his colleagues are thoroughly familiar with the world-famous scientist’s speech-generating device, but “we have been studying not only the development of his speech synthesizer but all kinds of communication devices that facilitate cooperation between humans and machines.”
Stephen Hawking is one of the most famous physicists in the world, not only because of his intellect and his works of popular science but also because of his rare and incurable disease. At the age of 21 he was diagnosed with a slow-progressing form of amyotrophic lateral sclerosis (ALS), also known as motor neuron disease or Lou Gehrig's disease, which has gradually paralysed him over the decades. At that time his doctors gave him a life expectancy of two to three years. It is very rare for anybody to live longer than 10 years after being diagnosed with the disease; Stephen Hawking is very special in this sense as well. (The “Ice Bucket Challenge” launched last summer on social networking websites was originally aimed at raising funds for the ALS Association of the United States.) The film about the scientist’s life was based on the autobiography of his first wife, Jane Wilde, and received five Oscar nominations.
Géza Németh told us some surprising details about Professor Hawking. “Now, in 2015, Stephen Hawking uses a speech synthesizer technology that was developed in 1988 at the Massachusetts Institute of Technology based on the research of Dennis Klatt and Jonathan Allen,” said Géza Németh. “The technology was called KlattTalk and then DecTalk. The articles about new technologies developed for Hawking are misleading because the method of operation, the so-called formant synthesis technology, has not changed significantly. There are solutions that produce speech more similar to the natural human voice, but Hawking has identified with this synthesizer voice, called ‘Perfect Paul’. The public has also become used to hearing this specific, robotic-sounding but clearly understandable speech at Hawking’s lectures. It is very interesting that in the fast-changing world of electronics and informatics somebody uses a system that was developed in the 1980s. In this respect Hawking is very conservative.”
Scene from movie The Theory of Everything
Formant synthesis technology was developed over several decades through a large number of measurements and the analysis of phonetic rules. It does not use human speech samples; instead, parameters such as fundamental frequency, voicing, and noise levels are varied over time to create a waveform of artificial speech. The advantage of the technology is that it produces clearly understandable speech; its disadvantages are the artificial, robotic sound and the complicated control. Later generations of synthesizers start from human speech: the synthesizer assembles its output from large databases of recorded speech or of its acoustic parameters.
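For readers who want to see the principle in concrete terms, the following is a minimal sketch of formant synthesis in Python, not the code of DecTalk or of any system mentioned in this article. It assumes a simple source-filter arrangement: a pulse train at the fundamental frequency, mixed with a little noise, is passed through a cascade of second-order digital resonators, one per formant. The function names, the sample rate, and the formant frequencies and bandwidths are illustrative assumptions, and the parameters are held constant for a single vowel-like sound, whereas a real synthesizer updates them continuously over time.

```python
import math
import random

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)


def resonator(signal, freq_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Second-order digital resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # gives unity gain at 0 Hz
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def source(duration_s, f0_hz, voicing, noise_level, sample_rate=SAMPLE_RATE, seed=0):
    """Excitation: a pulse train at f0 (voicing) mixed with white noise."""
    rng = random.Random(seed)
    n = round(duration_s * sample_rate)
    period = max(1, round(sample_rate / f0_hz))
    samples = []
    for i in range(n):
        pulse = 1.0 if i % period == 0 else 0.0
        noise = rng.uniform(-1.0, 1.0)
        samples.append(voicing * pulse + noise_level * noise)
    return samples


def synthesize_vowel(duration_s=0.3, f0_hz=100.0,
                     formants=((700, 130), (1220, 70), (2600, 160))):
    """Filter the excitation through one resonator per (frequency, bandwidth) pair.

    The formant values are rough, illustrative figures for an 'a'-like vowel.
    """
    signal = source(duration_s, f0_hz, voicing=1.0, noise_level=0.02)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz)
    peak = max(abs(s) for s in signal) or 1.0
    return [s / peak for s in signal]  # normalize to the range [-1, 1]


samples = synthesize_vowel()
```

In this sketch, lowering `f0_hz` deepens the pitch and raising `noise_level` makes the sound breathier, which is the kind of parameter control the paragraph above describes.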
“Semi-spontaneous speech” communication tool in the Android version of VoxAid
“In the movie we are shown thirty-year-old American technology,” explained the associate professor of BME. “Hungarian developments at that time were at about the same level: in the 1980s the HungaroVox and MultiVox systems were developed based on the research of the Research Institute for Linguistics of the Hungarian Academy of Sciences and of Gábor Olaszy at BME, and they produced quality similar to that of DecTalk.”
The requirement for the Hungarian-language dubbing of the film was to recreate the robotic-sounding speech, but the MultiVox and HungaroVox systems had not been used for a long time, so upgrading them would have been more complicated and time-consuming than using a more modern technology to imitate the sound of the old devices. (The MultiVox system has been free to download since 2002.) The researchers selected the ProfiVox-dyad software, which visually impaired people in Hungary use in the Jaws for Windows screen reader programme. “We asked colleagues at Mafilm to specify the approximate length of the sentences and define the pronunciation they prefer,” explained the researcher. “For example, the name ‘Elaine’ can be pronounced in different ways.”
The ProfiVox dyad system applies the so-called dyad/triad method within formant synthesis technology and works with segments of recorded speech. A dyad consists of two consecutive half sounds, while a triad starts in the middle of the previous sound and ends in the middle of the next one, so it is two half sounds and one whole sound long. The system builds speech from a database of several thousand elements, and it was further optimized for the dubbing work. The advantage of the technology is that the speech is clearly understandable and the speaking rate can be adjusted over a wide range. “These are the most important criteria for developments for the visually impaired as well,” emphasized the researcher. “Visually impaired people who have been using a speech synthesizer for a long time sometimes set the device to a speed that is incomprehensible to others. In this system other parameters can also be adjusted flexibly; for example, we set the overall fundamental frequency lower so that the voice would resemble Perfect Paul’s deep tone.”
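The dyad/triad segmentation described above can be illustrated with a short Python sketch. It is only an assumption about how such units might tile an utterance; the article does not describe how ProfiVox actually selects its elements. The hypothetical function `tile_units` takes a sequence of speech sounds and covers it with dyads (middle of one sound to the middle of the next), switching to a triad (middle of the previous sound to the middle of the next one) whenever the sound in between is listed in `triad_centers`. The boundary symbol `_` stands for silence at the start and end.

```python
def tile_units(sounds, triad_centers=frozenset(), boundary="_"):
    """Cover a sound sequence with dyads and, where requested, triads.

    Each unit runs from the middle of its first sound to the middle of its
    last sound, so consecutive units join at sound midpoints.
    """
    padded = [boundary] + list(sounds) + [boundary]
    units = []
    i = 0
    last = len(padded) - 1
    while i < last:
        if i + 2 <= last and padded[i + 1] in triad_centers:
            # Triad: half of padded[i], all of padded[i+1], half of padded[i+2].
            units.append((padded[i], padded[i + 1], padded[i + 2]))
            i += 2
        else:
            # Dyad: half of padded[i] followed by half of padded[i+1].
            units.append((padded[i], padded[i + 1]))
            i += 1
    return units


print(tile_units(["h", "a", "t"]))
# [('_', 'h'), ('h', 'a'), ('a', 't'), ('t', '_')]
print(tile_units(["h", "a", "t"], triad_centers={"a"}))
# [('_', 'h'), ('h', 'a', 't'), ('t', '_')]
```

Each tuple would correspond to one stored element in the synthesizer's database; a real system would then join the elements and adjust speed and fundamental frequency.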
Géza Németh and his colleagues trust that movie makers will contact them in the future with other tasks as well. There would be opportunities in the field of voice recognition: subtitling could be made easier with an application that shows how long a given text lasts. “However, it is even more important that the film draws attention to all the various technologies helping people with disabilities,” emphasized the researcher. “Many people are not aware that very useful applications are available in Hungary as well. The film helps us to understand such problems a little better. Some rehabilitation professionals may not know the difference between screen readers and text-to-speech applications, even though these are the latest available techniques supporting accessibility. One of my degree students is working on possible improvements to the speech synthesizers on Android phones. According to the results of his non-representative survey, some 70% of the people asked between 15 and 30 years of age do not use such software and may not even have heard of it. If this is the case among students who are interested in technological development, what may other age groups know about it?”
In Hungary thousands of people lose their ability to speak temporarily or permanently due to stroke or other diseases and would need to use a Hungarian-language speech generator for a certain period of time. From brain injury to tonsillectomy there are various cases in which patients cannot speak but can move their hands. Synthesizers and similar devices support communication and help the rehabilitation process and the related training. “To support communication after trauma, a technology for portable computers called VoxAid/MonddKi has been available for 20 years. Unfortunately the media does not promote it properly. Doctors are aware that the earlier treatment is started, the more effective rehabilitation will be, and that delay may result in worse chances of recovery. Our developments are very useful in this sense as well: the availability of the speech therapist or the doctor is limited, but patients can practise and improve their speech even with software on their mobile phones. The technology can also record the patient’s progress, so their status can be monitored and they can be motivated.”
“We find it very important that more and more people learn about such technologies. This can be one of the messages of the movie about Stephen Hawking’s life,” emphasized Géza Németh, associate professor of the Department of Telecommunications and Media Informatics.
- HA -
Photos: Erik Pintér, Universal