The voice you hear when you use GPS, or when your virtual assistant speaks to you, is not a real voice. Simply put, there is no database of voices from which the computer selects words and strings them together into sentences. The computer generates the voice on the fly. Have you ever wondered how this is achieved? These voices are so lifelike that it is hard to believe a person is not speaking, right? The world’s first talking machine was called the VODER.
However, before we talk about the VODER, it is important to mention the inventions that led up to it. The very first efforts to create synthetic speech were made more than two hundred years ago, in 1779, by Christian Kratzenstein. Kratzenstein was a Russian professor who developed a contraption built from vibrating reeds that was acoustically similar to the human vocal tract. The contraption could artificially produce five long vowels.
About a decade later, in 1791, Wolfgang von Kempelen, an inventor in Vienna, created a more sophisticated contraption modeled on the human organs that make speech possible. The machine featured two bellows that simulated the lungs, a vibrating reed modeled on the vocal cords, a leather tube as the vocal tract, two nostrils, leather tongues, and lips. Von Kempelen succeeded in producing consonants as well as vowels. About half a century later, Charles Wheatstone built an enhanced version of this machine that could even pronounce a few complete words.
The first device recognized as a true speech synthesizer, however, was the VODER. VODER was short for Voice Operating Demonstrator, and it was developed by Homer Dudley of Bell Labs in the 1930s. The machine was complicated, to say the least. It featured fourteen piano-like keys, a bar controlled by the wrist, and a foot pedal, all of which the operator manipulated to make the machine speak. The synthetic sound created by the VODER was quite robotic and, as Lisa Guernsey of the New York Times put it, sounded like ‘an alien speaking under water.’
Ben Fino-Radin of Rhizome writes, ‘Once the true voice of the machine had entered the public consciousness, its place and form in fictional portrayal would never be the same. After that day in 1939, we knew specifically how inhuman machined speech should sound.’ How the VODER worked is also quite fascinating. Instead of boring you with the details (you can check out the video at the end), we will share with you what Mrs Helen Harper had to say about using the VODER.
Mrs Helen Harper was the lead operator of the VODER during its demonstration at the 1939 New York World’s Fair. She said, ‘For example, in producing the word “concentration” on the VODER, I have to form thirteen different sounds in succession and make five up and down movements of the wrist bar and vary the position of the foot pedal from three to five times according to what expression I want the VODER to give the word. And of course, all this must be done with exactly the correct timing.’
Harper practiced for a full year before she could operate the VODER with such precision. A total of three hundred women were inducted into the training program; however, only thirty were able to learn the skill. A skilled operator could make the VODER speak in any language, and even moo like a cow or grunt like a pig.