The power to create life has long been considered the domain of the gods. Yet since the early days of civilization, humanity has yearned to tap into this power. From magic, to Frankenstein, to ever more lifelike robots and AI, the attempt to create someone, or at least something, like us persists throughout history and literature. Have we finally achieved this goal with technology? Does humanity now possess this ability? Or is it still a fantasy?
In the spirit of April (Fools), let us see how far along AI is on the path of “fooling” humans.
Starting with the Turing Test
In 1950, Alan Turing published what is now widely considered one of the most influential papers of its era. In it, he discussed the possibility of creating a machine with real intelligence. Because he realized it was difficult to define “intelligence”, he proposed the now-famous Turing Test: if a machine can communicate with a human being (via some device) without the human realizing it is a machine, then the machine is deemed to possess intelligence. This simple operational definition of intelligence enabled Turing to convince the world that a “machine capable of thought” is possible. In his paper, Turing also responded to the most common objections raised against his hypothesis. As a result, the Turing Test became the first philosophical hypothesis of Artificial Intelligence.
The Turing Test: take test subject A (a machine) and test subject B (a person). Let a third person C ask the two subjects a series of questions in a language both understand. If, after several rounds of questioning, C cannot distinguish any material difference between A and B, then machine A passes the Turing Test.
But in recent years, the effectiveness of the Turing Test has come into doubt. The problem? It is too easy to cheat: all the machine needs to do is lie or feign ignorance, and it can pass. Just as Turing predicted, the key to passing the test is to avoid answering the question. For example, if a judge asks the machine “Do you have feelings?”, the machine must lie to pass. And this deception is not an exception but the norm: the only way to pass the Turing Test is through deception.
Therefore, researchers have proposed numerous improved versions of the Turing Test, or have devised their own testing methods, such as expanding the test to include more types of questions or incorporating tests for creativity. In 2015, an article on the website io9 listed 8 different methods that could replace the classic Turing Test: the Winograd Schema Challenge, which tests logic; Lovelace Test 2.0, which tests creativity; the Construction Challenge, which tests the ability to build structures; and the Visual Turing Test, which tests visual capabilities, among others. More recently, psychologist and cognitive scientist Gary Marcus published an article in Scientific American proposing new ways to test AI, one example being the standardized tests used in schools today.
Software that might have passed the Turing Test
Despite ever-increasing calls to replace it, the Turing Test is, for many, still the holy grail of AI technology. Researchers started claiming their software had passed the Turing Test shortly after the test’s emergence and acceptance.
The earliest example dates to 1966, with a program named ELIZA by the famous computer scientist Joseph Weizenbaum. ELIZA analyzed the user’s input for keywords; once it found one, it generated a response from a pre-defined set of rules. If it found no keywords, it fell back on a generic response based on a previous input. Weizenbaum also modeled ELIZA on a Rogerian therapy method, allowing it to freely pretend that it knew nothing about the world. Using these techniques and (in a sense) deception, ELIZA succeeded in convincing some people that they were conversing with a real person. Some even said “it is hard to believe that ELIZA is not a human”. Thus, ELIZA is probably the first software program to pass the Turing Test.
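ELIZA’s actual rules were written in Weizenbaum’s own script notation (the famous DOCTOR script); purely as an illustration of the keyword-and-template mechanism described above, the idea can be sketched in a few lines of Python. The patterns and responses below are invented for illustration, not Weizenbaum’s:

```python
import re

# Invented keyword rules: (regex with optional capture, response template).
# A real ELIZA script had ranked keywords and many reassembly rules per key.
RULES = [
    (r"\bI am (.*)", "Why do you say you are {0}?"),
    (r"\bmother\b", "Tell me more about your family."),
    (r"\bI feel (.*)", "What makes you feel {0}?"),
]
FALLBACK = "Please go on."  # generic deflection when no keyword matches

def respond(user_input):
    """Scan for the first matching keyword rule; otherwise deflect."""
    for pattern, template in RULES:
        m = re.search(pattern, user_input, re.IGNORECASE)
        if m:
            return template.format(*m.groups())
    return FALLBACK
```

Note how the fallback line deflects rather than answers; this is precisely the “avoid the question” strategy that makes such programs feel human in short exchanges.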
Conversing with ELIZA
In 1972, American psychiatrist Kenneth Colby, who was dedicated to applying computer science and AI to psychiatry, created a program named PARRY, nicknamed “ELIZA with attitude”. Technically, PARRY was very similar to Weizenbaum’s ELIZA, except that it imitated the behavior of a person with paranoid schizophrenia. For its Turing Test, a team of experienced psychiatrists talked to PARRY and to a real patient through a teletype. The transcripts were also shown to 33 other psychiatrists. Both groups were asked to determine which patient was human and which was a machine. Their responses were 48% accurate, equivalent to guessing at random. Hence, PARRY passed the Turing Test as well, and it could be considered the founder of the “act crazy” method of fooling humans.
After entering the 21st century, more advanced technology and more powerful devices brought forth even more powerful ways to fool humans.
In 2014, the University of Reading, which hosted Turing Test 2014, announced that Vladimir Veselov’s AI program, Eugene Goostman, had passed the Turing Test. Many consider this a milestone event. Eugene Goostman was a program that imitated a 13-year-old boy. That year, 5 programs, including Eugene Goostman, competed, and Eugene Goostman convinced 33% of the judges that it was human. The pass threshold for the Turing Test is 30%.
In 2016, some researchers even claimed to have discovered a trick that would allow any program to pass the Turing Test: keeping silent. Coventry University researchers Kevin Warwick and Huma Shah discussed their findings in the paper “Taking the fifth amendment in Turing’s imitation game”. They found that if a machine “took the fifth”, i.e., remained silent throughout a Turing Test, it could implicitly pass the test, since it could still be considered a being with its own thoughts.
But are these “intelligent programs” truly intelligent? We don’t think so. As Synced has summarized before, most of the programs that have passed the Turing Test did so using 4 types of deception:
- Using short phrases or sentences to give a general answer.
- Analyzing the judge’s cultural background to provide relatable answers.
- Responding to questions with counter-questions. For example, if a judge asks “Are you from Russia?”, the counter-question strategy would respond with “Why are you not sure I’m from Russia?” This method of challenging the judge often yields great results. The counter-questioning strategy was originally a therapy technique in psychiatric counseling, but has since been widely adopted in Turing Test games.
- Exploiting the fact that when a human’s responses are too specific, they are often mistaken for a machine’s. Participants induce the judge to ask for specific details, then provide a general answer, so that the real person’s answer appears more “machine-like”.
AI’s ability to fool humans is getting stronger
However, beyond the aforementioned methods of cheating the Turing Test, AI technologies have made real and substantial advances along the path of fooling humans. With the rise of deep learning and its many algorithms, computers have made astounding achievements in creating fakes that are convincingly real, sometimes even surpassing the ability of human counterfeiters: not only can machines imitate well, they can do so at surprising speed. Below, we introduce some domains where advances in AI have created outputs indistinguishable from those of humans. Just for fun, we will also throw in some examples to test whether you can distinguish the “real” from the “fake”. The answers are posted at the end of this article. How many will you guess correctly?
1. Who’s Talking?
If you still think robotic speech is “mechanical and cold”, then you have probably never heard Google Assistant speak. Its crisp vocals sound as if they came from a news announcer. But Google is not the only company with breakthrough voice synthesis technology. Last September, DeepMind announced that their deep generative model WaveNet reduced the perceived gap between synthesized and human voices by over 50%. This year, a team led by Yoshua Bengio introduced a new end-to-end voice synthesis technique called Char2Wav, able to generate audio directly from text. More recently, Baidu proposed Deep Voice, a fast, high-quality text-to-speech system built entirely on deep neural networks. The newest technology in this field, Google’s Tacotron, takes character inputs and outputs the corresponding spectrogram, from which audio is generated through the Griffin-Lim reconstruction algorithm, achieving a breakthrough in generation speed.
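The Griffin-Lim step mentioned above is simple enough to sketch: it recovers a waveform from a magnitude-only spectrogram by repeatedly converting between the time and frequency domains, keeping the target magnitudes fixed and re-estimating the phase on each pass. Below is a minimal NumPy sketch with a toy STFT; this is an illustration of the algorithm’s idea, not Tacotron’s actual pipeline, and far simpler than production implementations:

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Toy short-time Fourier transform: Hann-windowed, 75% overlap."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft + 1, hop)]
    return np.array([np.fft.rfft(f) for f in frames])

def istft(S, n_fft=256, hop=64):
    """Inverse toy STFT via windowed overlap-add."""
    win = np.hanning(n_fft)
    n = hop * (len(S) - 1) + n_fft
    x, norm = np.zeros(n), np.zeros(n)
    for i, spec in enumerate(S):
        x[i * hop:i * hop + n_fft] += np.fft.irfft(spec, n_fft) * win
        norm[i * hop:i * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-8)

def griffin_lim(mag, n_iter=100, n_fft=256, hop=64):
    """Alternate between domains: keep the target magnitudes,
    re-estimate the phase from the resynthesized waveform each round."""
    phase = np.ones_like(mag, dtype=complex)  # start with zero phase
    for _ in range(n_iter):
        x = istft(mag * phase, n_fft, hop)
        phase = np.exp(1j * np.angle(stft(x, n_fft, hop)))
    return istft(mag * phase, n_fft, hop)
```

Because the phase is discarded and re-estimated, the reconstruction is approximate; that approximation, plus the cost of running many iterations, is part of why later systems moved to neural vocoders.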
Aside from voice synthesis, machines are also hard at work learning to synthesize other types of sound. Last June, researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) published a novel sound simulation algorithm and claimed it is realistic enough to fool humans. In the abstract of their paper, they wrote: “We present an algorithm that synthesizes sound from silent videos of people hitting and scratching objects with a drumstick. This algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure.” In testing, twice as many people chose the simulated sound over the real one, especially for materials such as leaves and dirt, whose sounds are less distinct. The researchers have truly achieved “more realistic than real”. Have a listen yourself.
2. Listen, who’s singing?
AI can also create music! Last May, Google Research scientist Douglas Eck introduced his team’s latest efforts in training AI composing assistants to the music fans at Moogfest: software that can generate chords, riffs, and melodies in songs. He believes that one day machines will compose songs entirely by themselves. This is Google’s Project Magenta, which also won the award for Best Demo at NIPS 2016. This year, from April 18 to 21, Google will take Magenta back to Moogfest. Those who are interested can follow them at http://suo.im/2bFF2y
Can you tell whether this piece of music is generated by Magenta or not?
A notable mention goes to Hang Chu, a Ph.D. student at the University of Toronto. Last Christmas, he published a song composed entirely by AI software looking at a Christmas tree, along with a music video featuring a dancing stick figure and lyrics written and vocalized by the software as well. He did this by first creating a hierarchical RNN model and feeding it large amounts of musical data to analyze and learn the general structure of music. After summarizing the common features of songs with similar melodies, he created a multi-layer neural network model through a novel framework. The end product could generate music whose theme corresponds to an input image; within the multi-layer framework, it could also generate novel dance steps and vocals.
The aforementioned WaveNet and MIT’s algorithm can also be used to generate music.
3. Style Transfer: Counterfeiting the works of master painters
Ever since the image processing app Prisma took off overnight, image style transfer went instantly from leading-edge laboratory technology to an everyday app on everyone’s phone, allowing people without “artistic talent” to create beautiful artwork with the press of a button.
Last October, Google published a paper introducing a simple method that allows a single style transfer convolutional neural network to learn multiple styles. Not only could it be applied to still images, it worked for videos as well. By November, Facebook took it one step further, achieving real-time style transfer on mobile devices.
But with over a decade of history, style transfer is by no means a new concept. The application of neural nets to style transfer, however, is indeed a new approach, initially proposed in the 2015 paper “A Neural Algorithm of Artistic Style”. Beyond transferring style between images, researchers have applied this technique to numerous other fields, such as font design. Last November, Synced published an article on Flipboard software engineer Yuchen Tian’s project Rewrite, in which he used neural nets to design novel fonts for Chinese characters.
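The core observation of that 2015 paper is that “style” can be summarized by the Gram matrix of a convolutional layer’s feature maps, i.e., the correlations between channels, which discard spatial layout. Here is a toy NumPy sketch of that style loss; real implementations compare VGG feature maps at several layers and optimize the generated image by gradient descent, neither of which is shown here:

```python
import numpy as np

def gram_matrix(features):
    """features: (channels, height, width) feature maps from a conv layer.
    Channel-wise correlations summarize texture ('style') while
    discarding where in the image each feature occurred."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

def style_loss(gen_feats, style_feats):
    """Mean squared difference between the two Gram matrices."""
    g1, g2 = gram_matrix(gen_feats), gram_matrix(style_feats)
    return np.mean((g1 - g2) ** 2)
```

Because the Gram matrix sums over all spatial positions, shuffling the pixels of the style features identically across channels leaves the loss unchanged, which is why it captures texture rather than composition.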
Below are two images of the White House: one is a painting, the other a photograph processed by Prisma. Can you tell which one is which?
4. Machine Poets
Using machines to write poetry is not a novel idea. The famous Chinese sci-fi author Cixin Liu (“The Three-Body Problem”) once programmed an “electronic poet” capable of writing “modern poetry”. But its size was less than 1 MB, so it probably did not use neural nets, and its creations were merely properly structured, yet lacking in theme. Now, with NLP technologies helping AI analyze and understand language, AI is getting better and better at poetry. A while ago, Baidu NLP published a special article on Synced introducing its research results on poetry generation.
However, even people consider poetry a difficult art, let alone machines. Compared to people, machines do have certain natural advantages, such as an effectively infinite word and character database, and they can easily handle the rhythm and rhyme required for proper poetic flow. But real poetry has “soul”: it conveys the thoughts and ideas of the poet. This is what machine-generated poems lack: control. It is difficult for a machine to create a poem focused on one central idea. Still, can machines convey an idea to humans through poetry, and let people feel its soul?
For our readers well versed in Chinese, one of the poems below is written by Ge Shaoti from the Song Dynasty, and the other generated by AI. Can you tell which one is which?
Artificial Intelligence is going further and further down the path of fooling humans. From voice synthesis, to image synthesis, to simulated human conversation, AI is approaching or even surpassing human capabilities in many areas. Companion devices and chatbots that can replace humans to a certain extent are starting to emerge. Combined with advances in virtual reality, augmented reality, and mixed reality, future AI technologies may not be limited to imitating human voices and images, but may create a realistic world on a large scale – provided our world is real in the first place.
Finally, the answers you’ve all been waiting for:
● Audio 1 is actually a clip from “My Town” by the band Buck O’Nine, not the works of a machine. For those interested in Magenta’s creation, you can find it in the related readings.
● Image 1 is the work of artist Hall Gorat II.
● Image 2 is generated by Prisma’s Caribbean style.
● Poem 1 (left) is machine generated.
● Poem 2 (right) is written by Ge Shaoti of the Song Dynasty.
Header image: crowds impressed by the VODER, demonstrated at the 1939 New York World’s Fair, the first attempt to electronically synthesize human speech.
Author: Wu Pan | Localized by Synced Global Team: Xiang Chen