AI Communication and Chatbots: Ten Words That Expose the Machine
Imagine you and an intelligent robot are both before a judge who cannot see you. The judge will guess which of you is the human, and so will live, while the other will die. Both you and the robot want to live. The judge is fair and smart. The judge says: “You must each give me one word from an English dictionary. Based on this word, I will guess who is the human.”
What one word do you choose?
Would it be some lofty spiritual concept like “soul”? Something that reflects your own tastes, like “music”? Or a base bodily function, like “fart”?
This simple thought experiment may seem fanciful, but some cognitive scientists believe that its consideration can help to illuminate our basic assumptions about artificial intelligence while also revealing some surprising insights about our own minds.
After all, automated ‘chat bots’ and language-generating machines increasingly employ artificial intelligence to hold conversations with us or write reams of text that we encounter on a daily basis. How can we tell whether the customer service representative we are chatting to online, for example, is a real person or a chirpy algorithm? Or whether a fictional story was churned out by a machine rather than lovingly crafted by a human writer? Communicative AI is no longer a purely theoretical prospect and we need to be prepared to deal with it.
John McCoy, one of the researchers behind the study, which was carried out at the Massachusetts Institute of Technology, says he was initially inspired by a casual conversation with his colleagues. They were discussing the Turing Test, first proposed by British scientist Alan Turing in 1950, which aims to measure whether a machine’s intelligent behaviour is indistinguishable from that of a human.
In the most common formulation, each judge is given a standard chat interface. In each trial, they may be talking to a real human, or a computer chatbot powered by artificial intelligence – and the judge’s job is to guess which it is. If the chatbot manages to fool a pre-determined number of judges, it has passed the Turing Test.
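The pass rule described above can be written down in a few lines. This is only a minimal sketch: the function name, the list of verdicts and the threshold of three are all hypothetical, since the article does not say how many judges a chatbot has to fool.

```python
def passes_turing_test(verdicts, required_fooled):
    """Return True if enough judges mistook the chatbot for a human.

    verdicts: list of booleans, one per judge; True means that judge
              guessed "human" while actually talking to the chatbot.
    required_fooled: the pre-determined number of judges that must be
              fooled (an assumed parameter, not a figure from the article).
    """
    fooled = sum(1 for verdict in verdicts if verdict)
    return fooled >= required_fooled


# Hypothetical trial: five judges, three of them fooled.
example_verdicts = [True, False, True, True, False]
print(passes_turing_test(example_verdicts, required_fooled=3))
```

The threshold is passed in as an argument because it is set before the test starts rather than being fixed by the test itself.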
“We wondered what would be the minimal version of the Turing Test that one could come up with,” explains McCoy, before speculating whether it could even be captured in a single word. “Then the question was, what were the words that people would actually say?” It was this question that would ultimately inspire a research paper, published this year in the Journal of Experimental Social Psychology.
In the first experiment, McCoy and his colleague, Tomer Ullman, asked more than 1,000 participants to answer the question above and then analysed the words they produced to find any common patterns.
The top 10 words, in order of popularity, were:
● Love (134 responses)
● Compassion (33)
● Human (30)
● Please (25)
● Mercy (18)
● Empathy (17)
● Emotion (14)
● Robot (13)
● Humanity (11)
● Alive (9)
“It was striking how much convergence there was between people,” says McCoy, who is now at the University of Pennsylvania. “They can choose any word they like from a standard English dictionary and yet there’s huge convergence across individuals.”
Consider the word “love” – around 10% of participants chose this word over all of the other hundreds of thousands of possibilities; overall, a quarter of all the participants chose one of the top four words.
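As a rough check on those proportions, the published counts can be turned into percentages. The article gives only "more than 1,000" participants, so the sketch below assumes a round 1,000 purely for illustration. With a larger real sample, the true shares can only come out lower than the figures it prints.

```python
# Response counts for the ten most popular words, as listed in the article.
top_ten = {
    "love": 134, "compassion": 33, "human": 30, "please": 25, "mercy": 18,
    "empathy": 17, "emotion": 14, "robot": 13, "humanity": 11, "alive": 9,
}

# ASSUMPTION: the exact sample size is not given ("more than 1,000"),
# so 1,000 is used here as a lower bound on the number of participants.
ASSUMED_PARTICIPANTS = 1000


def share_percent(count, total=ASSUMED_PARTICIPANTS):
    """Percentage of participants that a response count represents."""
    return 100 * count / total


love_share = share_percent(top_ten["love"])

# Sum the four largest counts (love, compassion, human, please).
top_four_total = sum(sorted(top_ten.values(), reverse=True)[:4])
top_four_share = share_percent(top_four_total)

print(f"'love': at most {love_share:.1f}% of participants")
print(f"top four words: at most {top_four_share:.1f}% of participants")
```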
In terms of the general themes, they found that words conveying bodily functions (such as “poop”), faith and forgiveness (such as “mercy” or “hope”), emotion (such as “empathy”) and food (such as “banana”) were the most popular categories.
McCoy and Ullman then performed a second experiment to see how other people would respond to the words generated in the first experiment. Were the most popular items really as successful at conveying a sense of humanity as the original participants had suspected? And if so, which were best?
To find out, the researchers paired the most popular words together in various combinations (such as “human” and “love”) and asked another group of participants to determine which of the two was more likely to have been generated by a human and which by a computer.
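One simple way to turn paired judgments like these into a ranking is to count how often each word was picked as the human one, relative to how often it appeared. The sketch below uses that win-rate approach with invented placeholder words and outcomes. It is an illustration only, not the researchers' actual analysis or data.

```python
from collections import Counter


def rank_by_win_rate(judgments):
    """Rank words by how often they were judged the human-generated one.

    judgments: iterable of (word_a, word_b, chosen) tuples, where `chosen`
               is whichever of the two words the judge attributed to the human.
    Returns a list of (word, win_rate) pairs, highest win rate first.
    """
    wins = Counter()
    appearances = Counter()
    for word_a, word_b, chosen in judgments:
        if chosen not in (word_a, word_b):
            raise ValueError(f"{chosen!r} is not one of the paired words")
        appearances[word_a] += 1
        appearances[word_b] += 1
        wins[chosen] += 1
    rates = {word: wins[word] / appearances[word] for word in appearances}
    # Sort by win rate (descending), breaking ties alphabetically.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


# HYPOTHETICAL judgments with placeholder words; not results from the study.
toy_judgments = [
    ("word_a", "word_b", "word_a"),
    ("word_a", "word_c", "word_a"),
    ("word_b", "word_c", "word_c"),
    ("word_a", "word_b", "word_b"),
]

for word, rate in rank_by_win_rate(toy_judgments):
    print(f"{word}: judged human in {rate:.0%} of its pairings")
```

A plain win rate ignores the strength of each opponent. A model that accounts for it could rank the words differently.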
As in the first study, “love” turned out to be one of the most successful words. But of the choices available, the highest-ranking word was “poop”. It may seem surprising that faeces turns out to be a human shibboleth, but the results suggest that knowingly flouting a taboo and provoking, rather than simply describing, an emotion might be the most straightforward way of conveying your shared humanity. Other, more colourful, terms could also spring to mind.
Some of the other words seen as uniquely human evoked similarly strong emotional responses that went beyond their dictionary definition. “Moist”, for example, or “please”. Others are just enjoyable to say. Try rolling “onomatopoeia” around your mouth a few times.
This might be a fair reflection of the current state of AI. While bots can now write basic descriptive sentences and even intelligible short stories, they still struggle with humour and sarcasm. Humour, after all, requires a deep understanding of context and the many cultural associations that are embedded in each word.
Besides these whimsical speculations, McCoy suspects that this experiment could prove a useful tool to understand people’s implicit assumptions about other groups of humans. What one word would you choose to prove you are a woman, for instance? Or to prove you are French, or a socialist? In each case, the choices should reveal the qualities that we assume all group members recognise within themselves, but that may be misunderstood or ignored by outsiders.
In the meantime, McCoy has found that the Minimal Turing Test is a useful provocation for further debate about the nature of AI. “It’s been fun to ask eminent psychologists this question, to see them think really, really hard and for them to come back hours later to excitedly change their answer,” McCoy says. “This very simple question just gets you thinking deep thoughts about the human versus the computer and how they communicate.”
His own favourite was deceptively simple. “One of the words I liked was ‘err…’ – that was clever,” McCoy says.
In general, though, it is worth remembering that if you ever do need to prove yourself as a human in a world increasingly run on machines, be crude, and be funny.