Why an AI firm is busy smashing thousands of windows

In a sound-proofed hangar on an RAF airbase just north of Cambridge, UK, Chris Mitchell and his colleagues are busy using sledgehammers to teach their computers a lesson.

The team has gathered thousands of window panes and doors, all of different shapes and sizes, which they then smash, one by one, recording the distinctive shattering sound of each type of glass. Sometimes they swing sledgehammers or garden spades, sometimes they throw bricks. “We completely underestimated the mess it would make,” says Mitchell. “And how tiring it would be.”

Welcome to the latest frontier of artificial intelligence. Mitchell is CEO and founder of Audio Analytic, a Cambridge-based start-up that is training a machine learning system to recognise the sound of breaking glass.

And it’s not just glass: the company is also teaching computers to pick out other sounds that are important to humans, like smoke alarms, bawling babies and barking dogs. The idea is to build this ability to recognise sounds – without confusing a dropped glass with a smashed window, say – into smart-home systems that will alert you when an intruder breaks in or your child starts to cry.

In the last few years, computers have become very good at understanding the world by sight. AIs are now better than humans at recognising certain objects, especially faces. But apart from speech recognition – which is at the heart of services like Apple’s Siri, Google Home and Amazon’s Alexa – highly accurate sound recognition has been given little attention. Everyday noises are just background din to most machines.

Mitchell wants to change that. “What we’re working on is a new field of AI that we call artificial audio intelligence,” says Mitchell. “It’s not something that has been tackled before in any meaningful sense.”

Audio Analytic is part of a new wave of companies training machine learning systems to spot patterns in sounds. Uberchord, based in Berlin, is developing an AI that can help people learn to play guitar. It listens to you strum and tells you when you have your fingering wrong. Uberchord is one of several AI companies working with sound that Abbey Road Studios – one-time recording home of the Beatles – is investing in.

Another company, Cambridge Consultants, has taught an AI to recognise different genres of piano music, like ragtime or baroque. The system, called Aficionado, was trained on just a few hundred hours of piano playing, including both professional recordings and amateur practice videos taken from YouTube. The training data was deliberately patchy, says Monty Barlow at Cambridge Consultants. “We were challenging the AI to handle the near infinite complexity of live music.”

Aficionado’s musical chops are not just for show, however. Training the system on music – and getting it to ignore irrelevant factors such as tempo, volume or tone – turns out to be a good way to teach it to spot patterns in complex data in general, whatever it represents. Aficionado’s first task will be to identify faults in telecommunications networks.

But Audio Analytic has bigger ambitions. "We want to create a taxonomy of all sounds, and that is a huge undertaking," says Mitchell. So far, the company’s software can identify breaking windows, crying babies and smoke alarms. At the Consumer Electronics Show in Las Vegas last week, they added barking dogs to their repertoire.

They are also working on an anomaly detector, which will pick up sounds that seem out of the ordinary – a change from the normal background hubbub – like the clatter of someone falling over or the hiss of a leaking water pipe. Eventually, they want to add car alarms and perhaps, for the US market, gunshots. Audio Analytic then plans to license these sound-recognition systems to makers of smart-home gadgets.

The ability to recognise different sounds matters, says Nina Bhatia, managing director of Hive, a UK-based smart thermostat and lighting company. "It is fast becoming absolutely vital for smart home technology to detect and interpret a wide range of ambient sounds, so people can respond easily and quickly to what's going on in their homes when they're not there," she says. "You could be alerted if your smoke alarm was going off while you're in a meeting at work, and not just when you're on your sofa."

As well as sending an alert to your phone, such systems could also take actions by themselves. A smashed window could make the lights turn on. A baby’s cries could turn on a nightlight and make a lullaby play from a nearby speaker.

Indeed, Chinese electronics firm Sengled is using Audio Analytic’s technology in a smart lamp with a speaker built into its base. Other smart-home firms are building it into their devices too, including thermostats, which are often installed in a central position in a home and so are well placed for eavesdropping.

The hard part is making sure the AI correctly identifies what it hears, because false alerts could cause havoc. Yet machine learning systems are only as good as the examples they are trained on. As Mitchell puts it: "AI is bloody useless unless you have data."

Getting that data is hard work. “We smashed glass for weeks and weeks,” says Mitchell. “Some of these windows were full floor-to-ceiling shopfront ones. Smash those and they have a chance of taking your foot or leg off as the glass comes down.”

To get enough recordings of crying babies, the firm worked with parents’ groups in Cambridge. To catalogue what they were recording, they then had to come up with their own lexicon to describe the different types of crying, says Mitchell. "For example, there’s a very raspy one that seems to come from the back of the throat that we called the ‘vocal cry’."

Dogs were somewhat easier. Working with vets, they tracked down as many different breeds as they could and introduced their AI to barks from tiny Pekinese up to sofa-sized Great Danes.

To teach their system what a smoke alarm sounds like, Audio Analytic simply bought as many different models as they could online. Hundreds are now stacked on shelves in their offices. At first their AI had trouble telling the beeps of a smoke alarm from other household bleeps, such as ringing phones, alarm clocks and oven timers. So they trained it to focus not only on the alarm’s pitch and duration but on the signature gap between the beeps.

But no matter how many windows you smash or smoke alarms you set off, there will always be surprises down the line. There is a parrot species that does an uncanny impression of a smoke alarm. So Audio Analytic has had to teach its system to ignore this feathery false alarm.

Another sound they want to teach their system to look out for is the pitch and intonation changes of aggressive human shouts – somebody threatening violence, say. This doesn’t vary much with language or culture, says Mitchell. Distinctive changes in vocal sounds come when adrenalin floods the body and affects the voice box.

Audio Analytic has had to put this one on hold, however. They found that the sounds of chickens and chainsaws in a neighbourhood would also trigger their aggression detector.

It’s a noisy world out there – but AIs are starting to listen.
