In Conversation with OpenAI's Jack Clark: China Is a Leader in Artificial Intelligence

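Introduction: Eye on A.I. is a biweekly podcast hosted by veteran New York Times correspondent Craig S. Smith. In each episode, Craig talks with influential people in the field, following new developments in machine intelligence in their broader context and considering what those developments imply.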

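Synced (机器之心) is the Chinese-language partner for this series of conversations. What follows is the first installment, in which Craig Smith and Jack Clark discuss the global development of AI.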

Hi, this is Craig Smith with a new podcast about artificial intelligence. I’ll be talking to Jack Clark, author of the popular Import AI newsletter, about what he has learned during the previous week and why it is significant. Jack is a veteran of the British technology journal The Register and of Bloomberg News. He now works on policy and communications for OpenAI, the nonprofit artificial intelligence research company co-founded by Elon Musk. So think of this podcast as a review of what’s happening in the world of AI, curated by one of its keenest observers.

CRAIG: I wanted to start by talking about the bottom of the newsletter this week, in which you talk about what was said at the World Economic Forum in Davos. The thing that made a lot of headlines, and certainly jumped out at me, was [01:09] Google CEO Sundar Pichai’s comment that the rise of AI is more important than the discovery of electricity or even fire. And you see a lot of statements like that. It surprised me seeing that come from Pichai, although I don’t follow him closely. Is that overstating it, in your view?

JACK: It is not overstating it. And I think it’s pretty commendable that he’s talking about this head on, because a lot of executives have been trying to say that this technology has the same attributes as previous disruptive technologies like the advent of the smartphone. This is of a fundamentally different nature. And though when we hear people talk about electricity and fire, we think that’s incredibly overblown, I believe that if you think of how AI is going to play out over the next, not just few years, but few decades, we’re going to look back on it with something equivalent to that. From a historical perspective, it will suddenly look like you built computers and very soon after, the computers gained human-like capabilities. And soon after that, as we’ve seen with things like AlphaGo, the computers quickly surpassed human experience. And that’s very unusual and I think significant.

CRAIG: Yes, I agree. I guess I'm a bit of a skeptic on the timeline. I can see super intelligence and the intelligence explosion happening at some point in human history. I just am not optimistic it’s this century. But we will…


JACK: Well, it does not need to be super intelligent. I'm going to push back on that point with you now, which is that, sure, super intelligence, yes, that’s going to be obviously a big deal if it happens. But we don’t need for that to happen for this to be huge.


Think about exactly how much the world was changed by the advent of the database or the spreadsheet. The ability to just have computers organize rote textual information is one of the forces that led to globalization and the information revolution. Now we have capabilities that let computers access basic human senses, like an approximation of vision, an approximation of hearing and they’re going to live in a world which is actually built for humans to see and hear things. 


So, just try to imagine exactly how much information is becoming available right now to computers. And then think about what happened over the last 30 years with spreadsheets or databases, and I’d say to you that we don’t need this stuff to be super intelligent for it to change the world far more than the digital revolution has to date. And that to me is kind of mind-boggling.

CRAIG: Yes, that much I certainly agree with. He talked about the need for global multilateral frameworks, and then Theresa May put her country’s flag in the ground as a leader in developing the ethics of AI.

Is it overoptimistic to think that [04:33] we’ll be able to come up with a multilateral framework that countries will adhere to, given how unsuccessful we’ve been with nuclear energy technology or nuclear weapons technology?


JACK: You and I are having this conversation and neither of us can point to a place on the earth which has had a nuclear weapon used in anger since World War II. Again, I'm going to say to you, we built institutions that worked here, because we aren’t talking about a tragedy after Hiroshima and Nagasaki. There is huge proof that the international governance system that we built around nuclear weapons works. It has its problems. You and I are speaking in 2018 and it’s a time when, I think, the clock to midnight on how far away we are from nuclear annihilation has ticked up to two minutes to midnight, which is the closest it’s been in a while. But nonetheless, we persist.

So, no, I don’t think it’s overoptimistic. I do think it is optimistic, and it’s optimistic especially as someone coming from the UK, which is unmooring itself from Europe and therefore from power blocs. I do think it’s realistic to think that we’re going to see countries collaborate on various AI standards. And I think it’s more than likely that you’re going to see norms develop in the international community around AI and how it intersects with the military.

Other than a few bad actors, we rarely use things like chemical weapons and cluster bombs these days thanks to international norms. And yes, there are exceptions, but we know that we can do this stuff, it’s just immensely difficult. But that doesn’t mean we should shy away from the problem. 


CRAIG: Yes. I guess the thing that’s been said about AI is that, unlike a cluster bomb, it’s much more difficult to detect when it’s being used. As far as Theresa May’s claim that the UK will be a leader in AI, from my point of view, I spent much of my career in China and I see China as the clear leader going forward, simply because they benefit from a centrally commanded economy and they have a huge state sector that doesn’t worry about cost. They’ve got the largest population and are spinning out the most engineers.

Do you think that countries like the UK have a chance to play a leadership role? And do you think that countries like China will engage in the kind of frameworks that the West is going to want to control this stuff?


JACK: You’re handing me a very complicated question there. I'm going to split this open into a couple of things. Do I think countries like the UK can have a role here? I'm skeptical. I'm skeptical because we know that AI requires you to be at the pinnacle of scientific and technological development from a national perspective. So, though the UK has some AI advantages, I'd argue a huge amount of that is because of investment into the UK from American companies: Facebook, Google and Microsoft. The startup ecosystem in the UK is somewhat contingent on that.

So, Prime Minister May is giving the impression that this is entirely domestically developed, but if you look at the enabling factors, it actually comes from somewhere else where there’s a larger amount of money swilling around and a proportionally larger scientific and technological culture.


Now, do I think that larger countries like the US or China or India or Russia, even, by virtue of some of its specific investments, can have a role in international AI? Yes, that seems likely. I think that we see the emergence of what one could term power blocs here.

Now to your question of norms: will these various power blocs respect the norms which the majority tries to set? Mostly, no. Mostly, people are going to do what they’ve always done, which is try to grow their own economies and compete in the international game of geopolitical brinksmanship. But, and here’s the difference, and you alluded to this earlier, AI has the potential for great harm, and so there are incentives here for collaboration around certain norms relating to how one wields and uses AI. Even if you have radically different ideological stances towards how you see yourself competing on the global stage, you have a pretty big incentive to want to be able to have certain confidences about your competitors or opponents and how they’ll use this powerful technology.

So, I do think that countries like China as you mentioned will have an incentive to conform to some norms. I do think that those norms are going to be relatively few in number though. 


CRAIG: Yes, we talked about this before, the question of whether or not China is publishing everything that they do in open forums. My guess is that there’s a lot - and I think you agreed - that there’s a lot going on in China that no one sees. That’s certainly of some concern. 


[10:32] Something else you write about in the newsletter is a student in Florida who used computer vision and deep learning to create a live overview of a popular multiplayer online video game, League of Legends. What struck me about that is the relative ease and low expense, both in dollars and computational time, with which this guy managed to come up with a solution. 


It also highlighted to me how much data there is out there to play with if you know how to scrape it, because he was using just videos of gameplay. Beyond that I'm not sure I can see the usefulness of what he’s done except for that world of gaming. But that struck me, that this student in Florida with very little money and with publicly available datasets came up with a solution. I would guess we’re going to start seeing more and more of that.


JACK: Yes, I think you’ve hit on a really interesting point which is, why is it interesting? Well, it’s not interesting that they were able to build a cool doohickey for an eSport, for a computer game that you and most people will not have heard of. 


What is cool is that this doohickey allows them to access information about the game which the company could expose to developers through the game’s application programming interface but has chosen not to. So, in a sense, the significance of this is they were simply able to point this powerful technology at a bunch of data that they were able to scrape from the game. In doing so, the deep learning algorithms essentially improvised the underlying API which the company didn't want the person to have access to.

So, when we think about what this means, it really calls into question a lot of stuff about how proprietary software works. Because the assumption with proprietary software is I can sell you some software because you’re going to get some value from it and you’re not going to be able to clone the software I sell you purely from looking at the inputs and outputs of the system. 


What this gives us a flavor of is that, with lots of AI technologies, all I need to know is the inputs and outputs and I can improvise the rest. That calls into question a lot of how we think of IP protection and IP-based businesses working in the modern era.

CRAIG: Yes, you’ve got this student in Florida that’s able to come up with this pretty interesting solution based on open source software and publicly available data. Then you have these powerful tools, like Detectron, that people like Facebook are making publicly available. [13:35] Is the day coming when this sort of democratization of AI happens, when it escapes the big, heavily funded labs and becomes a garage production thing, where people are doing things out of their basements? I mean doing significant things out of their basements.

JACK: No, sadly. I could explain why, or I don’t know if my response is just going to depress you so much you want to stop this endeavor. Would you like me to explain why I think that’s not the case?


CRAIG: I would be fascinated, yes. 


JACK: What you see with these kinds of things is what I term a computational dividend from these large companies. They’ve spent a huge amount of money on electricity bills to develop some system that has a capability. In this case, a research platform that can let me draw bounding boxes around the world. So fine, that represents the basic commoditizing effect of technology R&D. It means that people in garages around the world are now going to be able to access this basic sense. 


In the same way that it’s a commodity currently to be able to point a webcam at the world and have it offer a label as to what it thinks the most prominent thing in the image is, with stuff like Detectron, or stuff like the League of Legends thing we’re speaking about, you can point your camera at something specific and have it produce a load of information about that. Or you can point it at a specific domain and get it to tell you something useful.


But none of those things really represent the cutting edge. The cutting edge is going to exist on far larger computers than any start-up can possibly hope to wield or any person in the garage. It’s going to require research techniques which have not yet appeared in research papers. The difference in capability is going to be profound. 


In AI now, every six months, the world changes. It used to be every few years, the world changes. And before that it used to be every decade the world changes. The technological epochs are multiplying and the intervals between them are reducing. So, the competitive moat that people like Facebook and others have is getting deeper over time because they’re able to wield larger and larger models. I’ll give you a very, sort of, tangible example of this. 


I as a start-up can have Detectron. I as a start-up can have a residual network, or a highway network, or some kind of advanced deep neural network system. I can even have the data. It’s still going to take me a period of time to train a model based off of my own data to do something useful. The training time is going to be conditioned on the amount of computers I can access. I don’t know how large your garage is, but it’s definitely not as big as a football field I'd wager. Whereas, Facebook has a football field worth of computers. What that means is that when Facebook wants to do something involving R&D, it can do it far faster than anyone in the garage. 

So, if Facebook’s ability to empirically experiment and discover new AI techniques is proportional to the amount of computers Facebook has, it’s very hard to see a world in which start-ups can really easily compete with these AI giants because they simply don’t have a large enough computer to be able to experiment as rapidly as them despite having benefits of the computational dividends from the companies. 

CRAIG: Well, maybe it’s the computational dividends though that will finance or make possible smaller applications that the big guys aren’t interested in pursuing or aren’t spending time thinking about. 


You wrote about DroNet, a joint project between universities in Switzerland and Spain to train drones to fly along city streets. They used publicly available data from self-driving cars and created their own dataset with bicycles. That to me was fascinating and it was, again, something that was created without huge investment and based on open source technology with, I think, a Parrot drone, which is a cheap consumer drone. Isn’t the computational dividend there in these applications that are going to start happening on a more localized level?

JACK: Yes, in the sense that you’re going to see some innovation at the edges. Sure, absolutely. The dataset which they trained that drone from, there were two parts to it. 


One was a bicycle, so they didn't have a good dataset of collisions. So, they got a bicycle, strapped a GoPro or something to it and just simply pedaled with intent towards obstacles and tried to simulate collisions. I was quite glad to read in the paper that they didn't actually just generate tens of thousands of collisions. I would have felt very sorry for the researcher who had to do that. 

So, what they did is they accelerated towards it, deliberately slowed down and then labelled that as something that the drone should avoid. So, sure, they generated some of their own data, but you’ll notice that the really strategic data, which is the actual car driving data, the bit which tells the drones how to steer, how to follow the road, that comes from Udacity. That comes from a dataset generated by the Udacity online education course for self-driving cars.

So, the Udacity online education course for self-driving cars is a course run by Sebastian Thrun, who helped design Google’s self-driving car. Udacity itself is funded by tens of millions of dollars of venture capital. And Sebastian Thrun knew how to create that dataset after spending four or five years working in secret at Google on its self-driving car project. None of that suggests to me that Google is any less ahead as a consequence of this.

In fact, it suggests the opposite, for what we’re seeing here is late-stage research projects made possible by investments made many years ago by companies which are somewhat opaque to us and whose capabilities aren’t super obvious. Yes, it creates a capability that’s interesting, like drones that can do some useful stuff. But imagine how much better that drone would be if it was trained off of Google’s giant internal self-driving car dataset.

I think that can give you an idea of how, say, Waymo knows it can, at the drop of a hat, train a far more capable drone navigation model than the one which appeared in this paper, because rather than having 70,000 images, it’ll have 700 million.

I just want to push on this point, I am struggling to see evidence of the possibility of effective competition in the AI ecosystem. No one has shown me evidence to convince me that start-ups have an easy life here. In fact, all of the evidence I see says the opposite. 


It says that start-ups are either using the dividends, so the lag investment of big companies, or they are competing in territories where we know the big companies have innate advantages and could simply exercise those advantages and run over the start-ups. 


CRAIG: Are cases like DeepLeague and DroNet more examples of bright young engineers who will eventually go to work for Google, Facebook, Uber, Amazon or one of the big guys? 

JACK: Yes. As far as I can work out, this shows us that it’s easy to contribute to AI, it’s easy to develop AI in the open using tools which are being given away to you by these larger companies. But I haven’t seen an AI start-up emerge which can truly show a capability that exists in excess of anything you see from either top tier academic research institutions, or far more frequently, the large companies. 

[22:43] To me AI is going to challenge a lot of our notions of anti-trust and a lot of our notions of how competition works in technology. Because the moats seem to grow deeper over time even though you’re able to release a huge amount of it as open source or in the open via research papers. It’s very paradoxical and it’s going to mean that if you’re a regulator, all of the symptoms are of a healthy market. You have some start-ups, you have lots of open innovation, you have lots of sharing. 


But then when you prod at the fundamentals, look at where the money is coming from, who has the strategic assets needed to do business, like data or compute, or who has the talent, you find that it’s actually just the very large companies which have a vice-like grip on the fundamentals of what we think of as the ingredients of competition.


CRAIG: Yes. The stress is on companies, not on nation states. 


JACK: We can't view this in terms of nation states yet because we don’t really have the manifestation of coherent nation-state research agendas yet.


[24:00] When you look at organizations in countries, in China, you see a very tight loop between government, the private sector and the public sector. None of it is what I'd quite call Chinese research yet. It’s more just that, if you’re trying to catch up with people like Google, Microsoft and Amazon, you play every card you have.


In the case of China, that’s using huge amounts of data, huge amounts of built-in commercial aspects like your companies, and also the ability of government and funding to make things move a bit quicker and accelerate stuff. But there is no research paper that I'd say has a particular Chinese flavor yet. Nor would I say that there’s research I read where I say, “Oh well, that feels like a very Belgian AI paper or German AI paper.”


I don’t think we’re quite there yet. These power blocs are going to manifest, but they’ll manifest under the purview of the implementation of national AI research agendas. Really the only country even doing national AI research is China and it’s only just initiated that. So, we need to wait about three or four years before we can see the fruits of those initial investments and intents on its path.

CRAIG: [25:29] That brings us to PsychLab which I really did get excited about. To me that was the beginning of testing an AI against human intelligence in psychometrics. Can you talk a little bit about that?

JACK: Yes. PsychLab is a deep learning and reinforcement learning testing suite from DeepMind. What PsychLab does is it makes it possible for us to test and evaluate reinforcement learning algorithms for not just their performance but how well they do on human psychological tests. What this means is you can now test an AI algorithm on a cognitive science test and you can also have a human go and run the same test and you can compare their performance.

The most interesting thing to me is that PsychLab let the scientists at DeepMind surface some pretty meaningful data points regarding the relative performance and drawbacks of their algorithms. In one case, the reinforcement learning agent they tested came from a system they call UNREAL, which is one of their best-performing and most highly tuned systems.

They asked it to perform a very basic test in cognitive science, which is to do with looking at different sets of concentric circles and working out which one has the largest amount of concentricity in it. It’s called the shattered glass problem. What they discovered is that the UNREAL agents perform very, very badly relative to humans. They were able to hypothesize that perhaps it performed badly because the way that the gaze or the vision system of these UNREAL agents works is rather too different from how human vision works.

So, what they ended up doing is implementing a vision system which was heavily based on what we call foveal vision, which is how in the center of your eyeball, Craig, you have more receptors relative to the periphery. This is what allows you to focus on objects that are close and to have higher resolution on things at the center of your vision.

With a traditional convolutional neural network, the receptive field is going to be uniform across all of it. What they did is bunch up the receptive field of a neural network in the same way it is bunched in a human eyeball, with foveal vision. What they found is that not only were they able to create an agent that started to pass these tests with closer to human-like performance, but they were able to take that very same agent and place it in a reinforcement learning environment called laser tag, which was not tested anywhere else in this testing suite and is actually much more like you having to play a virtual game of paintball with your friends, or laser tag. They found that suddenly, they had created UNREAL agents which were able to excel in this environment at a level of performance they really hadn’t been able to achieve before.
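
Editor's sketch: the idea of bunching up sampling toward the center of the visual field can be illustrated with a few lines of Python. This is not DeepMind's implementation, whose details are not given in the conversation; the function names `foveal_positions` and `foveate` and the `power` parameter are hypothetical, and the sketch simply subsamples an image on a grid that is dense near the center and sparse at the edges.

```python
def foveal_positions(size, n_samples, power=2.0):
    """Return sorted pixel indices along one axis, packed densely near the center.

    Hypothetical helper for illustration only. `power` > 1 pulls samples
    toward the center; `power` == 1 gives a uniform grid.
    """
    center = (size - 1) / 2.0
    positions = set()
    for i in range(n_samples):
        # u runs uniformly over [-1, 1].
        u = -1.0 + 2.0 * i / (n_samples - 1)
        # Raising |u| to a power > 1 compresses samples toward 0 (the center).
        offset = (abs(u) ** power) * (1 if u >= 0 else -1)
        positions.add(round(center + offset * center))
    return sorted(positions)


def foveate(image, n_samples, power=2.0):
    """Subsample a 2D list-of-lists image on a center-dense grid."""
    rows = foveal_positions(len(image), n_samples, power)
    cols = foveal_positions(len(image[0]), n_samples, power)
    return [[image[r][c] for c in cols] for r in rows]
```

On a 65-pixel axis with 9 samples, the gaps between neighboring sample positions shrink from 14 pixels at the edge to 2 pixels at the center, which mirrors the higher receptor density Jack describes for the fovea.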

So, what that tells you is that by doing this psychological or cognitive neuroscience testing on their agents, they were able to actually find a drawback, fix the drawback and pass the test. Then take that agent out of the standard testing regime of PsychLab and still show good performance, and in some cases, exceptional performance on wholly new tasks that it hadn’t been tested on before.

That to me is just so exciting because it gives us a whole new way of thinking about how we can stress test these algorithms and how scientists can have another tool in their experimentation kit to help them diagnose things about it. 


CRAIG: I was also fascinated by that foveal vision fix, if you want to call it that. One of the things that I found interesting is where the gaps in the abilities of both the UNREAL agent and the human matched in that glass pattern test. Where one dot is white and the other is black, neither the human nor the AI agent could recognize the pattern. To me that was fascinating. So, there’s something in the cognitive processing of that that breaks down in both the human and the artificial agent.

The other thing that really struck me in the analysis: I think in their paper they talked about how convolutional neural networks can process features in parallel on a GPU, but humans are primarily restricted to serial processing of visual data. They asked rhetorically why evolution didn't give humans such an ability, as you would presume that it’s an advantage.

But then they noted that the subjective experience of serial processing feels very much like the essence of thought, because you think in a sequential stream. To me it was fascinating that you can start to identify features in human mental processing that are different from an artificial agent’s processing, and that might explain some of our subjective experience. I don’t know if you picked up on that, but that was fascinating to me.


JACK: Yes, I think if AI continues to develop, we are going to learn very surprising things about our own biases and proclivities. Because we’re going to be able to hold up a cognitive system to a mirror.

When you look in the mirror in the morning or the evening, you see yourself, but you don’t really have a way of seeing a mirror of how your brain thinks you should act in the world. Because there’s no mirror that can simulate another person. We just look at other people and model ourselves on them.


What these AI systems and PsychLab show us is that, as we get more creative and advanced agents, perhaps we can have something that looks like a cognitive mirror where we can look at a different way of thinking and dealing with the same task, and in doing so, learn more about ourselves. That to me is so exciting.

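Deep learning is a branch of machine learning: a family of algorithms that try to build high-level abstractions of data using multiple processing layers that have complex structure or are composed of multiple nonlinear transformations. It is a form of representation learning. Several deep learning architectures, such as convolutional neural networks, deep belief networks and recurrent neural networks, have been applied to computer vision, speech recognition, natural language processing, audio recognition and bioinformatics, with excellent results.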


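Since the first successful demonstrations in the 1980s (Dickmanns & Mysliwetz (1992); Dickmanns & Graefe (1988); Thorpe et al. (1988)), the field of autonomous driving has made great progress. Despite this progress, fully autonomous navigation in arbitrarily complex environments is still thought to be decades away. There are two reasons. First, autonomous systems operating in complex, dynamic environments need AI that can generalize to unpredictable situations and reason in real time. Second, informed decisions require accurate perception, and most existing computer vision systems have an error rate that is unacceptable for autonomous navigation.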




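In academic research, artificial intelligence usually refers to intelligent agents that perceive their environment and take actions to achieve the best possible outcome.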


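Neuroscience, also called neurobiology, is the science that studies the structure, function, development, evolution, genetics, biochemistry, physiology, pharmacology and pathology of the nervous system. The study of behavior and learning also falls within neuroscience. Research on the human brain is interdisciplinary, spanning the molecular level, the cellular level, small groups of neurons, and large-scale neural systems such as the visual system, the brainstem and the cerebral cortex.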


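AlphaGo is an artificial intelligence Go program developed from 2014 by Google DeepMind in London. It was the first computer program to defeat a human professional player and the first to defeat a Go world champion. Technically, AlphaGo combines machine learning with tree search and was trained on a large number of games played by humans and by computers. It uses Monte Carlo tree search (MCTS) guided by a value network and a policy network: the value network predicts the winner of the game, and the policy network selects the next move. Both networks are deep neural networks, and their input is a preprocessed description of the Go board.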






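(Artificial) neural networks are a machine learning model whose origins go back to the 1950s, when researchers conceived the perceptron. Researchers in this area are often called connectionists, because the models are loosely inspired by how the brain works. Neural networks are usually trained by gradient descent using backpropagation. Two major types today are convolutional neural networks (CNNs), which are feedforward, and recurrent neural networks (RNNs), which are not; RNN variants include long short-term memory (LSTM) and gated recurrent units (GRU). Deep learning is a set of techniques, applied mainly to neural networks, that help them achieve better results. Although neural networks are mainly used for supervised learning, there are variants designed for unsupervised learning, such as autoencoders and generative adversarial networks (GANs).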


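A convolutional neural network (CNN) is a feedforward neural network whose artificial neurons respond to units within a local region of the input, and it performs very well on large-scale image processing. A CNN consists of one or more convolutional layers topped by fully connected layers (as in a classic neural network), together with associated weights and pooling layers. This structure lets a CNN exploit the two-dimensional structure of its input. Compared with other deep learning architectures, CNNs can give better results on image and speech recognition. The model can be trained with backpropagation. Compared with other deep feedforward networks, CNNs have fewer parameters to estimate, which makes them an attractive deep learning architecture. Convolutional networks are specialized for data with a known, grid-like topology: for example, time-series data, which can be thought of as a one-dimensional grid sampled at regular time intervals, and image data, which can be thought of as a two-dimensional grid of pixels.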
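
Editor's sketch: the core operation of a convolutional layer can be written in a few lines of plain Python. This is an illustrative, unoptimized version, assuming no padding and a stride of 1; the function name `conv2d` is a choice made here, and, as in most deep learning libraries, the kernel is applied without flipping (cross-correlation).

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (both lists of lists) with stride 1 and no padding.

    Each output value is the sum of elementwise products between the kernel
    and the image patch it currently covers.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# A 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
# A 1x2 kernel that responds to a left-to-right increase in brightness.
edge_kernel = [[-1, 1]]
edges = conv2d(image, edge_kernel)  # each row is [0, 1, 0]
```

The same two kernel weights are reused at every position, which is why a convolutional layer needs far fewer parameters than a fully connected layer over the same input.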


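A drone (also called an uncrewed or unmanned vehicle) is a vehicle that carries no people on board. It is usually controlled by remote control, guidance or autopilot, and can be used for scientific research, military purposes and recreation.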


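Reinforcement learning is a trial-and-error approach whose goal is for a software agent to take reward-maximizing actions in a given environment. When the environment is modeled as a Markov decision process, dynamic programming is a central technique. Popular reinforcement learning methods include adaptive dynamic programming (ADP), temporal-difference (TD) learning, the state-action-reward-state-action (SARSA) algorithm, Q-learning and deep reinforcement learning (for example, deep Q-networks, DQN); applications include board games, robot control and job scheduling.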
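
Editor's sketch: of the methods listed above, tabular Q-learning is the easiest to show in code. The snippet below is a minimal illustration of a single update step; the function name `q_learning_update`, the dictionary-based table, and the default `alpha` (learning rate) and `gamma` (discount factor) values are assumptions made here for the example.

```python
def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one Q-learning update to the table `q` and return the new value.

    `q` maps (state, action) pairs to estimated returns; missing entries
    are treated as 0.0. The update moves the old estimate toward
    reward + gamma * max_a Q(next_state, a).
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


actions = ["left", "right"]
q_table = {}
# First visit: nothing is known about s1 yet, so only the reward counts.
first = q_learning_update(q_table, "s0", "right", 1.0, "s1", actions)
```

With an empty table, the first update gives 0.5, halfway between the old estimate of 0 and the observed reward of 1, because `alpha` is 0.5. SARSA differs only in that it uses the value of the action actually taken in the next state rather than the maximum.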