Now we are getting to the biggest and weirdest risk of AI: a superintelligence emerging and wiping out humanity in pursuit of its own goals. To a lot of people this seems like a totally absurd idea, held only by a tiny fringe who appear weird and borderline culty. It seems so far out there, and also so huge, that most people wind up dismissing it or forgetting about it shortly after hearing it. There is a big similarity here to the climate crisis, where the more extreme views are widely dismissed.
In case you have not encountered the argument yet, let me give a very brief summary (Nick Bostrom has an entire book on the topic and Eliezer Yudkowsky has been blogging about it for two decades, so this will be super compressed by comparison): a superintelligence, when it emerges, will be pursuing its own set of goals. In many imaginable scenarios, humans will be a hindrance rather than a help in accomplishing these goals. And once the superintelligence comes to that conclusion, it will set about removing humans as an obstacle. Since it is a superintelligence, we won’t be able to stop it, and there goes humanity.
Now you might have all sorts of objections here, such as: can’t we just unplug it? Suffice it to say that the people who have been thinking about this for some time have already considered these objections. They are pretty systematic in their thinking (so systematic that the Bostrom book is quite boring to read and I had to push myself to finish it). And in case you are still wondering why we can’t just unplug it: by the time we discover it is a superintelligence, it will have spread itself across many computers and built deep and hard defenses for them. That could happen, for example, by manipulating humans into thinking they are building defenses for a completely different reason.
Now I am not in the camp that says this is guaranteed to happen. Personally I believe there is also a good chance that a superintelligence, upon emerging, could be benevolent. But with existential risk one doesn’t need certainty (the same is true for the other existential risks, such as the climate crisis or an asteroid strike). What matters is that there is a non-zero likelihood. And that is the case for superintelligence, which means we need to proceed with caution. My book The World After Capital is all about how we as humanity can allocate more attention to these kinds of problems and opportunities.
So what are we to do? There is a petition for a six-month research moratorium. Eliezer wrote a piece in Time pleading to shut it all down and to threaten anyone who tries to build it with destruction. I understand the motivation for both of these and am glad that people are ringing loud alarm bells, but neither makes much sense to me. First, we have shown no ability to coordinate globally on other existential threats, including ones that are much more obvious, so why do we think we could succeed here? Second, who wants to give government that much power to control core parts of computing infrastructure, such as the shipment of GPUs?
So what could we do instead? We need to accept that superintelligences will come about faster than we had previously thought and act accordingly. There is no silver bullet, but there are several initiatives that individuals, companies and governments can take that would dramatically improve our odds.
The first and most important are well-funded efforts to create a benign superintelligence. This requires a level of resources that only governments can command easily, although some of the richest people and companies in the world might also be able to make a difference. The key here will be to invert the approach to training that we have taken so far. It is absurd to expect a good outcome when you train a model first on the web corpus and then attempt to constrain it via reinforcement learning from human feedback (RLHF). This is akin to letting a child grow up without any moral guidance along the way and then expecting them to be a well-behaved adult based on occasionally telling them they are doing something wrong. We have to create a large corpus of moral reasoning that can be ingested early and form the core of a superintelligence before exposing it to all the world’s output (I include a rough sketch of what this inverted order could look like after the excerpt below). This is a hard problem, but interestingly we can use some of the models we now have to speed up the creation of such a corpus. Of course a key challenge will be what the corpus should contain. It is for that very reason that in my book The World After Capital I make such a big deal of living and promoting humanism. Here is what I wrote (it’s an entire section from the conclusion, but I think it is worth it):
There’s another reason for urgency in navigating the transition to the Knowledge Age: we find ourselves on the threshold of creating both transhumans and neohumans. ‘Transhumans’ are humans with capabilities enhanced through both genetic modification (for example, via CRISPR gene editing) and digital augmentation (for example, the brain-machine interface Neuralink). ‘Neohumans’ are machines with artificial general intelligence. I’m including them both here, because both can be full-fledged participants in the knowledge loop.
Both transhumans and neohumans may eventually become a form of ‘superintelligence,’ and pose a threat to humanity. The philosopher Nick Bostrom published a book on the subject, and he and other thinkers warn that a superintelligence could have catastrophic results. Rather than rehashing their arguments here, I want to pursue a different line of inquiry: what would a future superintelligence learn about humanist values from our current behavior?
As we have seen, we’re not doing terribly well on the central humanist value of critical inquiry. We’re also not treating other species well, our biggest failing in this area being industrial meat production. Here as with many other problems that humans have created, I believe the best way forward is innovation. I’m excited about lab-grown meat and plant-based meat substitutes. Improving our treatment of other species is an important way in which we can use the attention freed up by automation.
Even more important, however, is our treatment of other humans. This has two components: how we treat each other now, and how we will treat the new humans when they arrive. As for how we treat each other now, we have a long way to go. Many of my proposals are aimed at freeing humans so they can discover and pursue their personal interests and purpose, while existing education and job loop systems stand in opposition to this freedom. In particular we need to construct the Knowledge Age in a way that allows us to overcome, rather than reinforce, our biological differences which have been used as justification for so much existing discrimination and mistreatment. That will be a crucial model for transhuman and neohuman superintelligences, as they will not have our biological constraints.
Finally, how will we treat the new humans? This is a difficult question to answer because it sounds so preposterous. Should machines have human rights? If they are humans, then they clearly should. My approach to what makes humans human—the ability to create and make use of knowledge—would also apply to artificial general intelligence. Does an artificial general intelligence need to have emotions in order to qualify? Does it require consciousness? These are difficult questions to answer but we need to tackle them urgently. Since these new humans will likely share little of our biological hardware, there is no reason to expect that their emotions or consciousness should be similar to ours. As we charge ahead, this is an important area for further work. We would not want to accidentally create a large class of new humans, not recognize them, and then mistreat them.
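Coming back to the training point above: here is a minimal sketch of what such an inverted curriculum could look like with today’s open-source tooling. This is purely illustrative and assumes a HuggingFace-style stack; the files moral_corpus.txt and web_corpus.txt, the small base model, and all hyperparameters are hypothetical placeholders, not a recipe. The hard part, as noted, is what the moral corpus should actually contain.

```python
# Minimal sketch of the "inverted" training order described above:
# phase 1 trains on a (hypothetical) moral-reasoning corpus first,
# phase 2 only then continues training the same model on the general web corpus.
# File names, model choice and hyperparameters are placeholders for illustration.

from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")


def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)


def train_phase(data_file, output_dir, epochs):
    """Run one causal-LM training phase over a plain-text file."""
    dataset = load_dataset("text", data_files=data_file, split="train")
    dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])
    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir=output_dir,
            num_train_epochs=epochs,
            per_device_train_batch_size=8,
        ),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()


# Phase 1: the moral-reasoning corpus forms the model's core first.
train_phase("moral_corpus.txt", "checkpoints/phase1_moral_core", epochs=3)

# Phase 2: only afterwards is the model exposed to the broad web corpus.
train_phase("web_corpus.txt", "checkpoints/phase2_web", epochs=1)
```

Nothing about this is technically exotic; the point is simply that the ordering, moral core first and broad corpus second, is a design choice we could make today, rather than bolting RLHF on at the end.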
The second are efforts to help humanity defend against an alien invasion. This may sound facetious, but I am using an alien invasion as a stand-in for all sorts of existential threats. We need much better preparation for extreme outcomes of the climate crisis, asteroid strikes, runaway epidemics, nuclear war and more. Yes, we 100 percent need to invest more in avoiding these, for example through early detection of asteroids and building deflection systems, but we also need to harden our civilization.
There are a ton of different steps that can be taken here, and I may write another post about them sometime, as this post is getting rather long. For now let me just say that a key point is to decentralize our technology base much more than it is today. For example, we need many more places that can make chips, ideally at much smaller scale than we have today.
Existential AI risk, aka the Terminator scenario, is a real threat. Dismissing it would be a horrible mistake. But so would treating global government control as the answer. We need to harden our civilization and develop a benign superintelligence. To do these well we need to free up attention and further develop humanism. That’s the message of The World After Capital.