Cross-posted from my website on s-risks. Articles on the FRI blog reflect the opinions of individual researchers and not necessarily of FRI as a whole, nor have they necessarily been vetted by other team members. For more background, see our post on launching the FRI blog.
Traditional disaster risk prevention has a concept of risk factors. These factors are not risks in and of themselves, but they increase either the probability or the magnitude of a risk. For instance, inadequate governance structures do not cause a specific disaster, but if a disaster strikes it may impede an effective response, thus increasing the damage.
Rather than considering individual scenarios of how s-risks could occur, which tends to be highly speculative, this post looks at risk factors – i.e. factors that would make s-risks more likely or more severe.
Advanced technological capabilities
The simplest risk factor is the capacity of human civilisation to create astronomical amounts of suffering in the first place. This is arguably only possible with advanced technology. In particular, if space colonisation becomes technically and economically viable, then human civilisation will likely expand throughout the universe. This would multiply the number of sentient beings (assuming that the universe is currently not populated) and thus also potentially multiply the amount of suffering. By contrast, the amount of suffering is limited if humanity never expands into space. (To clarify, I’m not saying that advanced technology or space colonisation is bad per se, as both also have significant potential upsides. They just raise the stakes – with greater power comes greater responsibility.)
Nick Bostrom likens the development of new technologies to drawing balls from an urn that contains some black balls, i.e. technologies that would make it far easier to cause massive destruction or even human extinction. Similarly, some technologies might make it far easier to instantiate a lot of suffering, or might give agents new reasons to do so.[1] A concrete example of such a technology is the ability to run simulations that are detailed enough to contain (potentially suffering) digital minds.
Lack of efforts to prevent s-risks
It is plausible that most s-risks can be averted at least in principle – that is, given sufficient will to do so. Therefore, s-risks are far more likely to occur in worlds without adequate efforts to prevent s-risks. This could happen for three main reasons:
- Inadequate risk awareness: Humanity tends to deal with risks “as they arise”, rather than anticipating and addressing potential risks far ahead of time. This approach is often sufficient in practice, and may also work out for many possible s-risks, but it is plausible that some s-risks require a high degree of foresight and precautionary action – that is, they can no longer be prevented at the point where the suffering actually happens. For instance, if our civilisation eventually builds powerful autonomous AI systems, it is crucial that we think carefully about potential failure modes and install worst-case safety measures such as surrogate goals. Clearly, s-risks are far more likely if the relevant actors aren’t aware of them.
- Strong competitive pressure: An arms race dynamic may create bad incentives to skimp on safety measures in favor of faster development of technical capabilities, and might in the worst case lead to escalating conflicts between competitors. S-risks – as well as other risks – would be significantly more likely in this case, compared to a world where successful coordination makes it possible to address potential risks without fear of losing out.
- Indifference: It is also possible that powerful actors are aware of s-risks and would be able to avert them, but they simply do not care enough. In particular, a narrow moral circle may result in a disregard for s-risks that affect nonhuman animals or artificial sentience.[2]
Inadequate security and law enforcement
Human civilisation contains many different actors with a vast range of goals, including some actors that are, for one reason or another, motivated to cause harm to others. Assuming that this will remain the case in the future[3], a third risk factor is inadequate security against bad actors. The worst case is a complete breakdown of the rule of law and of the institutions that enforce it. But even if that does not happen, the capacity for preventive policing – stopping rogue actors from causing harm – may be limited, e.g. because the means of surveillance and interception are not sufficiently reliable. In particular, if powerful autonomous AI agents are widespread in future society, it is unclear how (and whether) adequate policing of these agents can be established.
(In his paper on the Vulnerable World Hypothesis, Nick Bostrom refers to this as the semi-anarchic default condition; he argues that human society is currently in that state and that it is important to exit this condition by establishing effective global governance and preventive policing. The paper is mostly about preventing existential risks, but large parts of the analysis are transferable to s-risk prevention.)
Put differently, military applications of future technological advances will change the offense-defense balance[4], possibly in a way that makes s-risks more likely. A common concern is that strong offensive capabilities would enable a safe first strike, undermining global stability. However, when it comes to s-risks in particular, I think tipping the balance in favor of strong defense is also dangerous, and may even be a bigger concern than strong offensive capabilities. This is because actors can no longer be deterred from bad actions if they enjoy strong defensive advantages.[5] In a scenario of non-overlapping spheres of influence and strong defense, security in terms of preventing an invasion of one’s own “territory” is adequate, but security in terms of preventing actors from creating disvalue within their territory is inadequate.
Polarisation and divergence of values
S-risks are also more likely if future actors endorse strongly differing value systems that have little or nothing in common, or that are even directly opposed to each other. This holds especially if the divergence is combined with a high degree of polarisation and no understanding of other perspectives. By contrast, it is also possible that different value systems still tolerate each other, e.g. because of moral uncertainty.
This constitutes a risk factor for several reasons:
- Powerful factions might just ride roughshod over the concerns of others, including moral concerns. So, it is quite possible that efforts to prevent s-risks would be dismissed.
- It would be difficult to reach a compromise in this situation, making it impossible to realise potential gains from trade, including low-cost measures to avoid suffering.
- There is an increased risk of worst-case outcomes resulting from escalating conflicts between different factions.
- Strong divergence in values increases the risk that some actors will have very bad values, including a desire to intentionally harm others out of hatred, sadism, or vengeance for (real or alleged) harm caused by others.
Interactions between these factors
I am most worried about worlds where many risk factors coincide. I’d guess that a world where all four factors materialise is more than four times as bad as a world where only one risk factor occurs (i.e. overall risk scales super-linearly in the number of risk factors). This is because the impact of a single risk factor can, at least to some extent, be compensated for if other aspects of future society work well:
- Suppose that there are advanced technological capabilities and few people care about preventing suffering, but political dynamics implement a fair compromise between values (i.e. the “polarisation” and “inadequate security” risk factors are absent). I think that even low levels of concern for suffering could go a long way in this case, as the government or similar institutions try to take everyone’s concerns on board. This is because it is unlikely that s-risks are unavoidable, and because better technology might make it easier to avoid incidental suffering (e.g. comparable to better methods of stunning animals before slaughter).
- The absence of advanced technological capabilities limits the possible amount of suffering, even if there is polarisation, inadequate security, and a lack of efforts to prevent s-risks.
- The enforcement of some basic rules, such as not causing direct and deliberate harm to others (or threatening to cause harm), would plausibly help prevent the worst outcomes even in worlds with polarised values and inadequate efforts to prevent s-risks.
- If a lot of influential and thoughtful people work to avert s-risks, chances are that they will find a way to prevent the worst, even if circumstances are difficult along other dimensions.
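To make the "more than four times as bad" guess concrete, here is a purely illustrative toy model. It is not from the post: it assumes that each risk factor that is present multiplies a baseline level of expected suffering by the same constant. The factor names, the `multiplier` value and the function `excess_badness` are all hypothetical and chosen only for illustration.

```python
# Toy model, NOT from the post: assume each risk factor that is present
# multiplies a baseline level of expected suffering by the same constant.
FACTORS = ("advanced_technology", "lack_of_prevention",
           "inadequate_security", "polarisation")


def excess_badness(present, multiplier=2.0):
    """Excess over the baseline (1.0) when the factors in `present` hold."""
    badness = 1.0
    for factor in FACTORS:
        if factor in present:
            badness *= multiplier  # hypothetical per-factor amplification
    return badness - 1.0


one_factor = excess_badness({"advanced_technology"})
all_factors = excess_badness(set(FACTORS))
ratio = all_factors / one_factor
print(f"all four vs. one factor: {ratio:.0f}x")
```

Under this assumed multiplicative form, the ratio exceeds four for any multiplier above 1, which is one simple way in which risk factors could interact super-linearly. Whether real-world risk factors combine like this is an open question.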
Further research could consider the following questions for each of the risk factors:
- What are the best practical ways to make this risk factor less likely?
- How likely is it that this risk factor will materialise in future society?
- How will this risk factor change due to the emergence of powerful artificial intelligence or other transformative future technologies?
This post was inspired by a comment by Max Daniel on the concept of risk factors. I’d also like to thank Max Daniel, David Althaus, Lukas Gloor, Jonas Vollmer and Ashwin Acharya for valuable comments on a draft of this post.
Footnotes
[1] However, there is an important difference: For s-risks, what matters is what happens after a steady state and technological maturity are reached (assuming that this happens), so any possible technology exists at that point. For x-risks, the way there, e.g. the order in which technologies are developed, matters most.
[2] This holds for incidental and natural s-risks, but it’s unclear for agential s-risks. (See A typology of s-risks.) This is because more compassion may also lead to an increase in agential s-risks.
[3] It is possible that a singleton will emerge, e.g. in the form of unified AGI, in which case this risk factor may not be relevant. However, the foreseeable future contains many actors with many different goals, and I think it’s not-too-unlikely that this will persist for a very long time. Plus, even if a singleton eventually emerges, it will be shaped by agents in the earlier multipolar world.
[4] Looking at human history, the balance of offense and defense in military technology, i.e. how hard it is to physically attack someone, has changed significantly over time. In the Stone Age, you could just ambush the neighbouring tribe and kill everybody; but warfare in the Middle Ages featured strong defensive capabilities, as you had to engage in a protracted siege to conquer someone's castle. Modern warfare has presumably tilted the balance back towards offensive capabilities. (I'm not an expert on military history – these are just vague impressions.)
[5] I think the scenario of strong defensive capabilities is not unlikely. Imagine an outcome where our civilisation colonises the universe but there are multiple loci of power. In conflicts within this intergalactic multipolar civilisation, it might be difficult to physically attack the other party simply because of astronomical distances between galaxies or superclusters. (Acausal interactions between AIs in the multiverse also involve non-overlapping spheres of influence where it’s impossible to physically attack the other party.)