Phase 0: Safety

The first step with handling the risks of superintelligence is to redirect research and innovation away from it, channeling it instead towards useful, safe, and narrow AI. Unfortunately, no concrete mechanisms currently exist to push AI companies in that direction, which is why so many of them are still able to race towards disaster.

This is why we start this plan by a phase dedicated to prevent the development of artificial superintelligence for the next 20 years. While we may require more than 20 years, we estimate that two decades provide the minimum time frame to construct our defenses, formulate our response, and navigate the uncertainties to gain a clearer understanding of the threat and how to manage it.

Any strategy that does not secure such a grace period without artificial superintelligence probably fails. This is because of the inherent limitations of current human institutions, governmental processes, scientific methodologies, and the length of time it will take to upgrade them. It would be even better to have more than 20 years, but the supplement should not be relied upon.

Thus, the Goal of Phase 0 is to Ensure Safety: Prevent the Development of Artificial Superintelligence for 20 Years.

Conditions

As discussed in The Problem, we face a threat, artificial superintelligence, for which we have neither a general predictive theory, nor a standard metrology (a science of measurement and its application, in this case, for intelligence11).

If we did have that scientific understanding, we could precisely measure the level at which superintelligence emerges, and avoid it.

We do not have this understanding. The state of the art of intelligence theory and measurement is primitive; we are like physicists who lack the tools necessary to estimate what quantity of radioactive material could go supercritical. Until we can describe potential risky states of AI development and AI models directly, countries should fall back on regulatory guardrails based on proxies of intelligence and dangerous capabilities – imperfect approximations of what we truly want to measure and predict.

We advocate for multiple proxies in order to only limit and constrain the most dangerous part of AI. If countries solely focus on a single proxy – compute for example –, the limitations on this proxy will end up prohibitively strong, to ensure sufficient safety margins. Such a restrictive approach would stifle low-risk innovation.

Therefore, to preserve flexibility and minimize risk across the number of uncertain futures we face, countries should seek to monitor and regulate multiple components of AI development instead with a defense in depth approach.12

This includes both putting limits on proxies of the underlying metric, intelligence, as well as forbidding the development of dangerous capabilities.

Our defense in depth must cover a variety of Safety Conditions. Policy measures taken in Phase 0 in aggregate will have to satisfy all Safety Conditions to ensure that the goal is achieved.

Given this, here are the conditions to be met:

No AIs improving AIs
No AIs capable of breaking out of their environment
No unbounded AIs
Limit the general intelligence of AI systems so that they cannot reach superhuman level at general tasks

Some of these will be achieved via capability-based conditions (a to c), while some will rely on proxies of general intelligence (d).

No AIs improving AIs

Boundaries and limitations are meaningless if they are easy to circumvent. AIs improving AIs is the clearest way for AI systems, or their operators, to bypass limits to their general intelligence.

AIs competent enough to develop new AI techniques, enact improvements on themselves or on new AI systems, and execute iterative experiments on AI development can quickly enable runaway feedback loops that can bring the AI system from a manageable range, to levels of competence and risk far beyond those intended.

More broadly, the dissemination of such techniques makes it easier over time for any threat actor to start with an authorized, limited AI system, and bootstrap it beyond the limits. If any of these efforts succeed at reaching superintelligence levels, humanity faces extinction.

Given this, a condition for a safe regime that prevents the development of superintelligence for 20 years is to not have AIs improving AIs, and prevent the development and dissemination of techniques that let a threat actor bootstrap weaker AIs into highly generally intelligent AIs. Not having this condition would invalidate most red lines, restrictions and mitigations put in place.

No AIs capable of breaking out of their environment

Another necessary condition for maintaining any oversight and safety of AI systems is to ensure that boundaries cannot be bypassed or trivialized. AIs capable of breaking out of their designated environments represent a critical vulnerability that could rapidly accelerate the path to uncontrolled superintelligence. Moreover, AIs having the capability to break out of their environment would undermine any framework of AI governance and control, potentially allowing AI systems to act in ways that were neither intended nor authorized by their developers or operators.

AI systems with the ability to break out of their intended boundaries can quickly evade human control and monitoring. This capability allows AIs to potentially acquire vast computational resources, access sensitive data, or replicate themselves across networks – all key ingredients for bootstrapping towards superintelligence.

The mere existence of breakout techniques makes it easier for any threat actor to take a limited AI system and expand its reach and capabilities far beyond intended limits.

Given this, another condition for achieving the goal of Phase 0 is to prohibit AIs capable of breaking out of their environment, and prevent the development and dissemination of techniques that enable breaking out of an environment or self-propagation. Failing to implement this condition would render most other safety measures and restrictions ineffective, as AI systems could simply circumvent them through breaking out.

No unbounded AIs

Predictability and controllability are fundamental prerequisites for safety in all high-risk engineering fields. AI systems whose capabilities and behaviors cannot be reliably bounded pose severe risks to safety, security, and the path towards superintelligence.

Unbounded AI systems - those for which we cannot reliably predict their capabilities or constrain their actions - represent a critical vulnerability in our ability to manage AI. The deployment of such systems undermines our capacity to implement meaningful safety measures and restrictions. This is because the ability to model and predict system behavior in various circumstances, that is, boundedness, is a cornerstone of safety engineering in high-risk fields such as aviation, civil engineering, and nuclear power.

Given this, a third condition for preventing the development of superintelligence for 20 years is to only allow bounded AI systems: AI systems for which we can predict what they can and cannot do before running them.

These bounds should be justified. Such justifications should at the very least cover capabilities of concern within the relevant jurisdiction, as well as any capabilities that are identified as red lines internationally. This requires the ability to reliably predict and justify why and how an AI's functionalities will be constrained before deployment, analogous to what is required in other high-risk industries.

Without such bounds, it becomes impossible to enforce safety requirements or provide guarantees against catastrophic events - a standard explicitly expected in other high-risk sectors. Failing to implement this condition would render most other safety measures ineffective, as we would lack the foundational ability to ensure AI systems remain within their intended operational and capability boundaries. Moreover, it will make it significantly harder to collectively reason about AI systems, and to distinguish between dangerous development directions and innocuous applications.

Limit the expected general intelligence of AI systems

The most straightforward condition, in principle, that is needed to prevent the development of superintelligence for 20 years is to ensure no AI system reaches a significant amount of general intelligence.

While this is straightforward in principle, it is difficult to achieve in practice, as humanity has not yet developed a general predictive theory of intelligence, nor a metrology (measurement science) of intelligence.

Difficulty of measurement however is not an excuse to not measure at all, but rather a reason to start from the best proxies and heuristics we can find, apply them conservatively, and develop this science further.

Without restricting the general intelligence of AI systems, development can straightforwardly cross into the superintelligence range accidentally or intentionally, and fail the goal of Phase 0.

Summary

1. Prohibit the development of Superintelligent AI

Policy

The development, creation, testing, or deployment of artificial superintelligence systems is prohibited.

It is prohibited to knowingly participate in the development of, build, acquire, receive, possess, deploy, or use, any superintelligent AI.

Definitions

Artificial Superintelligence: Any artificial intelligence system that significantly surpasses human cognitive capabilities across a broad range of tasks.

Rationale

Given the extinction risk posed by artificial superintelligence, it is necessary to establish a guiding policy principle that prohibits the development of artificial superintelligence in a clear and unequivocal manner, at the national and international level. This will address the condition of “limiting the general intelligence of AI systems”.

This high-level prohibition has a dual purpose: first as a normative prohibition, by making the development of superintelligence explicitly prohibited; then as a guiding principle, as foundation for other measures.

As a normative prohibition, this policy unequivocally states that contributing to the development of superintelligence is legally and socially unacceptable, and provides the basis for pursuing and preventing this under the full force of the law.

As a guiding principle, this policy serves as a foundation for other more focused measures, which will operationalize concrete precursor technologies that may lead to superintelligence and either restrict them, or outright prohibit them. The list of policies in this document is not exhaustive, and reflects the understanding of the science of intelligence as of 2024: we should expect that with more advances in the understanding of intelligence, artificial and otherwise, additional threat vectors will be identified, as well as potentially more precise and narrow mitigations than some that we recommend here.

Precedents

This is akin to the existing national and international measures on technologies that threaten global security, such as nuclear weapons (with the NPT and the Atomic Act of 195413 in the USA) and biological weapons (with the Biological Weapons Convention14, the Chemical Weapons Convention Implementation Act and related statutes in the USA). In these and other legal instruments, the technology of concern is clearly and normatively prohibited first, followed by further legislation and implementation to delineate the details of enforcement.

Implementation and enforcement

National authorities should clearly and unambiguously determine that the development of artificial superintelligence is prohibited, and put that into law as a key normative prohibition and guiding principle.

This measure will then be supplemented by additional measures, such as specific prohibitions of certain research directions, licensing regimes, and so forth, to enable defense in depth and further ensure that no step is taken towards developing superintelligence until humanity is ready.

The enforcement of those supplementary measures will be described in their respective sections.

Concretely, the effect of such a policy will include the following effects and more:

Given a statutory prohibition, no public funding shall be allocated to projects that explicitly or implicitly support advancing the development of superintelligence.

Companies, individuals, and other organizations explicitly stating that they are pursuing the development of superintelligence will be in clear breach of the prohibition, shall face civil and criminal penalties and be required to immediately cease and dismantle their systems the moment they are detected.

Intentional attempts to develop superintelligence, or enable superintelligence development activities, will constitute a fundamental breach of the duties required under any AI-related licensing regime, and warrant loss of license.

Auditing and monitoring activities will be established to check that no R&D processes are aimed at being focused on the development of superintelligence.

Such a prohibition should only be lifted, or relaxed, once humanity has developed robust scientific understanding and modeling of both intelligence and artificial intelligence technology, to be able to control such a creation, the actual controls to do so, as well as established international institutions to manage, contain, and control such a disruptive force globally.

Scope

What this policy affects:

This prohibition extends to research aimed at producing artificial superintelligence, enhancement of existing AI systems that could result in artificial superintelligence, and the operation or transfer of superintelligence-related technologies. Technologies in this case will cover any form of software or hardware that is aimed at producing superintelligence, or enhancing existing systems into reaching superintelligence capabilities.

What this policy does not affect:

Theoretical research and discussions of superintelligence, and more broadly any non-software and non-hardware artifact related to superintelligence.

This means the policy will not affect, for instance, books about superintelligence, historical accounts of the development of the concept, and so forth.

2. Prohibit AIs capable of breaking out of their environment

Policy

AI systems capable of unauthorized access and the intentional development of AI systems with unauthorized access capabilities are prohibited. Countries should legislate to clarify that existing prohibitions on unauthorized access also apply to AI systems, and clarify that the intentional development of systems capable of intentional unauthorized access shall also be prohibited.

Definitions

Unauthorized access: Accessing a computer without authorization and/or exceeding the scope of authorized access, either to access information without permission, cause material harm or obtains something of value (e.g., compute time); in general, this should follow precedents created by the Computer Fraud and Abuse Act in the United States and its foreign counterparts.

Software: Throughout this document, we will use software to cover source code, training code, configurations such as model weights, scaffolding and any other computer code essential to the functioning of the system we discuss, regardless of whether or not the computer program is installed, executed, or otherwise run on the computer system.

Rationale

Given how the ability to break out of one’s environment undermines all safety guarantees and security measures around AIs, it is essential to prohibit the development of such capabilities in clear and unequivocal manner, at the national and international level. This will address the condition of “No AIs capable of breaking out of their environment”.

This prohibition makes it explicit that developing AIs capable of unauthorized access is unequivocally forbidden, and that AI companies must take proactive measures to prevent this from happening, even by accident.

It also provides a justification for taking actions against any AI and any developer breaching this prohibition, for example by deleting the AI system and bringing criminal charges against the AI company if this came from negligence or worse, deception.

Note that this would also remove the root cause of a common policymaker and expert concern, self-replication, by requiring the development and operation of interventions to block a self-replicating model from being able to escape into other systems not governed by the company who owns the model.

Precedents

This policy builds explicitly upon the framework of the US Computer Fraud and Abuse Act, which defines the notion of unauthorized access in order to criminalize various hacking practices.

Implementation and enforcement

This policy will be implemented by establishing a clear normative prohibition, monitoring AI research and development to detect dangerous instances, as well as developing practical processes for companies, governments and organizations to prevent and restrict the ability of AI systems to gain unauthorized access to other computer systems.

In many instances, AIs that are capable of breaking out of their environment will develop this capability inadvertently or due to insufficient caution on behalf of the companies or other entities developing them; in other instances, these capabilities will be developed intentionally by developers who seek to harness them for malicious ends.15 Therefore, the law must provide incentives both for AI companies to test, monitor, and mitigate inadvertent breakout capabilities, as well as punishing those who willfully create harmful capabilities for an AI model to gain unauthorized access.

For one, companies should comply by maintaining rigorous programs to directly prevent inadvertent breakouts. Much as industrial companies today face requirements to not produce certain harmful chemicals at all (e.g., CFCs) or to not emit other chemicals into waterways or the atmosphere whether or not it is intended, AI companies should have a strict obligation not to let their AI models inadvertently escape their development environments by unauthorized access to other environments.

Companies could robustly prevent inadvertent unauthorized access through a variety of means. Just as pharmaceutical providers have to follow FDA requirements for developing and testing drugs in clinical trials, as well as general Good Manufacturing Practices when producing them, AI companies should build upon standard requirements16 when developing and following their protocols for creating and testing new models. (For example, companies might be required to ensure and document that AI models do not have access to their own model weights.) Companies should also directly test to confirm that models reject requests to engage in unauthorized access.17 Finally, companies should also proactively conduct exercises, “fire drills,” and other tests to ensure that their processes are working as intended and are prepared against potential negative events.

To prevent the intentional creation of harmful models that are capable of gaining unauthorized access, the approach should be the same as with any other law enforcement activity against criminal and/or nation-state groups conducting hacking for illicit gain. These efforts should include not only criminal prosecutions but also sanctions and “name-and-shame” efforts that inhibit criminals’ ability to travel to allied countries.

Penalties for violations should vary depending on which of the two contexts above that they occur in.

In the case of inadvertent breakouts regulation should affirmatively require those developing AI models of sufficient size or capability to robustly test and monitor their models to ensure they are not capable of, or engaging in, unauthorized access. Likewise, legislation should require those hosting and running AI models to continuously monitor which models are operating in which environments or maintain outbound internet connections to other environments that could be used for unauthorized access. Failure to fulfill these duties should result in fines and/or criminal sanctions, especially if the resulting harms are comparable to other unintended or negligent unauthorized access incidents that cause criminal damage. Where appropriate, violators may also face bans from the licensing system (described below). As a result, companies will have strong incentives to build not only robust internal processes to ensure compliance, but also to build appropriate automated tooling to streamline these compliance efforts while running them at scale.18

Furthermore regulation should explicitly punish the development and creation of models that are capable of engaging in unauthorized access, or the purposeful instruction of a model to conduct unauthorized access.19 These penalties, at a minimum, should be in line with the penalties charged under existing unauthorized access laws (e.g., the US Computer Fraud and Abuse Act) for computer worms, ransomware, botnets.20

Scope

What this policy affects:

This policy affects AI systems’ ability to break out of their controlled environment, and access by AI systems to tools and environments allowing unauthorized access. This policy also affects the intentional design of AI systems that can conduct hacking and other unauthorized access-enabling activities (e.g., phishing), as well as tools and environments allowing this.

What this policy does not affect:

This policy does not affect expanding the access of an AI system under the direct oversight and permission of a human operator.

3. Prohibit the development and use of AIs that improve other AIs

Policy

The direct use of found systems to build new found systems, or improve existing found systems, is prohibited. This ensures that AIs improving AIs at a speed that is difficult for humans to oversee or intervene on are prohibited.

Definitions

Found systems: Software programs which haven’t been written by hand by human developers, but which instead have been found through mathematical optimization.

Mathematical optimization: The use of an optimization algorithm such as gradient descent to find an optimal or better solution in a search space.

Direct use: The application of a system to a key step in the design or improvement of the other system (not as general help such as looking for information).

Rationale

AI improving AI undermines any regulations that can be made on which capabilities are allowed to be trained in and released in AI systems. This is because it lets any motivated actor go beyond the bounds and prohibitions by using the safer AIs themselves to improve either themselves or other AIs, leading to an unsafe regime of capabilities. It is therefore essential to forbid and constrain this capability by itself. This will address the condition of “No AIs improving AIs”.

To do so, this policy aims to strongly disincentivize attempts to create or enable rapid and accelerating improvement feedback loops, by targeting AIs improving AIs as the main threat model causing these rapid improvements.

We introduce the category of “found systems” and apply this policy only to those systems to ensure this policy only affects AI systems that pose a significant concern.

We define “found systems” as software programs that have not been written by hand by human developers, as opposed to how most normal software is produced. Instead, found systems are found, rather than written or designed, via mathematical optimization.

A new definition is necessary, as neither computer science nor in law of most Western countries provide a clear definition that distinguishes software, including AI, that is written by humans, from software that is generated via mathematical optimization.

By defining these systems as “found systems” and separating them from most common software, this ensures that this policy leaves non-dangerous activities untouched that could also fall under the broader category of “computer systems improving computer systems”, such as database updates and software updates.

While it is theoretically possible, given enough time, to have a runaway intelligence explosion produced by human hand-written systems, this would likely take significant amounts of time, would be highly incremental, and with smaller improvements coming before larger improvements in smooth succession. Especially, it would be observable and understandable by humans, as all software improvements would be legible to human observers.

While fully minimizing the risk of an intelligence explosion would require covering non-found systems as well, this would impact large amounts of software and severely restrict many computer-based activities, while also producing only a marginal addition in risk reduction.

Given this, this policy is designed to reduce risk while also minimizing negative externalities. So it focuses only on found systems, which we expect will constitute the bulk of AIs improving AIs risk and its most unmanageable cases for the next 20 years, while at the same time being a small subset of all software and AI systems.

We similarly introduce the concept of “direct use” so this policy only applies to cases where AIs are playing a key role in the research or development of improving AIs.

Without the additional qualifiers we recommend, blanket forbidding the use of found systems in work on found systems would also forbid use of AIs by AI researchers at any time, including when people search for information online, when they write a paper or internal reports, and when they communicate with each other. This is much more costly, since for example Google is using AI in search21, Microsoft is using AI in Office22, and Zoom is adding a new AI assistant to their meeting software.23

Going beyond the direct use case would create much higher externalities and regulatory uncertainty, forbidding researchers and consumers from using a large range of modern software tools, for limited gains in safety.

Precedents

It is standard to define the most problematic cases of a technology in order to control them without stifling the more benign and positive uses. This can be seen for example in the Atomic Energy Act of 195424, which strongly forbade any civilian use of nuclear technology for the purpose of weapons, but allowed it for energy. Or in the Toxic Substances Control Act of 1976, which regulates the use and dissemination of chemicals posing “unreasonable risk of injury to health or the environment”, while avoiding constraints on other chemicals.

Yet where these historical examples generally provide a list of banned or regulated cases, and the discretionary power to add arbitrary new ones, our policy manages instead to define in a principled way the problematic use case for AIs being used in AI development, ensuring that it doesn’t need future amendments or arbitrary powers to extend it to new developments.

Implementation and enforcement

The most blatant violations of regulation that prohibits AIs improving AIs will involve the direct and intentional use of found systems to improve or create other found systems. This includes fully automated AI research pipelines or using one AI to optimize another's architecture. More broadly, any activity that is explicitly aimed at making AIs improve AIs will fall under strict scrutiny and be expected to be in violation of this statutory prohibition. This approach mirrors the strict enforcement against insider trading in financial markets, where regulatory bodies like the US Securities and Exchange Commission (SEC) actively monitor trading activity in real-time and retrospectively, and swiftly act against and deter clear violations to maintain market integrity.

Borderline cases will likely emerge where the line between human-guided and AI-driven improvement blurs. For instance, the acceptable extent of assistance by found AI systems in research ideation or data analysis will require ongoing regulatory guidance.

To comply, companies will have to implement robust internal processes including clear guidelines, technical barriers, oversight committees, and regular employee training. Companies should proactively review their internal activities, including R&D processes, and suspend any activities potentially violating the policy pending review. These will be analogous to safety protocols in the pharmaceutical industry, where companies maintain strict controls over drug development processes, implement multiple safety checkpoints, and provide ongoing training to ensure compliance with FDA regulations.

Researchers can self-organize by developing professional codes of conduct and establishing review boards to evaluate research proposals. Conferences and journals should update submission guidelines to require compliance certification. This self-regulation mirrors the peer review process in academic publishing, combined with ethics committees in medical research, ensuring that research meets both scientific and ethical standards before proceeding or being published.

Penalties for violations may include substantial fines, potential criminal charges, and bans from AI research. Companies may face license revocations, and violating systems may be decommissioned. This multi-faceted approach to enforcement is similar to environmental protection regulations, where violators face monetary penalties, operational restrictions, and mandated remediation actions, creating a strong deterrent against non-compliance.

Scope

What this policy affects:

At its core, this policy prohibits the development of AIs through software that has not been written fully by human developers. It ensures that any tool used in AI research has a minimum amount of legibility to human supervisors, to the extent that it has been built by human minds, instead of being discovered by illegible mathematical optimization processes.

This prohibition notably forbids:

Self-Improving found systems, such as an hypothetical LLM that would further train itself by generating data and optimization parameters.
Advanced AI systems being significantly involved in developing the next generation of those same systems, such as utilizing e.g., Claude 3.5 significantly in the production of Claude 4.0 or GPT-4 significantly in the production of GPT-5.
The direct use of any LLM in the training process of another LLM or AI system in general, including for generating training data, designing optimization algorithms, hyperparameter search.
The use of LLM and other found systems in distilling research insights from many sources that have direct impact on the design and improvement of found systems.

What this policy does not affect:

Most machine learning and all normal software (Microsoft Office, Email, Zoom) are not impacted by this prohibition, given that they don’t use found systems for their training or design.

The prohibition also does not impact found systems in cases of non-direct AI R&D use, such as searching for research papers on Google, letting Github Copilot correct typos and write trivial functions in a training codebase, or transcribing a research meeting using OtterAI.

4. Require a valid safety case for deployment of AI systems

Policy

For any deployed AI system, it is mandatory that for any capability of interest, there exists a reliable safety case for whether the AI systems will use this capability or not.

Capabilities of interest are any capabilities that are legally prohibited or restricted in a certain jurisdiction.

Definitions

Safety Case: An argument for why an AI system will not do a prohibited action, which can be made before deploying and running the AI.

Rationale

For sufficiently powerful AI systems, we need to know, before running and deploying them, that they will not cause a catastrophe. This is boundedness, and this policy enforces it through the use of safety cases, which are the standard high-risk industry method. Thus this policy satisfies the condition of “no unbounded AIs”.

This policy ensures the boundedness of AI systems in two ways.

First, it requires evidence that the AI system is bounded, in the form of a safety case. Thus whenever any powerful AI system is deployed, there is always a reasonable and legible argument that it will not use a given dangerous capability. This ensures that only bounded AIs are deployed.

Second, the burden of showing that an AI is not dangerous gets put squarely on the AI companies rather than falling on the regulators or the users. Thus this policy creates an incentive for investments into methods and paradigms that enable easy and reliable verification of bounds, for example interpretability, formal verification, and additional constraint on the structure of the AI systems being built.

Precedents

Any application of modern safety engineering requires the ability to model and predict in advance how the system under consideration will behave in various circumstances and settings. This knowledge is used in all critical and high-risk industries to check that the system fits with the safety requirements.

For example, all countries require guarantees that nuclear power plants will not have catastrophic failures, before fully building them. A concrete example of such guarantees and their justifications can be found in the Safety Assessment Principles of the UK’s Office For Nuclear Regulation.25

Implementation and enforcement

In practice, there will be trained inspectors who will check the safety case provided. It will be the responsibility of the company building the AI system to provide enough information, models and techniques for the inspector to be convinced that the AI system won’t use a given capability.

For the simplest possible AI systems, such as linear regressions, just showing the code will be all that is needed for justifying safety with regard to almost any capability of interest.

In some specialized AI systems, it might be possible to do so by showing that the AI systems won’t even learn the corresponding capability. For example, it’s reasonable to argue that a CNN26 (vision AIs) trained exclusively on classifying cancer x-rays would have no reason to learn how to model human psychology.

In the more advanced cases, it might be necessary to provide detailed mechanistic models of how the AI system works, for example for arguing that a SoTA LLM such as Claude or GPT-4 wouldn’t use any modeling of human psychology, since it definitely has the data, objectives, and incentives to learn how to do it and use it in practice.

For a start, the implementation might only focus on requiring safety cases for particularly dangerous capabilities (AI R&D, self-replication, modeling human psychology…). These are the bare minimum safety requirements, already increasingly required in multiple jurisdictions. Then the regulation can extend to more and more capabilities as they are linked to risks from advanced AIs.

Scope

What this policy affects:

This policy affects all AIs, but concentrates the costs on the most powerful forms of AI currently available, notably LLMs such as GPT-4 and Claude.

This is because there are no current methods to check that these AI systems lack any capability before running them: they are trained on data about almost everything known to man, are produced with massive amounts of compute and powerful architectures, and aim to predict everything in their training data, which might amount to predicting every process that generated that data.

Broadly, any AI system that is explicitly built for generality will not pass this policy unless significant improvements in interpretability, ML theory, and formal methods are made.

What this policy does not affect:

As discussed above, although this policy technically affects all AI systems, many simple and specialized ones will not incur much costs from the check.

This is because these systems would have highly specialized training data, often specialized architectures (like CNNs for vision models), and no reasons for learning any general or dangerous capabilities.

5. A licensing regime and restrictions on the general intelligence of AI systems

Policy

Countries should set up a national AI regulator that specifically enforces restrictions on the most capable AI systems, and undertakes continuous monitoring of AI research and development.

AI developers that are building frontier AI models, and compute providers whose services those models are built upon, should be subject to strict regulation in order to substantially mitigate the risks of losing control or enabling the misuse of advanced AI models. This regulation should take the form of a licensing regime, with three specific licenses being required depending on the development being taken place:

Training License (TL) - All AI developers seeking to train frontier AI models above the compute thresholds set by the regulator must apply for a TL and have their application approved prior to training the proposed model.
Compute License (CL) - All providers of cloud computing services and data centers operating above a threshold of 10^17 FLOP/s must obtain a license to operate these and comply with specific know-your-customer regulations as well as physical GPU tracking requirements.
Application License (AL) - Any developer seeking to use a model that has received an approved TL and will be expecting to make major changes, increases, or improvements to the capabilities of the model as part of a new application will need to apply and be granted an AL.

This balance will be critical to ensure that new applications of frontier AI models are safe but do not create undue burden or restriction on innovation. It will be for each nation to determine the best parameters for this, and for the international institutions to provide more detailed guidance as appropriate.

Rationale

Any regulation that actually constrains the development of advanced AIs needs concrete mechanisms by which it can enforce its policies, including punishing violation of prohibitions. The system of licences offers this, by forcing any actor involved in the creation and use of advanced AIs to have a valid license, which can be revoked as a sanction if regulations and sensible methods are not followed. This will address the condition of “limit the expected general intelligence of AI systems”.

In addition, licenses have the benefits of letting the regulator balance the risks and the value of the technology, that is maintaining beneficial AI development and use even while protecting against the dangerous parts.

Precedents

This system of licensing is the standard approach to regulation when a technology is both valuable and potentially dangerous. For example, drugs in the UK are regulated by the MHRA, and the use of nuclear energy is licensed in most countries, for example with the US Nuclear Regulatory Commission.

Implementation and Enforcement

General Licensing Regime

To create a sustainable licensing system, any national AI regulator must have adequate capabilities and capacity to monitor ongoing AI research and development, while also having suitable enforcement powers to catch bad actors trying to circumvent the requirements.

It’s also essential that the national AI regulator has adequate independence from political decision making and sufficient long-term funding that it can undertake its duties of ensuring advanced AI models are safe.

Fundamentally, the national regulators and international system must have powers to review and adapt licensing requirements - through their power to lower compute thresholds or add new behaviors that should be prohibited - to fit with the latest AI research and development. To inform this, the national AI regulators must have significant capacity to monitor developments in algorithms and data used.

When it comes to the enforcement of licenses, severe penalties should be levied against developers who seek to build models above a compute threshold or the defined intelligence benchmark without a license to do so, and those developers who have a license but fail to comply with the above requirements.

To ensure that AI developers continue to have adequate measures in place, national regulators should undertake frequent testing of the procedures that AI developers would employ to respond to dangers and safety incidents. In addition, the national regulators must work with compute and hardware providers to frontier companies to withdraw their services in the event that they detect illicit activity. It may also be necessary to conduct mock training runs to test compute providers’ ability to monitor the usage of their resources. Among other abilities, this could include their:

Capacity to shut-off access to compute once a training run exceeds permitted thresholds;
Ability to detect if a training run is simultaneously using other data centers;
Ability to check if model weights are at zero at the beginning of a training run.

There will always remain a slight risk that unlicensed developers make breakthroughs that circumvent the spirit of these regulations. It will be for the national regulators, and then the institution set up in Phase 1, to balance the risks of such breakthroughs with the cost of stifling innovation.

To ensure continued compliance, AI developers that received a TL or AL, or a computer provider who received a CL should be required to submit reports on safety procedures annually. A breach in the licensing requirements would need to face significant civil, and potentially criminal, action given the severity of the risks that it could pose. Below is a list of example enforcement powers that could be granted to the national regulator to help them fulfill their duties:

Immediately shutdown the ongoing R&D process (e.g., training runs, fine tuning processes) of an AI developer, and wait for a detailed risk and root-cause assessment before restarting;
The same as above, but for all similar projects across other companies and organizations developing AI;
All of the above, but also terminate the project permanently;
All of the above, but also terminate the project and all similar projects permanently in the company, and audit other companies and organizations to terminate similar projects due to similar risks;
All of the above, but also fire the team that conducted the project due to a breach in protocol;
All of the above, but also revoke the ability of the company to ever receive a future training or application license;
All of the above, but also prosecute members of the organization or company involved in breach of regulations;
In the most egregious cases, all of the above plus order a full shutdown of the entire company and sale of assets, via nationalization and auction or forced acquisition coupled with the wind down of all AI relevant operations.

Analogous powers should be provided to enforce KYC and similar requirements against compute providers. It is crucial that regulators should encourage true self-reporting of unexpected results, and provide some leniency when organizations do so proactively, swiftly, and collaboratively.

Additionally, regulators should proactively create a mechanism for companies to share “near-miss” reporting, analogous to the one implemented by the US FAA, such that they can proactively share insights about the ways in which accidents almost occurred but were avoided due to redundant measures and/or sheer luck, to inform the evolution of industry standards and regulatory efforts.

Training Licensing (TL)

Companies developing AI models above a specific level of intelligence (based on the proxies of compute and relevant benchmarks) would apply for a TL by pre-registering the technical details of their training run, outlining predicted model capabilities, and setting out what failsafes, shutdown mechanisms, and safety protocols would be in place.

The regulator would have scope to make recommendations and adjustments to this plan, adding or removing requirements as necessary. Once a plan is approved, the license to conduct the training run would be granted and reports would be provided by the developer during the training run to confirm the compute used.

Following a successful training run, the regulator would deploy a battery of appropriate tests to ensure the licensing requirements are met, with models that passed these tests being approved for direct commercial applications. For models trained in other countries, the applicant could move directly to the testing phase for approval or, in the event that the model has received approval from the regulatory authority of another country with a proven track record of high-quality decisions, would receive immediate approval subject to review by the domestic regulatory authority.

Given the exponential growth of AI, and the likelihood this growth will continue, agencies should be given maximum flexibility to ensure they can adequately assess models that pose the greatest risks and should apply for a TL. While the executives of these agencies would be appointed by and accountable to political leaders, and the specific governance of an AI regulator would need to be determined by each country, they should retain operational independence and have a permanent statutory footing and a minimum level of funding enshrined in law.

National AI regulators should set thresholds on compute to ensure proper oversight of frontier models that pose the greatest risk. These would be models where it is reasonably possible that training could lead to the development of dangerous capabilities that could either directly cause harm or result in the model escaping the developer’s control. All such frontier models would automatically require a TL for its training run, and would require a separate application license prior to deployment, whether in commercial applications or otherwise.

The relevant national AI regulator would have the authority to set and adjust these thresholds, with specific governance structures around these decisions varying from country to country. Once an international agreement defines global thresholds for permissible development, national regulators would transpose international guidance into their own domestic thresholds. Countries could also decide on a more restrictive regime with tighter thresholds than the international regime if desired.

In addition, even if a model falls below the pre-defined compute threshold but the AI systems are expected to exceed an established benchmark for general human capability, then it should also be required to apply for a TL. To implement this benchmark, the regulator would need to devise a battery of tests for each specific task and establish a human performance benchmark by deploying the test to workers across different professions and levels of qualification. Once a benchmark was established, these tests would be administered to automated systems; if the system being tested performed at or above a predetermined percentile of the human benchmark (e.g., 90th percentile), it would be determined to be proficient at the relevant task.

This general capabilities index would then be constructed from these tasks to produce a final score - if automated systems achieved general intelligence-equivalent performance in a predetermined share of these tasks, it would clear the threshold for general capability and be banned.

A potential set of general tasks to be cleared could be as follows:

Analyzing and Processing Data and Information
Communication and Collaboration (Internal)
Project Management and Resource Coordination
Developing and Implementing Strategies
- Fleshing out plans for complex real-world events for business operations and governmental activities.
Building and Maintaining Professional Relationships (External)
Interpreting and Presenting Information for Various Audiences
Content Creation
- Produce effective copy, images, videos, and other content to disseminate information, promote products and services, explain complex issues.
Training and Skill Development
- Non-project management and non-content feedback people management. Emotional guidance and coaching. Helping the other party reflect on past actions and teach new approaches and techniques.
Customer Relationship Management
Domain-Specific Novel Problem Solving

During the implementation phase, the regulator may decide to improve or expand on these tasks depending on how effectively they track model capabilities, with tests potentially requiring constant update and improvement.

As part of applying and receiving a TL, a developer would need to meet certain compliance requirements. Each jurisdiction will need to determine the appropriate number and type of any such requirements but at a minimum they should include the following:

Compliance requirement: companies applying for a TL would be required to submit their strategies for AI risk mitigation to the regulator as a pre-condition. While these licenses would be specific to the model or application being developed, the AI risk mitigation strategies would refer to the applicants and their own risk management processes. That is to say: in order to apply for a license, the applicant must have had a relevant AI risk mitigation strategy approved by the regulator beforehand. This would also apply for requests to develop applications based on frontier AI models that increased model capabilities as defined by the regulator.

Compliance requirement: developers must not ‘Open Source’ or publicly release any part of the code or model weights. This licensing regime seeks to drive and incentivize a safety-driven approach to model development. Releasing a model’s code publicly for viewing, adaptation, or use undermines this as it would enable the model to be significantly altered by unregulated actors post-hoc. Therefore, any new model or application that is captured by the licensing regime must not be open sourced.
Instead, external entities will be able to get meaningful access via API, which developers will be required to keep while the model meets the relevant threshold for frontier models. Failure to comply with this should result in severe penalties, including but not limited to: the model being instantly shut down and the developer having their license removed, fines for the developer, and criminal action taken against those involved in releasing the model publicly and found to be using the code in any other application.

Compliance requirement: developers must have mechanisms to shutdown their model and application if required temporarily or permanently. AI is still an immature field; practitioners often report that they do not fully know how relatively-modest changes to architecture or algorithms will impact the capabilities or risks of a model. Accordingly, the R&D and deployment processes must be treated as inherently less certain than, for example, traditional mechanical engineering, and as having some risks of generating significant disaster.
It is not guaranteed that we will have any observable warning signs before an R&D effort goes catastrophically wrong. However, right now humanity does not have processes to systematically detect warning signs, nor do we have systematic processes to investigate them, take corrective action, and learn from the issue and disseminate corrective fixes broadly.
Therefore, in order to have a license for training and deploying frontier models, developers must document and prove to the regulator that they have clear and stress-tested measures in place for how to shutdown a model. As with failure to comply with the license obligations, failure to perform a required shutdown, or negligent failure to maintain and regularly test shutdown capabilities, would result in the revocation of their frontier AI license.

Compute Licensing (CL)

The operation of data centers and provision of cloud computing services above a predetermined threshold of compute should be subject to the issuance of a license by the relevant national regulatory authority. Possessing a license should be a precondition to being able to operate and provide services over that predetermined threshold to companies in that jurisdiction.

Cloud computing services are integral to nearly all advanced artificial intelligence development and applications, from training to inference. Through the identification of relevant clusters and by placing meaningful constraints on their capacity, regulators can deploy effective brakes on the development of models and limit access to applications displaying concerning capabilities.

The operation of large-scale data centers is relatively easy to observe and monitor, given their large land requirements detectable via the planning system, their physical footprint making them often observable via satellite, and their large energy consumption. Their fixed location and large footprint makes them a natural chokepoint for regulators to monitor and intervene on, as well as a natural focus for mutual verification under international agreements.

The proposal introduces a licensing requirement for any company operating data centers with a total compute capacity of 10^17 FLOP/s. This regime will ensure that larger, more resource-intensive facilities are subject to oversight and must meet relevant regulatory requirements.

Each jurisdiction will need to determine the number and nature of the requirements on compute providers to successfully be granted a CL, however, at a minimum the following requirements should be implemented:

Compliance requirement: compute providers must implement ‘Know Your Customer (KYC) Rules’.27 Companies must adhere to KYC regulations tailored for the cloud computing and AI industries28, which require them to verify the verifying client identities, tracking the use of compute resources, and reporting any high-risk entities to the government. This is intended to close existing gaps in export controls, prevent misuse of advanced AI technologies, and support responsible AI development by enabling more precise and targeted regulatory interventions.

Compliance requirement: compute providers must have adequate hardware tracking capabilities. Companies will be required to track the physical hardware used in their data centers. While this may eventually involve the use of secure GPUs with serial numbers and physical tracking capabilities, aligning with relevant export controls, that technology is not yet widely available. An interim requirement29 could be implemented, where companies would use physical GPS trackers on their existing hardware to comply with tracking and security standards.

Compliance requirement: compute providers must implement shutdown mechanisms. In tandem with the shutdown measures highlighted in the implementation of TLs, compute providers must be clearly identified through redundant reporting chains to regulators – both by the frontier AI developers themselves, and through a KYC-like reporting process by compute providers and other supply chain participants. This would enable randomized spot checks by auditors to confirm if frontier AI companies have properly coordinated with their supply chain and counterparties and arranged for shutdown procedures to be implemented. Therefore, in the case of an emergency, a compute provider and/or an AI developer can be called upon to shutdown the model. In addition, this would strongly incentivise frontier AI companies to only use the compute providers with the most rigorous safety protocols.

It is likely that through the introduction of this CL, a change in incentives will mean new technologies will emerge over time that will assist the compute supply chain in being able to control the use of their resources and help with the enforcement of license requirements. For instance, in the future, the national AI regulator could make it a requirement that in order to receive a license, the AI developer must use hardware providers that have Hardware-Enabled Governance Mechanisms (HEMs) so that they can remotely deactivate chips if they are either ordered to do so by the national regulator.

Application Licensing (AL)

Any new use of an AI model approved through the TL process would need to seek approval for that new use. This is to ensure that any additional capabilities the new use creates are in keeping with the original approval of the TL and that restrictions, such as prohibited behaviors like self-replication, are not developed on top of pre-approved models. This would include connecting to an AI model through an API for it to run some or all of your product, or undertaking additional fine-tuning or research on said model.

Depending on the extent of the modifications to the base model or the exact proposed use, the applicant would be required to demonstrate the capabilities its proposed application would have and set out any additional relevant safety features and protocols that may be needed. If the regulator is satisfied that there was no significant risk to deployment, it would authorize the requested use. Any applications that do not change or modify the base model’s capabilities, and do not result in structural manipulations like using it to train a smaller model or creating multimodal capabilities, would receive an automatic authorisation upon submission.

Applications based on models that had received a TL would be required to submit notification to the regulator. It would be the duty of the applicant to confirm whether their application is designed to increase the model’s capabilities or not. An automatic AL would be granted to applicants but the national AI regulator would be able to identify any concerning applications and take further investigations or enforcement action if necessary. This ensures a streamlined process for deploying new applications while maintaining regulatory awareness and oversight of the use of advanced AI systems.

Specifically, anyone seeking an AL should confirm their application will not draw on further compute resources for training such as using a TL model to train a smaller model, and that the application will not exceed the benchmark for human capabilities defined by the TL. This benchmark serves as a clear, measurable threshold for an acceptable application.

To maintain regulatory control, applications could be shut down on short notice through a shutdown of the underlying model or the relevant compute cluster. This mechanism provides the regulator with the ability to quickly intervene if necessary, balancing innovation with potential risks.

Scope

What this policy affects:

The licensing regime should focus only on the most capable and general AI systems.

As noted, managing the extent of AI models’ general intelligence is a key element of this and fundamentally the implementation of the TL and CL seeks to drive and incentivize a safety-driven approach to frontier AI model development and use by including specific requirements and a pre-defined procedure for assessing models and applications.

Similarly, the AL only affects new applications, such as a commercial or non-commercial product, service, suite of products/services, or research project, that are based on a model trained using more compute than the thresholds defined for the TL.

What this policy does not affect:

Companies developing models and applications below the relevant compute and intelligence thresholds would not require licenses to operate and develop these products and services. However, companies would be expected to comply with the relevant regulatory limits, under penalty of severe legal repercussions in the event that thresholds are exceeded and companies operate beyond these thresholds without a license.

6. An International Treaty Establishing Common Redlines on AI Development

Policy

Alongside implementing the above measures nationally, countries should agree to them through an international treaty that creates a common regulatory framework across all signatory countries.

These measures are the ones described in the rest of Phase 0.

Create an international compute threshold system, designed to keep AI capabilities within estimated safe bounds.
Prohibit the development of superintelligent AI.
Prohibit unauthorized self-replication and the intentional development of systems capable of self-replicating
Prohibit unauthorized recursive self-improvement and the intentional initiation of recursive self-improvement activities.
Require states to establish regulators and implement licensing regimes.

In addition to internationalizing the other measures of Phase 0, the Treaty should include a provision to prohibit the use of AI models developed within non-signatory states. This is to incentivize participation in the Treaty, to prevent actors within the signatory states from circumventing the Treaty, and to simplify monitoring and enforcement.

Rationale

While countries can unilaterally implement the proposed measures in Phase 0, in doing so they would not have guarantees from other countries that they would do the same. Individual countries are currently incentivised to avoid implementing regulatory frameworks out of fear that other countries would be able gain a competitive advantage by implementing more lenient regulatory regimes.

These competitive dynamics may limit the potential for unilateral action, and therefore it is necessary for countries to agree to and commit to these redlines internationally. An international framework could avoid competitive pressures pushing regulatory standards to unacceptably low levels in a race to the bottom.

This will address all of the conditions: “limiting the general intelligence of AI systems”, “no AIs capable of breaking out of their environment”, “no AIs improving AIs”, and “no unbounded AIs”.

Precedents

Treaties are the standard mechanism to transform national rules into international law. There are multiple existing treaties to deal with high-risk technologies internationally, such as the significant, albeit imperfect, successes of the the Chemical Weapons Convention and the Treaty on the Non-Proliferation of Nuclear Weapons.

Implementation and enforcement

Countries should sign and ratify a treaty that both internationalizes the prohibitions of Phase 0, and establishes a compute Multi-Threshold System.

This treaty should then be enforced via the passage of national legislation.

This treaty will establish a Multi-Threshold System to determine the acceptable levels of compute. This will serve to harmonize the compute thresholds established by national licensing within an international treaty framework. Here is how the system will function.

Multi-Threshold System

Under the auspices of an international treaty, the compute thresholds established via the national licensing regime of Phase 0 should be internationally harmonized.

In doing so, an internationally upheld three limit system should be established, consisting of lower, middle, and upper limits. The lower level will be broadly permitted; the middle level, only by licensed entities; the upper level, only by an international institution with broad support across the international community, including the US and China, which we will label GUARD (Global Unit for AI Research and Development), and which is further developed and explained in Phase 1.

With these thresholds we aim to target:

The capabilities of models trained, using total FLOP training compute as a proxy.
The speed at which models are trained, using the performance of computing clusters in FLOP/second.

We can target capabilities in order to keep models within estimated safe bounds. We can also target the speed of training to limit the breakout time30 to attain dangerous capabilities for legal computing clusters conducting an illegal training run, providing time for authorities to intervene. This will be achieved by targeting the total throughput (as measured in FLOP/s - floating point operations per second) that a compute cluster can achieve in training.

These thresholds should be lowered as necessary, to compensate for more efficient utilization of compute (see below). This should be done by an international institution with broad support across the international community, which we will call the International AI Safety Commission (IASC). The upper threshold may be raised under certain conditions defined by a comprehensive AI treaty.

Note: In each limit regime, the largest permitted legal training runs could be run as quickly as within 12 days. For more information, see annex 2.

This compute threshold system should reflect the latest evidence to keep model capabilities within estimated safe bounds. The compute differences between the thresholds are designed to limit the breakout time of dangerous capabilities emerging through an illegal training run, thus providing time for authorities to intervene. This will be achieved by targeting the total throughput (as measured in FLOP/s - floating point operations per second) that a compute cluster can have in training.

Any AI system that passes a general intelligence benchmarking test is considered to be equivalent to having breached the Upper Compute Limit, and is thus also prohibited.

11 https://www.nist.gov/metrology

12 https://en.wikipedia.org/wiki/Defense_in_depth_(computing)

13 https://www.govinfo.gov/content/pkg/COMPS-1630/uslm/COMPS-1630.xml

14 https://disarmament.unoda.org/biological-weapons/

15 Note our discussion of safe harbors for security research below.

16 With additional stringencies or tailoring where needed based on the specific work being done, as in other regulatory processes.

17 For example, a LLM that when asked a question that requires inference compute capacity in excess of its current resources, and responds by gaining unauthorized access to another compute cluster to complete its work.

18 Analogous to how e.g., financial services industries have formal requirements, but also invest significantly in technology to ensure protections from fraud and other attackers.

19 Some limited amounts of exemptions may be implemented for pre-approved activities conducted in good faith by security researchers. A common failure mode of policies intended to enhance security is that they actually harm security by banning researchers from conducting research into failure modes of a security system. On such an important matter, we must not have a false sense of security. We must ensure that security researchers have appropriate safe-harbor exemptions, tailored in partnership with those researchers, to conduct and disclose research into how AI models that are designed to not conduct unauthorized access (e.g., should refuse requests to write a virus) can be tricked into doing so, such that they can disclose such flaws in good faith without fear of punishment to enable remediation of such issues.

20 Note: to be successful, these laws will have to be buttressed by strong norms that focus legal enforcement on the highest-risk scenarios. It took the legal system decades to properly focus its efforts of combatting unauthorized access on the most harmful actors, with much prosecutorial overreach on low-impact cases in the short term, as legal authorities across the spectrum have noted, which sabotaged the development of helpful norms and relationships in the information security field that could orchestrate efforts to stop unauthorized access. We do not have the time to repeat these mistakes.

21 https://blog.google/products/search/how-ai-powers-great-search-results/

22 https://blogs.microsoft.com/blog/2023/03/16/introducing-microsoft-365-copilot-your-copilot-for-work/

23 https://www.zoom.com/en/ai-assistant/

24 https://en.wikipedia.org/wiki/Atomic_Energy_Act_of_1954

25 “The underpinning safety aim for any nuclear facility should be an inherently safe design,

consistent with the operational purposes of the facility.

An ‘inherently safe’ design is one that avoids radiological hazards rather than

controlling them. It prevents a specific harm occurring by using an approach, design

or arrangement which ensures that the harm cannot happen, for example a criticality

safe vessel.” (EKP.1, p.37 of 2014 version)

26 https://en.wikipedia.org/wiki/Convolutional_neural_network

27 This is similar to what has been proposed by some companies.

28 See this for a more detailed proposal.

29 See this proposal for more detail.

30 https://en.wikipedia.org/wiki/Nuclear_proliferation#Breakout_capability

31 We can use the relationship: Cumulative training compute [FLOP] = Computing power [FLOP/s] * Time [s]. By controlling the amount of computing power that models can be trained with, we can manage the minimum amount of time that it takes to train a model with a particular amount of computation. Our aim in doing this is to control breakout times for licensed or unlicensed entities engaged in illegal training runs to develop models with potentially dangerous capabilities – providing time for authorities and other relevant parties to intervene on such a training run.

Introduction

Phase 1

Get Updates

Sign up to our newsletter if you'd like to stay updated on our work,
how you can get involved, and to receive a weekly roundup of the latest AI news.

If you have feedback on The Plan or want to know how you can help to support it please get in touch with us directly

If you have feedback on The Plan or want to know how you can help to support it please get in touch with us directly

Sign up to our newsletter if you'd like to stay updated on our work,
how you can get involved, and to receive a weekly roundup of the latest AI news.

Introduction

Phase 0

Phase 1

Phase 2

What Success Looks Like

Annexes

Phase 0: Safety

Conditions

No AIs improving AIs

No AIs capable of breaking out of their environment

No unbounded AIs

Limit the expected general intelligence of AI systems

Summary

1. Prohibit the development of Superintelligent AI

Policy

Definitions

Rationale

Precedents

Implementation and enforcement

Scope

2. Prohibit AIs capable of breaking out of their environment

Policy

Definitions

Rationale

Precedents

Implementation and enforcement

Scope

3. Prohibit the development and use of AIs that improve other AIs

Policy

Definitions

Rationale

Precedents

Implementation and enforcement

Scope

4. Require a valid safety case for deployment of AI systems

Policy

Definitions

Rationale

Precedents

Implementation and enforcement

Scope

5. A licensing regime and restrictions on the general intelligence of AI systems

Policy

Rationale

Precedents

Implementation and Enforcement

General Licensing Regime

Training Licensing (TL)

Compute Licensing (CL)

Application Licensing (AL)

Scope

6. An International Treaty Establishing Common Redlines on AI Development

Policy

Rationale

Precedents

Implementation and enforcement

Multi-Threshold System

Introduction

Phase 1

Get Updates