Opinion | We need a Manhattan Project for AI security

At the heart of the threat is what’s called the alignment problem, the idea that a powerful computer brain may no longer be aligned with the best interests of human beings. Unlike equity or job loss, there are no obvious political solutions to alignment. It’s a highly technical problem that some experts fear may never be fixable. But the government does have a role to play in tackling huge and uncertain problems like this. Indeed, it may be the most important role it can play on AI: funding a research project on the scale it deserves.

There’s a successful precedent for this: The Manhattan Project was one of the most ambitious technology ventures of the 20th century. At its peak, 129,000 people worked on the project at sites across the United States and Canada. They were trying to solve a problem that was fundamental to national security and that no one was sure could be solved: how to harness nuclear energy to build a weapon.

Some eight decades later, the need has arisen for a government research project that matches the scale and urgency of the original Manhattan Project. In some ways the goal is exactly the opposite of the first Manhattan Project, which opened the door to previously unimaginable destruction. This time, the goal must be to prevent unimaginable destruction, as well as destruction that is simply hard to predict.

The threat is real

Don’t take it from me. Expert opinion differs only on whether the risks of AI are unprecedented or literally existential.

The scientists who laid the foundations for today’s AI models are also sounding the alarm. Most recently, the “Godfather of AI” himself, Geoffrey Hinton, left his job at Google to call attention to the risks that AI poses to humanity.

It may sound like science fiction, but it’s a reality that is rushing towards us faster than almost anyone expected. Today, progress in AI is measured in days and weeks, not months and years.

Just two years ago, the forecasting platform Metaculus put the likely arrival of “weak” artificial general intelligence (AGI), a unified system that could compete with the typical college-educated human on most tasks, at around the year 2040.

Now forecasters predict that AGI will arrive in 2026. Powerful AGIs with robotic capabilities equaling or surpassing most humans are predicted to emerge just five years later. With the ability to automate AI research itself, the next milestone would be a superintelligence with unfathomable power.

Don’t count on the normal government channels to save us from that.

Policymakers cannot afford a lengthy interagency process or notice-and-comment period to prepare for what is coming. On the contrary, making the most of AI’s massive upside while avoiding catastrophe will require our government to stop taking a backseat and act with an agility not seen in generations. Hence the need for a new Manhattan Project.

The research agenda is clear

A “Manhattan Project for X” is one of those clichés of American politics that rarely deserves the hype. AI is the rare exception. Ensuring AGI develops safely and for the betterment of humanity will require public investment in focused research, high levels of public and private coordination, and a leader with the tenacity of General Leslie Groves, the original project’s infamous overseer, whose aggressive, top-down leadership style mirrored that of a modern technology CEO.

I’m not the only person to suggest this: AI thinker Gary Marcus and legendary computer scientist Judea Pearl recently endorsed the idea as well, at least informally. But what exactly would it look like in practice?

Fortunately, we already know enough about the problem to sketch out the tools we need to address it.

One problem is that large neural networks like GPT-4, the generative AIs that are causing the most concern right now, are mostly a black box, with reasoning processes we cannot yet fully understand or control. But with the right setup, researchers can in principle perform experiments that uncover particular circuitry hidden within billions of connections. This is known as mechanistic interpretability research, and it is the closest thing we have to neuroscience for artificial brains.

Unfortunately, the field is still young and far behind in understanding how current models do what they do. The ability to run experiments on large, unrestricted models is mostly reserved for researchers within major AI companies. The paucity of opportunity in mechanistic interpretability and alignment research is a classic public goods problem. Training large AI models costs millions of dollars in cloud computing services, especially if you iterate through different configurations. Private AI labs are therefore reluctant to burn capital on training models with no commercial purpose. Government-funded data centers, by contrast, would have no obligation to return value to shareholders and could provide free computing resources to thousands of potential researchers with ideas to contribute.

The government could also ensure that research proceeds in relative safety and provide a central hub for experts to share their knowledge.

With all of this in mind, an AI security Manhattan Project should have at least 5 main functions:

1. It would serve a coordination role, bringing together the leadership of the top AI companies (OpenAI and its main competitors, Anthropic and Google DeepMind) to disclose their plans in confidence, develop shared security protocols and prevent the current arms-race dynamic.

2. It would draw on their talent and experience to accelerate the construction of government-owned data centers operated with maximum security, including an “air gap,” a deliberate disconnection from external networks, so that future, more powerful AIs cannot escape onto the open internet. Such facilities would likely be overseen by the Department of Energy’s Office of Artificial Intelligence and Technology, given its existing mission to accelerate the demonstration of reliable AI.

3. It would compel participating companies to collaborate on security and alignment research, and would require models that pose security risks to be trained and extensively tested in secure facilities.

4. It would provide public testbeds for academic researchers and other outside scientists to study the insides of large models like GPT-4, building heavily on existing initiatives like the National AI Research Resource and helping to grow the nascent field of AI interpretability.

5. And it would provide a cloud platform for training advanced AI models for needs within government, ensuring the privacy of sensitive government data and serving as a bulwark against runaway corporate power.

The only way out is through

The alternative to a massive public effort like this, attempting to kick the can on the AI problem, isn’t going to cut it.

The only other serious proposition right now is a pause on new AI development, and even many tech skeptics consider it unrealistic. It could even be counterproductive. Our understanding of how powerful AI systems could go rogue is immature at best, but it is set to improve dramatically through continued testing, particularly of larger models. Air-gapped data centers will therefore be essential for experimenting with AI failure modes in a secure environment. This includes pushing models to the limit to explore potentially dangerous emergent behaviors, such as deception or power-seeking.

The Manhattan Project analogy isn’t perfect, but it helps draw a contrast with those who argue that the security of AI requires a pause in research into more powerful models. The project was not aimed at slowing down the construction of atomic weapons, but at mastering it.

Even if AGIs end up being further off than most experts expect, an AI security Manhattan Project is unlikely to go to waste. Indeed, many less-than-existential AI risks are already upon us, requiring aggressive research into mitigation and adaptation strategies. So what are we waiting for?
