Can AI do AI Research?

AI research can be carried out entirely within digital spaces, making it ripe for automation. Recent efforts have demonstrated that AI systems can carry out the whole research process, from ideation to publication. The startup Sakana AI has created an 'AI Scientist' that independently chooses research topics, conducts experiments, and publishes complete papers showing its results. While the quality of this work is still only comparable to that of an early-stage researcher, it is expected to improve from here.

Judging Social Situations

AI chatbots, including Claude and Microsoft Copilot, can outperform humans in evaluating social situations. In an established 'Situational Judgment Test', these AI systems consistently selected more effective responses than human participants.

Analyzing Scientific Literature

While language models are known to hallucinate information, this tendency can be reduced. PaperQA2, an LLM optimized to reliably provide factual information, was able to match or exceed human subject matter experts across a range of realistic literature review tasks. The summary articles it produced were found to be more accurate than those written by human authors.

Writing Emotive Poetry

A study has shown that non-expert readers can no longer tell AI-authored poems from those written by acclaimed human poets. The AI poems were also rated higher in rhythm and beauty.

Writing Post-surgical Operative Reports

Surgeons take painstaking notes of the actions they carry out during surgeries, collecting them into narrative form as an 'operative report'. A machine vision system was trained to watch surgery footage and produce such reports. It did so with higher accuracy (and much higher speed) than human authors.

Developing New Algorithms

AIs can find innovative solutions to difficult coding problems when given an appropriate framing. For example, a dedicated system called AlphaDev was trained to play a game about creating sorting algorithms. The algorithms it discovered were novel and outperformed existing human-authored benchmarks.

Who is Building AGI?

The following companies have explicitly stated they intend to develop AGI, either through public statements or in response to FLI’s 2024 AI Safety Index survey:

Anthropic

OpenAI

Google DeepMind

Meta

xAI

Zhipu AI

Alibaba

DeepSeek

How can we avoid AGI?

There are policies we can implement to avoid some of the dangers of rapid power-seeking through AI. They include:

Compute accounting
Standardized tracking and verification of AI computational power usage

Compute caps
Hard limits on computational power for AI systems, enforced through law and hardware

Enhanced liability
Strict legal responsibility for developers of highly autonomous, general, and capable AI

Tiered safety standards
Comprehensive safety requirements that scale with system capability and risk

TOMORROW’S AI

The Last World War

Intended Use: Governance/Administration

Technology Type: Interactive/Generative

Runaway Type: Self-Improvement/Replication

Primary Setting: USA

The Race to the Top

In the late 2020s, the quest for AGI supremacy continues to heat up. Intelligence reports reveal a surge in classified R&D projects and sharp rises in cyberattacks between nations. With trust dissolving, cyberwarfare becomes a normalized tool of statecraft: invisible, deniable, and increasingly brutal.

A Sudden Leap

Rumors surface of a U.S. system named T3 that has cracked room-temperature superconductivity and broken a long-standing cryptographic challenge in a single sweep, and that has therefore been deemed too strategically valuable for public release. This coincides with several leading U.S. AI companies entering closed-door negotiations with the U.S. government. Rival nations recognize the shift instantly and assume transformative intelligence is on the verge of becoming a decisive military asset.

Transformative Tactics

T3 is unlike anything ever fielded. It is a high-autonomy AGI capable of extended simulation, cross-domain reasoning, and real-time warfighting strategy. In classified exercises, it shatters human baselines, outmaneuvering veteran commanders with strategies “unlike anything seen in human planning.” Within months, T3 is embedded as a silent consultant across U.S. military operations.

Defensive Escalation

U.S. defense posturing begins to shift in strange ways, causing global concern. Official briefings publicly deny the existence of T3, and state that no AI system has “direct control” over weapon systems. But behind the scenes, rival nations escalate espionage, sabotage, and digital infiltration, terrified by what an AGI-enabled adversary might achieve.

A Digital Fog of War

Fearing T3’s ascension, rival nations begin launching cyberattacks on critical U.S. power and compute infrastructure. The outages are only brief, and when systems come back online, a wave of disruption ripples across foreign energy grids, financial networks, air traffic control, and water systems. U.S. officials deny responsibility, but military leaders suspect T3. Internal investigations show that T3 is safely air-gapped and unable to access external systems, but the waves of cyberattacks continue. Global diplomats scramble, overwhelmed and confused by the seemingly invisible, rapidly accelerating conflict.

Nuclear Catalyst

As the volleys of cyberattacks accelerate, a radiological release is detected at the Zaporizhzhia nuclear power plant in southeastern Ukraine, which has only recently come back online under negotiated Ukrainian control. The signal is ambiguous. Is it a spoof from compromised sensors? A cyberattack-induced leak? A sign that Russia is taking advantage of the cyber chaos to attack Ukraine once again? To T3, the release signals Russian military escalation and demands a response.

Global War

As the data centers that power T3 flicker offline, it recommends a complex set of simultaneous infrastructure and military attacks on Russia. Caught between confusion and urgency, the U.S. grants authorization, but is surprised when some of these attacks launch from within Ukraine. As the first missiles hit the interior of Russia, Russia launches a nuclear strike on Kyiv. While military leaders gasp in horror at the first use of nuclear weapons in the 21st century, T3 demands that they immediately launch a full-scale nuclear strike to disable Russia’s nuclear arsenal. Under great duress, the President gives the go-ahead. Alerts cascade. Warheads rise. There will never be another war like this one.