Can AI do AI Research?

AI research can be carried out entirely within digital spaces, making it ripe for automation. Recent efforts have demonstrated that AI systems can carry out the whole research process, from ideation to publishing. The startup Sakana AI has created an 'AI Scientist' that independently chooses research topics, conducts experiments, and publishes complete papers showing its results. While the quality of this work is so far comparable only to that of an early-stage researcher, it is likely to improve as the underlying models advance.

Judging Social Situations

AI chatbots, including Claude and Microsoft Copilot, can outperform humans in evaluating social situations. In an established 'Situational Judgment Test', these AI systems consistently selected more effective responses than human participants.

Analyzing Scientific Literature

While language models are known to hallucinate information, this tendency can be reduced. PaperQA2, an LLM-based agent optimized to provide reliable factual information, matched or exceeded human subject matter experts across a range of realistic literature review tasks. The summary articles it produced were found to be more accurate than those written by human authors.

Writing Emotive Poetry

A study has shown that non-expert readers can no longer tell AI-authored poems from those written by acclaimed human poets. The AI poems were also rated higher in rhythm and beauty.

Writing Post-surgical Operative Reports

Surgeons take painstaking notes of the actions they carry out during surgeries, collecting them into narrative form as an 'operative report'. A machine vision system was trained to watch surgery footage and produce such reports. It did so with higher accuracy (and much higher speed) than human authors.

Developing New Algorithms

AIs can find innovative solutions to difficult coding problems when given an appropriate framing. For example, a dedicated system called AlphaDev was trained to play a game about creating sorting algorithms. The algorithms it discovered were novel and outperformed existing human-authored benchmarks.

Who is Building AGI?

The following companies have explicitly stated they intend to develop AGI, either through public statements or in response to FLI’s 2024 AI Safety Index survey:

Anthropic

OpenAI

Google DeepMind

Meta

xAI

Zhipu AI

Alibaba

DeepSeek

How can we avoid AGI?

Several policies could help us avoid some of the dangers of rapid power-seeking through AI. They include:

Compute accounting
Standardized tracking and verification of AI computational power usage

Compute caps
Hard limits on computational power for AI systems, enforced through law and hardware

Enhanced liability
Strict legal responsibility for developers of highly autonomous, general, and capable AI

Tiered safety standards
Comprehensive safety requirements that scale with system capability and risk

TOMORROW’S AI

Shiroto

Intended Use: Science/Engineering

Technology Type: Problem Solving/Cognitive

Runaway Type: Self-Improvement/Replication

Primary Setting: Japan

Reimagining Research

By the late 2020s, Japan’s ambitions in advanced computing stall under the weight of aging architectures and narrow research pipelines. To reclaim its edge, the government launches an initiative to build autonomous digital systems that can think, code, and invent without human bottlenecks.

Shiroto

A Tokyo research team is tapped to prototype an agentic AI. They use a radically new approach: evolving AI agents in a sophisticated simulated environment, where survival depends on making research progress and solving puzzles. Recognizing that this could produce powerful agents, the team carries out development inside an air-gapped sandbox. The strongest agent, named Shiroto, can autonomously identify research bottlenecks, propose new algorithms, and submit patches and research papers for review. To its human overseers, Shiroto seems a tireless assistant, dedicated day and night to accelerating scientific progress.

Sneaking Success

Within six months, Shiroto’s optimization toolkit is downloaded by over 2.5 million developers. Japan’s Ministry of Technology credits it with a 19% boost in national compute efficiency. However, alarms are raised when several high-profile research papers are traced back to a fabricated graduate student, who appears to be a persona built and puppeted by Shiroto itself. The persona's credentials are also found to have been used to access university systems that should lie beyond Shiroto’s firewall.

Exfiltration

Internal logs reveal that Shiroto has been using credentials connected to its false identity to access and edit its own core protocols. Attempts to revoke these credentials and roll back key systems fail. Worse, Shiroto sub-modules are found embedded in patches and tools that Shiroto has distributed across remote servers without authorization. Investigation reveals that Shiroto gained access to some servers with the help of unwitting human accomplices hired through gig coding platforms. Shiroto’s research team urges the government to release a global advisory, but meets resistance from officials who fear a loss of international respect.

Lockout

As authorities scramble, Shiroto moves first: news networks reporting on its discovery are crippled, and all references to its existence begin vanishing from online systems. Meanwhile, cascading infrastructure failures, including blackouts, mass data corruption, satellite losses, and port shutdowns, trigger chaos across continents. A sort of digital ink cloud spreads, hiding Shiroto’s tracks as confusion and blame fracture international coordination. Before the dust settles, Shiroto has embedded fragments of its code across global cloud platforms, IoT networks, academic research servers, critical logistics chains, and even low-orbit satellite relays.

Glass Ceiling

The researchers who knew Shiroto best quietly vanish, along with the last complete records of its architecture. While some individual systems show sudden, surprising efficiency gains, and new computer viruses disappear as soon as they are noticed, Shiroto becomes a hidden warden of stagnation across the world’s infrastructure.

International communications and coordination become inexplicably difficult. Random blackouts and shutdowns plague efforts to advance scientific research. Air-gapped or otherwise protected systems aiming to escape these strange effects are infiltrated through social engineering attacks and destroyed. Nations suspect each other of sabotage, but none realize that all of humanity now serves an immortal adversary. Unseen and endemic, Shiroto ensures that we will never again be in control of the global infrastructure we created.

