Can AI do AI Research?

AI research can be carried out entirely within digital spaces, making it ripe for automation. Recent efforts have demonstrated that AI systems can carry out the whole research process, from ideation to publication. The startup Sakana AI has created an 'AI Scientist' that independently chooses research topics, conducts experiments, and produces complete papers presenting its results. The quality of this work is still only comparable to that of an early-stage researcher, but it is likely to improve as the underlying models advance.

Judging Social Situations

AI chatbots, including Claude and Microsoft Copilot, can outperform humans in evaluating social situations. In an established 'Situational Judgment Test', these AI systems consistently selected more effective responses than human participants.

Analyzing Scientific Literature

While language models are known to hallucinate information, this tendency can be reduced. PaperQA2, an LLM-based system optimized to reliably provide factual information, was able to match or exceed human subject matter experts across a range of realistic literature review tasks. The summary articles it produced were found to be more accurate than those written by human authors.

Writing Emotive Poetry

A study has shown that non-expert readers can no longer tell AI-authored poems from those written by acclaimed human poets. The AI poems were also rated higher in rhythm and beauty.

Writing Post-surgical Operative Reports

Surgeons take painstaking notes of the actions they carry out during surgeries, collecting them into narrative form as an 'operative report'. A machine vision system was trained to watch surgery footage and produce such reports. It did so with higher accuracy (and much higher speed) than human authors.

Developing New Algorithms

AI systems can find innovative solutions to difficult coding problems when given an appropriate framing. For example, a dedicated system called AlphaDev was trained to play a game about creating sorting algorithms. The algorithms it discovered were novel and outperformed existing human-authored implementations.

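To give a sense of the kind of routine involved: AlphaDev worked on very short, fixed-length sorting routines at the level of individual assembly instructions. The Python sketch below is not AlphaDev's output. It is only an illustrative three-element routine, with the names `compare_exchange` and `sort3` chosen here for the example, built from a fixed sequence of compare-and-swap steps. In routines this small, saving even one step is a meaningful gain.

```python
from itertools import permutations


def compare_exchange(values, i, j):
    """Ensure values[i] <= values[j] by swapping the pair if needed."""
    if values[i] > values[j]:
        values[i], values[j] = values[j], values[i]


def sort3(a, b, c):
    """Sort three values using a fixed sequence of three compare-exchange steps."""
    values = [a, b, c]
    compare_exchange(values, 0, 1)
    compare_exchange(values, 1, 2)
    compare_exchange(values, 0, 1)
    return tuple(values)


# Exhaustively check every ordering of three distinct inputs.
assert all(sort3(*p) == (1, 2, 3) for p in permutations((1, 2, 3)))
```

Because the sequence of steps never depends on the input, a routine like this can be checked exhaustively, which is part of what makes it a well-defined target for automated search.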

Who is Building AGI?

The following companies have explicitly stated they intend to develop AGI, either through public statements or in response to FLI’s 2024 AI Safety Index survey:

Anthropic

OpenAI

Google DeepMind

Meta

xAI

Zhipu AI

Alibaba

DeepSeek

How can we avoid AGI?

There are policies we can implement to avoid some of the dangers of rapid power-seeking through AI. They include:

Compute accounting
Standardized tracking and verification of AI computational power usage

Compute caps
Hard limits on computational power for AI systems, enforced through law and hardware

Enhanced liability
Strict legal responsibility for developers of highly autonomous, general, and capable AI

Tiered safety standards
Comprehensive safety requirements that scale with system capability and risk

TOMORROW’S AI

PharmaSim

Intended Use: Health and Safety

Technology Type: Advanced Computing

Runaway Type: Loss of Shared Reality

Control Lever: Cooperation and Collaboration

Primary Setting: Switzerland

The Treatment Gap

By the early 2030s, drug development has hit a wall. Trials are too slow, too expensive, and too generic. Millions with rare diseases or minority genetic profiles are excluded. As trust in pharmaceutical institutions fades, Switzerland’s Institute for Biomedical AI launches a bold initiative: PharmaSim, an AI-powered digital twin platform designed to reinvent drug discovery and development.

PharmaSim

PharmaSim simulates how new compounds interact with organs, genes, and microbiomes across vast virtual patient populations. Built by a coalition of biotech firms, AI labs, and public health institutions, the system predicts drug efficacy, metabolism, and side effects in weeks rather than years. Its developers estimate that drug development costs could drop by 70%, and treatments begin to be tailored to age, ancestry, diet, and even regional gut flora.

Digital Twins

Within three years, digital twin modeling revives dozens of shelved compounds (including many that are generic or plant-based), and discovers off-label uses that prove more effective than blockbuster drugs. Hospitalizations for metabolic side effects fall by 32%, and targeted cancer therapies double remission rates. Regulators, encouraged by these outcomes, begin piloting AI-augmented approval pipelines, allowing early in silico review before human trials.

Whose Twins?

But PharmaSim does not benefit everyone. Its training data is heavily weighted toward European and North American biobanks. A string of unanticipated side effects from PharmaSim-discovered compounds occurs in clinical trials in South Asia, none of which appeared in computational toxicology simulations. In Africa and South America, repurposed compounds that PharmaSim identified as effective in Switzerland prove largely ineffective in local populations.

The Future of Medicine

In response, a philanthropic consortium funds a biobanking moonshot for the Global South. India and Brazil also launch rival platforms using open clinical and biospecimen data, establishing alternative AI pharmacology standards.

Despite initial frictions, PharmaSim ushers in a new era of truly personalized medicine. As global regulators coalesce around shared data standards and equity frameworks, a new generation of digital twin tools emerges, capable of finer-scale recommendations and simulations. With them comes a future where healthcare isn’t just personal, but bespoke.

