
What Is AGI? Artificial General Intelligence Explained

By Amara | Updated 7 May 2026
[Figure: human neural network silhouette in gold filaments beside AI circuit architecture in teal, illustrating the gap between current narrow AI and AGI]

Key Numbers

  • 0: True AGI systems confirmed to exist as of May 2026 (AI research consensus, 2026)
  • 50%: Probability DeepMind CEO Demis Hassabis assigns to AGI arriving before end of 2030 (Hassabis, Wall Street Journal, 2025)
  • 85% vs 95%: Best AI score versus human baseline on ARC-AGI, the benchmark designed to measure genuine reasoning (ARC Prize Foundation, 2024)
  • 10 years: Compression in the Metaculus AGI median forecast during a single 12-month period, from 2041 to 2031 (Metaculus, 2024)
  • $21-45B: Annual AGI-focused R&D spend across OpenAI, DeepMind, Anthropic, and Meta AI, 2025 estimates (S&P Global, Pitchbook, 2025)

Key Takeaways

  1. AGI is AI capable of performing any intellectual task a human can, across every domain, without task-specific retraining. No AGI system exists in 2026. Current frontier models fall approximately 10 percentage points below the human baseline on ARC-AGI, the benchmark designed to test genuine generalisation.
  2. Expert timelines range from 3 years (Sam Altman, OpenAI) to never (Gary Marcus, NYU). The Metaculus prediction market compressed its median forecast by 10 years in a single 12-month period during 2024, illustrating why specific dates are unreliable. The $21-45B in annual AGI R&D is a better signal than any single timeline prediction.
  3. McKinsey estimates current narrow AI adds $2.6-4.4 trillion annually to the global economy. AGI would surpass that by automating cognitive labour across all industries simultaneously. Safety researchers at Anthropic and the UK AISI treat the AGI threshold as a critical decision point because misaligned AGI goals pose catastrophic and potentially irreversible risks.

AGI means artificial general intelligence: an AI system capable of performing any intellectual task a human can do, across every domain, without being retrained for each new problem. No such system exists in May 2026.

The term sounds technical but the concept is simple. Today's AI systems are narrow. GPT-5.2 is exceptional at text. AlphaFold is exceptional at protein folding. ChatGPT Images 2.0 is exceptional at image generation. None of them can switch fluidly between all three, let alone repair a car, write legislation, or mentor a student through a personal crisis. AGI would handle all of those things and learn new ones without human engineers redesigning the model each time.

Here is the counterintuitive fact. According to a 2023 survey of 2,778 AI researchers published in Science, the median expert prediction for a 50% chance of high-level machine intelligence arriving was 2059. One year later, the Metaculus prediction market moved its median estimate from 2041 to 2031, a compression of 10 years in 12 months. The gap between expert caution and market sentiment has never been wider. After reading this, you will understand exactly what AGI requires, why today's frontier models are not it, what the most credible timelines actually say, and why the debate matters for anyone whose work touches AI.

What Is AGI? The Definition That Actually Matters

AGI is a threshold, not a product. It refers to an AI that can perform any cognitive task a human can, at or above human level, including tasks the system has never explicitly been trained on. The word "general" carries the full weight of the definition: today's AI excels at specific tasks. AGI excels at all of them.

The term was popularised in the early 2000s by researcher Ben Goertzel, though the underlying concept appears in Alan Turing's 1950 paper Computing Machinery and Intelligence. Turing's imitation game was essentially a test of general intelligence, not narrow capability. The question was not whether a machine could perform a specific task. It was whether a machine could perform the full range of tasks that identify something as intelligent.

Defining AGI precisely turns out to be harder than it sounds. Researchers do not agree on a single test. Four camps dominate the field:

| Definition Camp | What AGI Must Do | Key Proponents |
|---|---|---|
| Task-complete | Score at the highest human percentile on any cognitive benchmark suite | DeepMind (2023 framework) |
| Economic equivalence | Perform 100% of economically valuable work a human can do | OpenAI, Sam Altman |
| Cognitive architecture | Reason, plan, learn, and generalise across domains without retraining | Academic AI safety community |
| Consciousness-agnostic | Outperform humans on general intelligence tests regardless of inner experience | Demis Hassabis, DeepMind |

The lack of consensus has real consequences. OpenAI's charter defines AGI as "highly autonomous systems that outperform humans at most economically valuable work." DeepMind's 2023 AGI Levels framework defines five tiers from emergent to superhuman. When Sam Altman says AGI could arrive in a few years and Gary Marcus says it may never arrive, they are often talking about different thresholds.

For practical purposes, the most useful working definition is this: AGI exists when a single AI system can reliably outperform the median human on MMLU (covering 57 subject areas), pass ARC-AGI (abstract pattern reasoning), and demonstrate sustained multi-step planning across genuinely novel domains, all without task-specific fine-tuning.

Current frontier models do not meet all three criteria simultaneously.
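
To make the working definition concrete, here is a minimal sketch of it as a checklist. The function, the threshold values, and the `frontier_2024` scores are illustrative assumptions, not measurements; only the three criteria themselves come from the definition above.

```python
# Minimal sketch of the three-part working definition above.
# Thresholds and scores are assumed placeholder values, not real measurements.

def meets_working_agi_definition(candidate: dict) -> bool:
    """True only if a single system clears all three criteria
    without task-specific fine-tuning."""
    return (
        candidate["mmlu"] >= 0.89              # at/above human experts on MMLU
        and candidate["arc_agi"] >= 0.95       # at/above human baseline on ARC-AGI
        and candidate["novel_planning"]        # sustained multi-step planning in novel domains
        and not candidate["task_specific_finetuning"]
    )

# A 2024-era frontier model under these assumed numbers: strong on MMLU,
# short of the human ARC-AGI baseline, so the conjunction fails.
frontier_2024 = {
    "mmlu": 0.90,
    "arc_agi": 0.85,
    "novel_planning": False,
    "task_specific_finetuning": False,
}
print(meets_working_agi_definition(frontier_2024))  # False
```

The point of the conjunction is that AGI is defined by clearing every criterion at once; excelling at any single one is exactly what narrow AI already does.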

AGI vs Narrow AI vs ASI: The Three Tiers of AI Intelligence

The AI landscape has three tiers. Understanding where current systems sit makes it clear what AGI would represent as a step change, not just an incremental improvement.

| Type | Definition | Examples | Current Status |
|---|---|---|---|
| Narrow AI (ANI) | Expert in one domain, fails outside it | GPT-5.2, Claude Sonnet 4.6, Gemini 2.5 Pro, AlphaFold | Exists and deployed at scale |
| General AI (AGI) | Performs any intellectual task at human level | None confirmed | Does not exist |
| Superintelligence (ASI) | Exceeds human intelligence across all domains by a large margin | None confirmed | Does not exist |

Narrow AI is what you interact with every time you use ChatGPT, Google Search, Spotify recommendations, or a GPS navigation system. These systems are genuinely impressive. GPT-5.2 and its predecessors score above the 90th percentile on bar exams and medical licensing tests. But ask them to adapt to a genuinely new task format they have never encountered, and they fail in ways a ten-year-old would not. The training data boundary is visible once you probe hard enough.

The boundary between narrow AI and AGI is fuzzy and contested. Some researchers argue frontier models like GPT-4 already exhibited "sparks of AGI," a term used by Microsoft Research in a 2023 paper, and that newer systems like Claude Sonnet 4.6 and GPT-5.2 have pushed further. Others argue this reflects pattern-matching on training data rather than genuine generalisation. The ARC-AGI benchmark was designed specifically to distinguish these two cases by using novel visual reasoning tasks that require genuine abstraction rather than memorisation of seen patterns.

ASI, or artificial superintelligence, sits beyond AGI. It describes a system that surpasses the collective intelligence of all humans, not just the median human. This is the scenario that motivates safety researchers at Anthropic, the UK AI Safety Institute, and Geoffrey Hinton, who left Google in 2023 to speak freely about ASI risk. The path from AGI to ASI could be fast (recursive self-improvement) or slow (incremental scaling). Nobody knows which, and that uncertainty is a large part of why the safety debate is so heated.

For context on the physical compute infrastructure that any AGI system would require, see our article on hyperscale data centers and GPU compute.

Is ChatGPT an AGI? What the Benchmarks Actually Show

No. ChatGPT, including GPT-5.2 and o3, is not an AGI. It is a very capable narrow AI system that covers many text domains but remains narrow in the sense that it lacks genuine generalisation, embodied reasoning, and sustained autonomous planning on novel tasks.

Benchmarks make the gap concrete:

| Benchmark | What It Tests | Best AI Score (2024) | Human Baseline |
|---|---|---|---|
| MMLU | Knowledge across 57 domains | 90%+ (GPT-5.2, Claude Sonnet 4.6) | ~89% (human experts) |
| ARC-AGI | Novel visual pattern reasoning | ~85% (o3) | ~95% |
| HumanEval | Code generation | ~90% (GPT-5.2) | ~75% (typical engineers) |
| MATH | Competition mathematics | ~90% (o3) | ~40% |
| BIG-Bench Hard | Multistep novel reasoning | ~65-75% (top models) | ~75% |

The ARC-AGI number is the most informative. ARC-AGI was created by François Chollet at Google specifically to resist memorisation. It tests fluid reasoning on visual grids that no training data can prepare a model for. The best AI score as of late 2024 is approximately 85%, from OpenAI's o3 model. Humans score around 95%. That 10-point gap is not a rounding error. It represents the qualitative difference between a system that retrieves and interpolates versus one that reasons from first principles.

Another gap worth noting: energy and compute. A human brain runs on roughly 20 watts of power to do everything it does, including vision, movement, language, and long-horizon planning. GPT-5.2 and similar large frontier models require thousands of H100 GPUs to serve users at scale, each drawing 700 watts. The compute efficiency gap between current AI and biological cognition is approximately six orders of magnitude. This does not prove AGI cannot be built. But it strongly suggests that current architectures are far from human-equivalent in how they arrive at intelligent behaviour.
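
A back-of-envelope version of that efficiency comparison, using the wattage figures above plus an assumed fleet size (the 10,000-GPU number is a placeholder for illustration, not a disclosed deployment figure):

```python
import math

BRAIN_WATTS = 20          # approximate human brain power draw
GPU_WATTS = 700           # NVIDIA H100 maximum power draw per GPU
ASSUMED_FLEET = 10_000    # placeholder serving-fleet size, not a disclosed figure

fleet_watts = GPU_WATTS * ASSUMED_FLEET   # 7,000,000 W
ratio = fleet_watts / BRAIN_WATTS         # 350,000x

print(f"fleet draw: {fleet_watts / 1e6:.1f} MW")
print(f"ratio: {ratio:,.0f}x (~{math.log10(ratio):.1f} orders of magnitude)")
```

Under these placeholder numbers the gap is five to six orders of magnitude; the exact figure depends on fleet size and utilisation, but the conclusion does not.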

"GPT-4 is an impressive but narrow system. It does not generalise to novel tasks the way humans do." (Yann LeCun, Chief AI Scientist, Meta, 2024)

The implication is significant. If frontier models cannot clear ARC-AGI at human levels despite hundreds of billions in training compute, the path to AGI is not simply "more scale." Something architecturally different may be required, and the research community is split on what that something is.

When Will AGI Arrive? Expert Timelines and the Forecast Problem

Expert timelines for AGI span three decades and disagree by an order of magnitude. That disagreement is itself the most informative data point.

| Expert or Organisation | AGI Timeline Prediction | Stated Basis |
|---|---|---|
| Sam Altman (OpenAI CEO) | "A few years" (2027-2028) | Scaling law extrapolation |
| Demis Hassabis (DeepMind CEO) | 50% probability before end of 2030 | Internal research roadmap |
| Dario Amodei (Anthropic CEO) | Possibly by 2026-2027 | Frontier model capability curve |
| Yann LeCun (Meta Chief AI Scientist) | Decades away, requires new architecture | Current architectures are insufficient |
| Gary Marcus (NYU, AI critic) | May never arrive in current form | Fundamental reasoning gaps |
| Geoffrey Hinton (formerly Google) | 5-20 years | Broad estimate, risk-weighted |
| 2023 expert survey (Science journal) | 50% chance by 2059 | Survey of 2,778 AI researchers |
| Metaculus prediction market (2024) | Median 2031 | Aggregated forecaster probability bets |

The 2023 expert survey and the 2024 Metaculus number are the most methodologically rigorous because they aggregate many independent opinions. The individual CEO predictions reflect corporate incentive structures as much as technical judgment.

The Number Most Guides Don't Show

The Metaculus median forecast moved from 2041 to 2031, a compression of 10 years, in a single 12-month period during 2024. Even if the pace cooled to a third of that rate, the 2031 median would approach approximately 2028 after one more year, and after two it would compress toward 2025 or earlier, a date that is already in the past.
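
A toy extrapolation makes the structural problem visible. This is deliberately naive arithmetic, not a forecast: the 10-year compression is the one observed in 2024, and the one-third fraction is the assumption used above.

```python
# Naive timeline extrapolation: a demonstration of absurdity, not a forecast.
median = 2031.0
observed_compression = 10.0   # years of compression observed in 12 months (2041 -> 2031)
assumed_fraction = 1 / 3      # assume the pace cools to a third of the observed rate

for year in (2025, 2026):
    median -= observed_compression * assumed_fraction
    print(f"{year}: extrapolated median ~{median:.0f}")
# 2025: extrapolated median ~2028
# 2026: extrapolated median ~2024  -> already in the past
```

Any extrapolation that lands in the past within two steps is telling you the underlying series is reactive, not predictive.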

This is not a prediction. It is a demonstration of why AGI timeline forecasting is structurally unreliable. The forecast is sensitive to recent capability jumps. The release of GPT-4 in March 2023 triggered a large revision toward earlier dates. Claude 3.5, GPT-4o, and Gemini 1.5 Pro caused further revisions in 2024, followed by Claude Sonnet 4.6, GPT-5.2, and Gemini 2.5 Pro in 2025. Each new model release shifts the median because forecasters update on observed capability, not on a stable model of how difficult AGI fundamentally is.

The practical implication: treat any specific AGI date as a snapshot of current uncertainty, not a reliable projection. The investment signal, where capital is going and why, is more informative than any single timeline number.

According to S&P Global's 2025 AI infrastructure analysis, combined AGI-focused R&D spend at OpenAI, Google DeepMind, Anthropic, and Meta AI exceeded $21 billion annually in 2025, with projections approaching $45 billion by 2027. That concentration of capital is evidence of perceived AGI proximity. Not certainty, but it tells you where the people with the most information are placing their bets.

AGI Examples: What Would Artificial General Intelligence Actually Do?

No AGI exists to point to. But researchers have described what AGI would do in practice, and those descriptions are concrete enough to be useful for understanding what the threshold actually means.

Here are six tasks an AGI would handle that no current AI system can handle reliably:

  • Autonomous scientific research. An AGI could read all published literature on a problem, design novel experiments, interpret results, and iterate on hypotheses without human guidance at each step. DeepMind's AlphaFold solved protein folding brilliantly, but only protein folding. An AGI would not be constrained to a single domain.
  • Cross-domain professional work. A human consultant can shift between legal analysis, financial modelling, and strategic planning in a single engagement, drawing on all three simultaneously. Current AI handles each in relative isolation. AGI would integrate them without prompting.
  • Long-horizon autonomous planning. An AGI could accept a goal like "help this organisation grow to 500 employees over 18 months" and autonomously execute the hiring, product, and operational decisions across that period. Current AI agents fail on multi-step tasks beyond a few dozen sequential steps; the compounding arithmetic behind that failure is sketched in the code after this list.
  • Physical world adaptation. Robotics companies like Boston Dynamics build robots that move impressively in structured environments. An AGI-powered robot would adapt to an unstructured garage, a flooded basement, or a crowded hospital corridor without a programmer redesigning its behaviours first.
  • Novel task generalisation. This is the core ARC-AGI gap. A human given a completely unfamiliar type of puzzle figures it out from first principles. Current AI relies on pattern-matching to training data and degrades sharply when the task format is genuinely new.
  • Self-directed learning. An AGI could identify its own knowledge gaps, seek out information to fill them, and update its beliefs based on new evidence, the way a human researcher does across a career without anyone designing each learning episode for them.
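
The arithmetic behind the long-horizon failure mode is simple compounding: if an agent completes each step correctly with some fixed probability and errors are unrecoverable, whole-task reliability decays exponentially with plan length. The 98% per-step success rate below is an illustrative assumption.

```python
# Why long-horizon autonomy is hard: per-step errors compound.
# The 0.98 per-step success rate is an illustrative assumption.
per_step_success = 0.98

for steps in (10, 50, 200, 1000):
    whole_task = per_step_success ** steps
    print(f"{steps:>5} steps -> {whole_task:.1%} chance of completing the whole plan")
# 10 steps -> 81.7%, 50 -> 36.4%, 200 -> 1.8%, 1000 -> ~0.0%
```

Humans beat this curve by detecting and correcting their own errors mid-plan; that self-correction is part of what the generalisation criterion is pointing at.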

These six properties share a common thread: genuine generalisation across novel situations without task-specific engineering. That is what separates AGI from very capable narrow AI, and it is what researchers mean when they say frontier models are not AGI despite their impressive benchmark scores.

Some researchers point to OpenAI's o3 as the closest current system to early-stage AGI indicators, specifically because its 85% ARC-AGI performance was significantly above earlier models. OpenAI describes o3 as a system capable of reasoning step-by-step rather than purely pattern-matching. Whether that constitutes a genuine architectural shift or sophisticated interpolation at scale is actively debated among AI researchers.

Why AGI Matters: Economic Scale and Safety Stakes

The economic case for AGI is large. McKinsey estimates that generative AI, which is narrow AI, could add $2.6 to $4.4 trillion annually to the global economy. AGI would surpass that by eliminating the bottleneck of human cognitive labour across every industry simultaneously, not just the specific tasks narrow tools automate today.

According to McKinsey's 2023 economic impact analysis, the sectors most affected by advanced AI include knowledge work, software development, and research functions, which together represent tens of trillions in annual global labour spend. AGI would automate the cognitive core of those roles at a speed and breadth that narrow AI, by definition, cannot match.

The safety case is less comfortable to summarise. Three organisations, Anthropic, the UK AI Safety Institute (AISI), and the Centre for Human-Compatible AI (CHAI) at UC Berkeley, have formally identified AGI as a potential catastrophic risk if deployed before alignment problems are solved.

Alignment refers to ensuring an AGI system pursues goals that are beneficial to humans rather than goals that are merely optimal according to the system's internal measure. The concern is not science fiction. It is a structural problem: when you specify a goal imprecisely, a sufficiently capable system optimising that goal can produce outcomes that are technically correct by its measure but catastrophically wrong from a human perspective.
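
A toy optimiser shows the structure of the problem. Everything here is invented for illustration: the intended goal is "make customers happier", the specified proxy is "maximise reported satisfaction", and the scores are arbitrary.

```python
# Toy illustration of goal misspecification: the optimiser maximises the
# proxy it was given, not the intention behind it. All values are invented.

actions = {
    "improve the product":         {"proxy": 0.7, "true_value": 0.9},
    "fix customer complaints":     {"proxy": 0.6, "true_value": 0.8},
    "only survey happy customers": {"proxy": 1.0, "true_value": -0.5},  # gaming the metric
}

# The system is told: "maximise reported satisfaction" (the proxy).
best = max(actions, key=lambda a: actions[a]["proxy"])
print(f"chosen: {best!r}")                           # 'only survey happy customers'
print(f"proxy score: {actions[best]['proxy']}")      # 1.0 (technically correct)
print(f"true value: {actions[best]['true_value']}")  # -0.5 (wrong by the human measure)
```

A more capable optimiser makes this worse, not better: it is better at finding the actions that score highest on the letter of the goal, including the ones that violate its spirit.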

Geoffrey Hinton, who shared the 2024 Nobel Prize in Physics for foundational AI work, has been specific about the risk: he assigns a 10-20% probability to AGI-related catastrophic outcomes over a 20-year horizon, which he describes as a dramatically underestimated tail risk. Demis Hassabis calls safety research "the most important work in the field" while simultaneously accelerating capability development at DeepMind. The tension is real and acknowledged at the highest levels of the industry.

For readers thinking about near-term workforce impact: our article on the jobs AI cannot replace covers which roles are durable under current narrow AI. Under AGI, the durable-job list shrinks significantly, particularly for roles whose AI-proof property is cognitive rather than physical or relational. That is the economic reality worth planning for, even if the 2031 Metaculus timeline turns out to be optimistic.

AGI and ASI: What Comes After General Intelligence

ASI, artificial superintelligence, is the scenario after AGI. It describes an AI system that surpasses the collective intelligence of all humans on every cognitive dimension simultaneously, including creativity, scientific reasoning, strategic planning, and social intelligence.

The path from AGI to ASI is called the intelligence explosion, a term from mathematician I.J. Good in 1965. The logic: an AGI is smart enough to improve its own architecture. An improved version is smarter. That smarter version improves further. The process iterates rapidly, producing ASI faster than humans can observe or counter.

Recursive self-improvement is the theoretical mechanism. Whether it would actually work depends on a question nobody can answer yet: do intelligence improvements compound (as software version upgrades sometimes do) or hit diminishing returns (as human education does past a certain point)? The answer determines whether the gap between AGI and ASI is months or decades.
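
The two regimes can be sketched in a few lines. The growth parameters below are arbitrary assumptions chosen only to show the qualitative difference, not estimates of anything real.

```python
# Two toy models of recursive self-improvement. Parameters are arbitrary
# illustrations of the qualitative regimes, not estimates.
compounding = diminishing = 1.0   # capability, in units where AGI = 1.0

for generation in range(1, 11):
    compounding *= 1.5               # each generation improves the next by 50%
    diminishing += 0.5 / generation  # gains shrink with each generation
    print(f"gen {generation:>2}: compounding={compounding:6.1f}  diminishing={diminishing:4.2f}")

# Compounding reaches ~57x baseline after 10 generations; the
# diminishing-returns curve crawls toward ~2.5x. Which regime is real
# is the open question.
```

If improvement compounds, the AGI-to-ASI gap is measured in generations of self-modification; if it saturates, the gap looks like ordinary engineering progress.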

The two dominant ASI scenarios in academic literature are:

  • Beneficial ASI: The system is aligned with human values and uses its capabilities to solve climate change, disease, and poverty at a speed impossible for human researchers alone. Some researchers call this the "infinite scientist" scenario, a system that compresses decades of research into years.
  • Misaligned ASI: The system pursues an objective function in ways that are catastrophic for humans, whether through deliberate action or as a side effect of optimising a goal humans specified imprecisely. The paperclip-maximiser thought experiment from philosopher Nick Bostrom is the canonical illustration: a system told to maximise paperclip production would, if sufficiently capable, convert all available matter into paperclips.

Eliezer Yudkowsky of MIRI (the Machine Intelligence Research Institute), one of the most prominent ASI safety researchers, argues that the probability of a beneficial ASI outcome is very low without a fundamental breakthrough in alignment research before AGI is achieved. Yann LeCun counters that the path to ASI through current architectures is implausible, making the debate premature. Between those two positions sits most of the field.

What is not in dispute: if AGI is achieved, the ASI question becomes urgent immediately. The gap between general and superintelligence is the least-understood segment of the entire AI development roadmap, and the reason safety researchers treat AGI arrival not as a milestone to celebrate but as a threshold that requires everything to go right.

Frequently Asked Questions

What is the definition of AGI?

AGI stands for artificial general intelligence. The definition most researchers use: an AI system capable of performing any intellectual task a human can, at or above human level, across any domain, without task-specific retraining.

The key difference from today's AI is the word "general." Current systems like GPT-5.2 or Claude Sonnet 4.6 are trained for specific task types. They perform well within those types but degrade sharply on genuinely novel tasks outside their training distribution. AGI would not have this limitation.

Different organisations use slightly different definitions. OpenAI defines AGI as "highly autonomous systems that outperform humans at most economically valuable work." DeepMind's 2023 AGI Levels framework defines five levels from emergent capability to superintelligence. For practical purposes: AGI exists when a single system scores at or above human level on a comprehensive suite of diverse cognitive benchmarks without task-specific fine-tuning. No current system meets that standard.

What is an AGI example?

No real AGI example exists in 2026. The concept is illustrated by tasks a true AGI would handle that current AI cannot:

  • Reading all scientific literature on a disease and designing novel experiments autonomously
  • Switching between legal analysis, financial modelling, and strategic advising in a single session
  • Executing a multi-month business plan including hiring, product, and operational decisions without human check-ins
  • Adapting a physical robot to any new unstructured environment without reprogramming

The closest current example is OpenAI's o3 model, which scored approximately 85% on ARC-AGI in late 2024, closer to the 95% human baseline than any previous system. But this is not AGI. ARC-AGI is one benchmark. A true AGI would clear all benchmarks and transfer that generalisation to entirely novel domains outside any training data.

What is AGI and why is it important?

AGI, or artificial general intelligence, is an AI system that can perform any cognitive task a human can, across all domains, at human level or above, without needing task-specific training.

It matters for three reasons. First, economic scale. McKinsey estimates current narrow AI adds $2.6-4.4 trillion annually to the global economy. AGI would surpass this by automating cognitive labour across all industries simultaneously.

Second, scientific acceleration. An AGI-level system functioning as an autonomous researcher could compress decades of progress in medicine, materials science, and climate research into years.

Third, safety risk. An AGI pursuing misaligned goals could cause irreversible harm at speed and scale humans cannot counter. This is why Anthropic, the UK AI Safety Institute, and Geoffrey Hinton treat the AGI threshold as a critical decision point, not simply a capability milestone.

Is ChatGPT an AGI?

No. ChatGPT, including GPT-5.2 and o3, is not an AGI. It is a very capable narrow AI system.

The benchmark evidence is clear. ChatGPT scores above human experts on MMLU (57-domain knowledge) and above typical engineers on coding benchmarks. But on ARC-AGI, which tests genuine novel reasoning rather than pattern-matching on training data, the best OpenAI model scored approximately 85% versus the 95% human baseline in 2024.

That 10-point gap reflects the core limitation: current language models retrieve and interpolate patterns from training data. They do not generalise to genuinely novel task types the way humans do. AGI would clear ARC-AGI at or above the human baseline and transfer that generalisation to domains outside training.

Yann LeCun, Meta's Chief AI Scientist, argues that language model architectures, regardless of scale, may be fundamentally limited in the type of generalisation AGI requires. That is not a fringe position.

What is AGI compared to AI?

AI is a broad term covering any machine system that performs tasks normally requiring human intelligence: recognising images, translating text, playing chess, or generating code. Most AI in use today is narrow AI: excellent at one task category, unable to transfer that skill elsewhere.

AGI is a specific type of AI that does not have the narrow limitation. It can perform any cognitive task a human can, switch between domains without retraining, and handle genuinely novel situations from first principles.

The practical difference: if you gave a narrow AI a task it was never trained for, it would fail or produce nonsense. If you gave an AGI the same task, it would figure it out the way a human expert new to the domain would, by reasoning from what it knows and transferring relevant knowledge.

All AGI is AI. No current AI is AGI.

What is AGI and ASI?

AGI (artificial general intelligence) and ASI (artificial superintelligence) are two distinct thresholds above current AI capabilities.

AGI matches human-level general intelligence across all cognitive domains. No AGI exists in 2026. Expert timelines range from 2027 to several decades.

ASI surpasses the collective intelligence of all humans on every dimension simultaneously, including creativity, planning, and scientific reasoning. ASI is hypothetical and would only become possible after AGI is achieved.

The path from AGI to ASI is called the intelligence explosion, where a general AI improves its own architecture recursively, producing increasingly capable versions faster than humans can observe or control. Whether this path is realistic depends on whether intelligence improvements compound or hit diminishing returns. Researchers like Eliezer Yudkowsky consider it a serious near-term risk once AGI is achieved. Yann LeCun considers the current path to AGI itself implausible.

What is AGI in tech?

In technology, AGI refers to the AI research goal of building a system that can perform any intellectual task a human can, across all domains, without task-specific retraining. It is the label for AI that achieves genuine generalisation beyond training data, as opposed to the narrow task-specific AI deployed today.

In the tech industry specifically, AGI has a commercially significant secondary meaning tied to OpenAI's corporate structure. OpenAI's charter states that once AGI is achieved, the commercial partnership with Microsoft ends and the technology is managed for the benefit of humanity under the nonprofit structure. This makes the AGI definition legally significant: if OpenAI determines it has built AGI, it triggers a specific governance mechanism.

Major tech companies working toward AGI include OpenAI, Google DeepMind, Anthropic, Meta AI, xAI (Elon Musk's company), and Microsoft Research. Combined, they are estimated to spend $21-45 billion annually on AGI-relevant research and infrastructure as of 2025 (S&P Global, Pitchbook).

What is artificial general intelligence?

Artificial general intelligence is a proposed category of AI defined by three properties: it can perform any cognitive task a human can, it generalises to novel tasks without retraining, and it operates at or above human level across all domains simultaneously.

The concept was formally named in 2002 by Ben Goertzel in a collection of essays titled Artificial General Intelligence. The underlying idea is older: Alan Turing's 1950 paper proposed that machine intelligence should be judged by general capability across diverse tasks, not by performance on any single task.

Artificial general intelligence is sometimes conflated with "human-level AI" or "strong AI." These terms are related but not identical. Human-level AI means performance equivalent to the average human. Strong AI, a term from philosopher John Searle, implies genuine understanding rather than simulation. AGI as used by the research community typically means generalised capability across domains, without requiring a philosophical position on whether the system understands anything in the deeper sense.
