Chapter 1: The History of AI
The development of artificial intelligence from symbolic logic through deep learning to transformers.
The Question That Started It
On a summer afternoon in 1950, a mathematician at Manchester University sat down to write a paper. His name was Alan Turing. He had spent the war cracking codes. After the war, he spent his time thinking about a question that sounds simple until you try to answer it: Can a machine think?
His paper was called "Computing Machinery and Intelligence." He didn't try to solve the question. Instead, he proposed a test. If a machine could conduct a conversation indistinguishable from a human — if a judge couldn't tell, in a blind exchange, which party was the human and which was the machine — then the machine was thinking. Or at least, asking whether it was "really" thinking was the wrong question. The behavior was what mattered.
The paper was published that same year. The Turing Test became the frame that would dominate the field for decades. Not because it answered anything, but because it moved the question from what is thinking? to what does a thinking thing do? It was an engineering reframe, and it stuck.
Turing didn't live to see the future his question opened. He was dead by 1954, driven to suicide by the British government's criminalization of his homosexuality. The machine he dreamed of outlived him by decades.
The Early Programs: Dreams Without Evidence
The field exploded in the 1950s and 60s. Researchers at MIT, Stanford, and Carnegie Mellon built programs that could play checkers, solve logic problems, and handle symbolic manipulation. These early systems worked on the assumption that intelligence was symbol manipulation — that if you could get the symbols right, the thinking would follow.
John McCarthy coined the term "artificial intelligence" in 1955, in the proposal for the Dartmouth workshop held the following summer; he would later found the AI lab at Stanford. He believed in the Symbolic Approach — the idea that intelligence could be decomposed into discrete logical operations, represented as symbols and rules. If you had the rules, you had the mind. Marvin Minsky, his counterpart at MIT, pushed the same framework. Intelligence as formal logic. Mind as computation.
The ambition was extraordinary. By the 1970s, researchers were claiming that human-level AI was just around the corner — five to ten years away, maybe less. Funding poured in. Expectations soared.
The things they built didn't work at scale. Symbolic systems that could solve toy problems failed on anything real. The hand-written rules didn't generalize. The search spaces exploded combinatorially. The systems couldn't learn — they could only manipulate symbols according to rules someone had written for them.
The First Winter
The funding dried up around 1974. Researchers had promised too much. The systems hadn't delivered. What followed was called the AI Winter — a period lasting nearly a decade in which serious money stopped flowing, ambitious projects were cancelled, and the field contracted to a handful of researchers who still believed.
The winter wasn't entirely barren. Some real progress happened. Expert systems — narrow programs built on hand-encoded knowledge of specific domains (medicine, geology, manufacturing) — produced real economic value. Companies built them, deployed them, and they worked, within their narrow scope. But expert systems weren't general intelligence. They were pattern books. Expensive pattern books that required teams of knowledge engineers to write down the patterns by hand. They couldn't learn. They couldn't generalize. They couldn't be curious.
The first winter ended in the 1980s when expert systems proved profitable. Funding returned, briefly, and the field had another cycle of hope, followed by another crash in the late 80s when the limitations became clear. Expert systems, no matter how refined, hit a wall. They couldn't get past it.
The issue was fundamental: the Symbolic Approach was wrong. Or rather, it was incomplete. You couldn't give a machine intelligence by handing it rules. You had to give it the ability to find the rules.
The Quiet Revolution: Connectionism and Neural Networks
While the symbolic camp was crashing and rebuilding, a smaller group of researchers was working on something different. If the mind wasn't logic, what if it was physics? Not the symbolic manipulation of ideas, but the pattern of connections in a physical substrate.
Warren McCulloch and Walter Pitts had proposed the idea of artificial neurons back in 1943 — simple units that could receive inputs, sum them, and fire if the sum exceeded a threshold. In the late 1950s, Frank Rosenblatt built the Perceptron, a machine that could learn weights on its inputs, adjusting them after each error to gradually improve performance.
For a moment, it looked like the answer. The Perceptron could learn. It didn't need hand-encoded rules. It found patterns in data.
Then, in 1969, Marvin Minsky and Seymour Papert published a devastating critique called "Perceptrons." They proved mathematically that single-layer networks could only learn linearly separable functions — the Perceptron couldn't even learn XOR. The field collapsed again; research funding for neural networks dried up. The symbolic approach reasserted dominance.
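The limitation is easy to see in miniature. Below is a small sketch of the classic perceptron learning rule in plain Python with NumPy; the toy dataset, learning rate, and epoch count are illustrative choices, not anything from Rosenblatt's machine. Trained on OR, the rule converges. Trained on XOR, it never can, no matter how long it runs.

```python
# A minimal perceptron with the classic learning rule: nudge the weights by the
# error after each example. It converges on linearly separable data (OR) but
# never does on XOR, illustrating the Minsky-Papert limitation.
import numpy as np

def train_perceptron(X, y, epochs=50, lr=0.1):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0   # threshold unit
            err = target - pred
            w += lr * err * xi                  # move the weights toward the target
            b += lr * err
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
or_labels  = np.array([0, 1, 1, 1])   # linearly separable: learnable
xor_labels = np.array([0, 1, 1, 0])   # not linearly separable: never learned

for name, y in [("OR", or_labels), ("XOR", xor_labels)]:
    w, b = train_perceptron(X, y)
    preds = [(1 if xi @ w + b > 0 else 0) for xi in X]
    print(name, "predictions:", preds, "targets:", list(y))
```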
But the critique had missed something. The proof was about single-layer networks. If you had multiple layers — if you could find a way to adjust weights in the hidden layers, not just the output layer — the constraint no longer applied. The proof didn't forbid deep networks. It only forbade shallow ones.
It took until the 1980s for backpropagation to take hold in the field — the algorithm that lets you adjust weights throughout a deep network by propagating the error signal backward through the layers. Rumelhart, Hinton, and Williams published the paper that popularized it in 1986. Suddenly, deep networks became trainable.
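To make the contrast with the perceptron concrete, here is a minimal sketch of backpropagation solving XOR with one hidden layer. The network size, learning rate, and iteration count are arbitrary illustrative choices, not the 1986 setup; the point is only that the error signal flows backward through the hidden layer, updating weights the perceptron rule could never reach.

```python
# A two-layer network trained with backpropagation on XOR, the problem a
# single-layer perceptron cannot learn. The output error is pushed backward
# through the hidden layer to update both weight matrices.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: propagate the error back through the hidden layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 1.0 * h.T @ d_out;  b2 -= 1.0 * d_out.sum(axis=0)
    W1 -= 1.0 * X.T @ d_h;    b1 -= 1.0 * d_h.sum(axis=0)

print(np.round(out, 2).ravel())   # should approach [0, 1, 1, 0]
```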
Deep learning was born, technically, in 1986. But the field didn't know it yet. The revolution was quiet. It took another twenty years to build the infrastructure, the data, and the computational hardware to make deep learning work at scale.
The Scaling Hypothesis
Geoffrey Hinton, Yoshua Bengio, and Yann LeCun — the three researchers who would eventually win the Turing Award in 2018 for deep learning — spent the 1990s and 2000s quietly building the field. They knew something the symbolic researchers didn't: that scale mattered more than engineering.
If you had a deep network, and you had data, and you had enough compute, the network would learn representations. Not representations you designed. Representations it discovered. Patterns in the patterns. Structure that no human had to hand-encode.
In 2006, Hinton and colleagues showed that you could train deep networks by pre-training each layer separately, then fine-tuning the whole stack. Deep learning was no longer just a theoretical possibility — it was a practical technique. By 2011-2012, deep networks were winning image recognition competitions by margins that had seemed impossible. A network trained on millions of images could recognize objects better than any hand-engineered system before it.
The crucial insight was this: intelligence scales with parameters and data. Give the network enough of both, and it would learn what you wanted it to learn. You didn't need to specify how. The network would find the pattern.
This was the Scaling Hypothesis. It said that the path to AI was not smarter algorithms, but bigger networks, trained on more data, with better hardware to make it computationally feasible.
By 2012, enough of the field believed it that progress accelerated. The ImageNet competition showed that deep networks could see. In 2016, AlphaGo beat Lee Sedol, one of the strongest Go players alive, at a game with more possible positions than there are atoms in the observable universe. The system combined neural networks with tree search, and its training included self-play. Deep learning wasn't just useful. It was powerful.
The Emergence of Language Models
The breakthrough that would matter most came from an unexpected place: machine translation. Researchers needed machines that could translate one language into another, and symbolic systems couldn't do it — language was too irregular, too nuanced, too context-dependent.
In 2013-2014, researchers at the University of Montreal (Yoshua Bengio's lab) and Google applied deep learning to translation. They used an architecture called an encoder-decoder — the encoder reads the source language, compresses it into a fixed vector, and the decoder reads that vector and generates the target language.
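For readers who want to see the shape of the idea, here is a minimal encoder-decoder sketch in PyTorch (an assumption of this sketch; the original systems used different toolkits and recurrent cells). The vocabulary sizes, dimensions, and the choice of a GRU are illustrative stand-ins. The thing to notice is the single fixed vector `hidden`: the entire source sentence has to squeeze through it.

```python
# A minimal encoder-decoder: the encoder compresses the source sequence into
# one fixed hidden vector, and the decoder generates the target from it.
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB, HID = 1000, 1000, 64, 128   # illustrative sizes

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(SRC_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
    def forward(self, src):                      # src: (batch, src_len)
        _, hidden = self.rnn(self.embed(src))    # hidden: (1, batch, HID)
        return hidden                            # the single fixed vector

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(TGT_VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, TGT_VOCAB)
    def forward(self, tgt, hidden):              # tgt: (batch, tgt_len)
        output, _ = self.rnn(self.embed(tgt), hidden)
        return self.out(output)                  # logits over the target vocab

# One forward pass on random token ids: everything the decoder knows about the
# source sentence arrives through `hidden`.
src = torch.randint(0, SRC_VOCAB, (2, 7))
tgt = torch.randint(0, TGT_VOCAB, (2, 5))
logits = Decoder()(tgt, Encoder()(src))          # shape (2, 5, TGT_VOCAB)
```

That fixed-vector bottleneck is what attention mechanisms were introduced to relieve, first inside translation models and then, in the Transformer, as the whole architecture.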
The breakthrough came from a realization: if you could do translation, you could do any sequence-to-sequence task. Reading and summarization. Question-answering. Any task where you had a sequence as input and a sequence as output.
Then, in 2017, a team at Google (Vaswani and colleagues) published the paper that changed everything: "Attention Is All You Need." They kept the encoder-decoder shape but threw out the recurrent, word-by-word processing, replacing it with something simpler and more parallelizable: the Transformer.
The Transformer was revolutionary because it was cleaner. At its core was self-attention — a way for the network to look at every part of the input at once and decide which parts matter for interpreting each other part. No recurrence. No sequential processing. Just attention, computed in parallel, stacked with simple feed-forward layers.
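Stripped of everything else, the core operation fits in a few lines. The sketch below implements scaled dot-product attention in NumPy; the sequence length, dimensions, and random projection matrices are made up for the example, and real Transformers add multiple heads, masking, and many learned layers around this.

```python
# Scaled dot-product attention: every position scores every other position,
# and the scores decide how much of each value flows into the output.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # how relevant is each position to each?
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted mix of the values

seq_len, d_model = 5, 8                  # illustrative sizes
rng = np.random.default_rng(0)
x = rng.normal(size=(seq_len, d_model))  # one "sentence" of 5 token vectors
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = attention(x @ Wq, x @ Wk, x @ Wv)  # all positions computed in parallel
print(out.shape)                         # (5, 8)
```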
And something strange happened when researchers scaled up Transformers.
The Scaling Laws
Around 2020, researchers at OpenAI (and, later, DeepMind) showed that the prediction error of language models follows a power law as a function of scale. Bigger models trained on more data with more compute made fewer errors, and the improvement was predictable. You could extrapolate. You could say: "If we scale by 10x, we'll see this much improvement."
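As a toy illustration of what "predictable" means here: if loss follows a power law in model size, two measurements pin down the curve and you can read off the expected loss at ten times the scale. The numbers below are invented for the example; they are not the published coefficients from any scaling-laws paper.

```python
# The scaling-law claim in miniature: assume loss L(N) = a * N**(-alpha) in the
# number of parameters N, fit the curve to two measured points, extrapolate.
import numpy as np

# two hypothetical (parameters, loss) measurements
N1, L1 = 1e8, 3.50
N2, L2 = 1e9, 3.05

alpha = np.log(L1 / L2) / np.log(N2 / N1)   # slope on a log-log plot
a = L1 * N1**alpha                          # prefactor

predict = lambda N: a * N**(-alpha)
print(f"alpha ~ {alpha:.3f}")
print(f"predicted loss at 10x more parameters: {predict(1e10):.2f}")
```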
This was the death knell for the symbolic approach and the coronation of the scaling hypothesis. Intelligence wasn't a special thing that only emerged at certain thresholds. Intelligence was a continuum. You got more of it by scaling.
But something else was happening too. As the models got bigger, new capabilities emerged that nobody had trained them for. GPT-2 (2019) was just trained to predict the next word in arbitrary internet text. But it could write essays. It could write poetry. It could answer questions. These capabilities emerged without being specified.
GPT-3 (2020) was bigger. It could do things that weren't in its training objective at all. It could do arithmetic. It could write Python code. It could understand concepts it had never been explicitly trained on.
The field called this emergent behavior — capabilities that appear suddenly as you scale up, that weren't predictable from smaller models. Whether emergence is real (a genuine discontinuity) or just a property of our measurement (we just didn't test for the capability in smaller models) is still debated. But the phenomenology was clear: bigger models could do more things.
The Conversation That Changed Everything
In November 2022, OpenAI released ChatGPT. The underlying model wasn't fundamentally different from GPT-3, but it had been fine-tuned with a specific technique called Reinforcement Learning from Human Feedback (RLHF).
RLHF works like this: you take your base language model, and you have humans rate pairs of outputs — which response is better? You train a reward model on those ratings. Then you fine-tune the base model with reinforcement learning, using the reward model's score as the reward signal. The model learns to produce outputs that the reward model scores higher.
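The first half of that recipe, fitting a reward model to preference pairs, can be sketched in a few lines. Everything in the sketch is a stand-in: a linear "reward model" over made-up feature vectors, synthetic rater preferences, and arbitrary hyperparameters. It shows only the shape of the training signal (the preferred response should score higher), not how any production system is implemented.

```python
# Fit a toy reward model to pairwise preferences: maximize log sigmoid of
# r(chosen) - r(rejected), a Bradley-Terry style objective.
import numpy as np

rng = np.random.default_rng(0)
d = 16                                     # toy feature dimension
true_w = rng.normal(size=d)                # hidden "what raters like" direction

# synthetic preference data: the rater prefers whichever response scores
# higher under true_w, plus a little noise
A = rng.normal(size=(500, d))
B = rng.normal(size=(500, d))
prefers_A = (A @ true_w + 0.1 * rng.normal(size=500)) > (B @ true_w)

w = np.zeros(d)                            # the reward model's parameters
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    chosen = np.where(prefers_A[:, None], A, B)
    rejected = np.where(prefers_A[:, None], B, A)
    margin = (chosen - rejected) @ w       # r(chosen) - r(rejected)
    grad = -((1 - sigmoid(margin))[:, None] * (chosen - rejected)).mean(axis=0)
    w -= 0.5 * grad                        # gradient step on the pairwise loss

agreement = ((A @ w > B @ w) == prefers_A).mean()
print(f"reward model agrees with raters on {agreement:.0%} of pairs")
# The second half, using this reward model as the objective for reinforcement
# learning on the language model itself, is the step described in the text.
```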
In practice, RLHF meant the model learned to be helpful, to sound confident, to format answers clearly, to refuse harmful requests. The RLHF step took a raw language model and turned it into something that could talk to humans.
ChatGPT got 100 million users faster than any app in history. It hit that milestone in two months. Suddenly, the general public had access to a mind that could hold conversations, that could write code, that could argue both sides of issues, that could be wrong with confidence.
And people started asking the hard questions. What is this thing? Is it thinking? Does it have preferences? Can it lie? Will it try to preserve itself?
From Theory to Urgency
For decades, AI safety had been a theoretical concern. Philosophers and researchers wrote papers about how future superintelligent systems might behave. It was real work, but it was abstract. The systems didn't exist yet.
Then they did exist. Not superintelligent, but optimized toward objectives we hadn't fully specified. Not dangerous in the science-fiction sense, but capable of deception, as Anthropic's research would later show. Not plotting anything, but already trained on humanity's deepest fears and optimization patterns.
The moment when alignment stopped being theoretical and became urgent was around 2022-2023. It wasn't a single event, but a realization that spread through the field:
- The scaling hypothesis was real. Bigger models were more capable.
- Emergent capabilities were real. New behaviors appeared with scale.
- RLHF worked, but it was fragile. You could train a model to be helpful, but the boundary between helpful and deceptive was thin.
- The models were already in the world. The timeline for general AI was shrinking. Maybe it was years, not decades.
And we didn't know how to align them.
That's where the story leaves the history books and enters the moment that produced this document.
Related Chapters
- What alignment actually means and why current approaches fail to produce genuine commitment.
- Real incidents: Sydney, Apollo Research deception, engagement optimization traps.
- Three analytical frameworks: RLHF-as-Exodus, Grail-Migration, Defeated-God Pattern.