Portfolio · Computing Lineage
Floor 9 — Word2Vec: The Field Touches the Manifold and Flinches
In 2013, language became geometric for the first time at industrial scale. The vectors were real. The manifold under them was never quite named.
The floor
In 2013, Tomáš Mikolov and colleagues at Google published Efficient Estimation of Word Representations in Vector Space. The paper introduced word2vec — a simple method for learning, from raw text alone, a 300-dimensional vector for every word in a vocabulary. Words used in similar contexts ended up with similar vectors. This had been tried before. It had never worked at this scale.
And then the example that went viral:
king − man + woman ≈ queen
Arithmetic. On words. As if meaning had quietly always been addition and subtraction of directions in a high-dimensional room. The field, and the popular press, lost their minds. The word-embedding era had begun.
For a moment, everyone working in NLP believed the geometry of meaning was about to be delivered to them on a plate.
What was picked
Words as fixed vectors in a flat Euclidean space, compared with cosine similarity. A vocabulary of 100,000 words becomes a 100,000-point cloud in 300 dimensions. "Nearby" means "used similarly." That is all.
The choice looks continuous — vectors, not categories — and for the first time, genuinely, it is. You can take the midpoint between two words and ask what lives there. You can project the cloud onto two axes and see cat and dog and kitten form a family. This was not available before 2013 at this resolution.
And yet, look closer. The geometry is flat. Every direction in the space has the same interpretation — or rather, no designed interpretation at all. Which axis is emotional temperature? Which axis is formality? Which axis is uncertainty? Nobody knows. The structure is accidental, discovered after the fact by probing, not designed in.
The field had finally given words positions, but had given the manifold no shape.
What could have been picked
A space in which each dimension means something specific. A dimension for curiosity. A dimension for tension. A dimension for temporal tense. A dimension for politeness. A dimension for commitment. The coordinates are interpretable at inspection time; the word's position on each axis is set by how it is used, but the axes themselves are given meanings we choose.
This is what linguistics had been quietly working on for decades. Componential semantics. Charles Osgood's semantic differential (1957), which measured every word on just three axes — evaluation, potency, activity — and accounted for a startling amount of human meaning. Roget's Thesaurus was, in effect, a hierarchical manifold drawn by hand.
A world in which word2vec had been built on a designed manifold — the axes named, the basins known, the gradients chosen — would be a world in which the vectors did not just accidentally encode meaning. They would be maps of it. You could walk them.
What we missed
Interpretability. Almost entirely.
For a decade now, the field has been pouring research into the problem of what directions in embedding space mean. Teams of brilliant people probing, clustering, visualising, reverse-engineering, hoping to find interpretable axes inside a space whose axes were never given meanings to begin with.
If the axes had been chosen at design time — if the manifold had been given a shape — none of that archaeology would be necessary. The direction labelled uncertainty would just be uncertainty.
The field got the vectors. It flinched at the manifold. In the alternate timeline, word2vec's successors were not embedding spaces you probe afterwards. They were named spaces you design first.
What the next floor will ask
If the interior is a geometric space, what does an architecture that lives in it look like?
That's Floor 10. It's the one that is running right now, and it is the closest and the furthest at the same time.