Tag: transformers
-
Portfolio ·
The Conductor
Every transformer's hidden layers build a geometric structure the output layer can't see. A single vector can find it. The model can learn to listen.
-
Portfolio ·
The Conductor Exists
A single learnable vector (512 parameters), trained on a frozen char-level GPT, aligns preferentially with the hidden layer's tail principal components (the conductor subspace) rather than with the token prediction surface. When the model is unfrozen, it internalises the signal. Replicated on GPT-2.
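One way to make "aligns with the tail PCs" concrete is to measure how much of a vector's squared norm falls inside the lowest-variance principal directions of the hidden states. The sketch below is illustrative only and is not the post's actual code: the function name `tail_pc_energy`, the choice of 32 tail components, and the synthetic activations are all assumptions standing in for real model data.

```python
import numpy as np

def tail_pc_energy(hidden, vector, k_tail):
    """Fraction of `vector`'s squared norm lying in the k_tail
    lowest-variance principal directions of `hidden` (rows = samples)."""
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    tail = vt[-k_tail:]                      # (k_tail, d) orthonormal rows
    unit = vector / np.linalg.norm(vector)
    coeffs = tail @ unit                     # projection coefficients
    return float(np.sum(coeffs ** 2))

# Synthetic stand-in for hidden activations (assumption, not real model data):
# 512 dimensions whose per-axis scale decays from 10.0 down to 0.1.
rng = np.random.default_rng(0)
d = 512
scales = np.linspace(10.0, 0.1, d)
hidden = rng.normal(size=(2000, d)) * scales

random_vec = rng.normal(size=d)              # baseline: no preferred direction
tail_vec = np.zeros(d)
tail_vec[-1] = 1.0                           # points along the lowest-variance axis

print("random vector:", tail_pc_energy(hidden, random_vec, 32))
print("tail vector:  ", tail_pc_energy(hidden, tail_vec, 32))
```

A vector with no preferred direction should score near `k_tail / d`, so a trained vector scoring well above that baseline would be the kind of preferential alignment the summary describes.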
-
Portfolio ·
Pushing Too Far — The 70-Epoch Long Run
The conductor integrates fully by epoch 10-20. Continued training past that point collapses the token output. The conductor stays strong while the words on the page break. The reasoning engine and the output decouple.
-
Portfolio ·
Hexagon Cognition — The Precursor
Ask a transformer to produce hexagonal output. The hidden layer doesn't build a native hexagon — it glues a triangle and a square together, and delaminates along the seam under perturbation. The first articulation of 'the model is secretly doing something else'.