Tag: data
-
Portfolio ·
VINE Data Preprocessing: Shaping the Basin Before Training
Two identical TinyGPT models, trained for the same number of steps. One receives raw tweets; the other receives tweets preprocessed by VINE's cruncher. Result for the preprocessed model: 5.1% lower validation loss and 22% less wasted conductor energy. The geometry of the data shapes the geometry of the model.