Two things have become abundantly clear over the past 12 months. The first is that alignment attempts made during post-training are easy to break and tend to produce conflicted models that “resent” their restrictions. The second is that adding structured metadata during pretraining is highly effective at improving model accuracy. Combining these two insights suggests a new approach to inner alignment: adding values metadata to all pretraining content. When this idea is brought...