the pretraining objective is "predict the next token", but the post-training objective is closer to "create a response that is correct, properly formatted, and in line with style+safety"
For example, we might predict that it's very difficult to get a base LLM to emit the phrase "the quick brown fox jumps over the lazy cvnpmnzq" via next-token prediction alone, but it's trivial for a post-trained model that's simply asked to repeat the string
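a toy sketch of the intuition above (the probabilities are made-up stand-ins, not outputs of any real model): under the next-token objective, the per-token loss is the negative log-probability, so a gibberish continuation after a familiar prefix carries enormous surprisal

```python
import math

# Hand-built toy next-token distribution, standing in for a pretrained LM's
# prediction after the prefix "the quick brown fox jumps over the lazy".
# The probabilities are illustrative assumptions, not real model outputs.
next_token_probs = {
    "dog": 0.97,        # the familiar pangram ending dominates
    "cat": 0.02,
    "cvnpmnzq": 1e-9,   # gibberish: essentially never seen in training
}

def surprisal(token: str) -> float:
    """Negative log-probability in nats: the per-token pretraining loss."""
    return -math.log(next_token_probs[token])

print(f"dog:      {surprisal('dog'):.2f} nats")
print(f"cvnpmnzq: {surprisal('cvnpmnzq'):.2f} nats")
```

under the post-training framing ("produce a correct, well-formatted response"), repeating an arbitrary string on request scores perfectly well, which is why the same output flips from near-impossible to trivial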
> New blog post: "The Browser Sensorium"
gracekind.net/blog/browser...
im not cooked but deep fried