Language models were practically unknown to the general public until late 2022, yet we have all been using them on a daily basis for many years, for example in the predictive keyboard on our mobile phones. Today this has changed, with generative AI language models going mainstream thanks to well-known conversational agents such as ChatGPT.
Language models belong to the wider class of “generative models”, which can generate new content resembling the data the system was trained on. Language models are trained to predict human-written text obtained from sources such as books, Internet forums, and Wikipedia. These models capture a lot of useful information from their training data, yet there are obstacles, of varying severity, to using them directly for generation after training. First, the model is trained to produce likely continuations for any given input text, or “prompt”. When given a question such as “How do I cook a carbonara sauce?”, it might therefore continue with more questions or add extra context to the question without actually answering it. This is why commercial models are often further trained to prefer continuations that are helpful to the user, using reinforcement learning from human feedback (RLHF). Second, because the model replicates the characteristics of Internet text, it will also capture all sorts of undesirable behaviour that lies within that text, including social bias and toxic language.
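The core idea of “predicting likely continuations” can be illustrated without any neural network at all. The sketch below trains a toy bigram model on a hypothetical three-sentence corpus (the corpus, function names, and parameters are all made up for illustration) and then greedily extends a prompt one word at a time — the same completion principle that large models apply at vastly greater scale.

```python
from collections import defaultdict, Counter

# Tiny hypothetical "training data"; real models see billions of words.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."

# "Training": count how often each word follows each other word (a bigram model).
counts = defaultdict(Counter)
tokens = corpus.split()
for prev, nxt in zip(tokens, tokens[1:]):
    counts[prev][nxt] += 1

def continue_prompt(prompt, n_words=3):
    """Extend the prompt by repeatedly picking the most frequent next word
    (greedy decoding); large language models sample from a learned
    next-token distribution in the same left-to-right fashion."""
    words = prompt.split()
    for _ in range(n_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # no continuation seen in training data
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

print(continue_prompt("the cat"))  # → "the cat sat on the"
```

Note that the model simply continues the prompt with statistically likely text; nothing in this objective makes the continuation an *answer* to a question, which is precisely the gap that techniques such as RLHF are meant to close.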