Language model

A language model is a probability distribution over sequences of words, often used to predict or generate text. Language models are trained on a large corpus of text, such as books, articles, websites, and transcripts, to learn to predict the next character, token, or word in the training data. This prediction can be used to generate new text with new ideas, answer questions, or simply produce text that is similar to the original in style or content.
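
To make the next-word prediction concrete, here is a minimal sketch of a count-based bigram language model in Python. The toy corpus and function names are illustrative only, not taken from any particular library or dataset. It estimates P(next word | previous word) from raw counts and samples from that distribution to generate text; modern language models replace the counting step with a neural network, but the interface of "predict a distribution over the next token, then sample" is the same.

```python
import random
from collections import Counter, defaultdict

# Illustrative toy corpus (assumption: any whitespace-tokenized text works here).
corpus = "the robot reads the book and the robot writes a new book".split()

# Count how often each word follows each preceding word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_distribution(prev):
    """Return the estimated P(next word | previous word) from raw counts."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}

def generate(start, length=8):
    """Generate text by repeatedly sampling the predicted next word."""
    words = [start]
    for _ in range(length):
        dist = next_word_distribution(words[-1])
        if not dist:  # no known continuation for this word
            break
        words.append(random.choices(list(dist), weights=list(dist.values()))[0])
    return " ".join(words)

print(next_word_distribution("the"))  # e.g. {'robot': 0.67, 'book': 0.33}
print(generate("the"))
```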

Language models are important because they enable computers to predict how a sentence will continue, which natural language processing relies on to analyse text, infer its meaning, and generate meaningful output. The underlying assumption is that predicting the next word well requires understanding the meaning of the text. However, language models trained only on correct examples, without a contrastive objective, often fail to distinguish true statements from false ones.