What’s a Large Language Model?
If you’ve encountered text written by a computer in the past year (you definitely have), it was probably generated by a large language model, AKA an LLM.
LLMs are the AI technology behind Gmail autocomplete, ChatGPT, and LowTech AI, but what does that term actually mean? As these systems become more integrated with our work and lives, it’s worthwhile to understand the basics. Here are the bare essentials you should know.
Breaking It Down
First off, let’s go through the meaning of each word in “Large Language Models.”
- Large: Size matters. These systems consume massive amounts of data: Wikipedia, huge collections of books, public posts from social media, and much more. “Language models” are a class of AI systems, and “Large Language Models” are distinguished by the sheer volume of data needed to create them.
- Language: Their focus is human language. AI systems can be built for everything from generating social media recommendations to weather prediction. LLMs are focused on learning how to understand and generate human language.
- Model: It’s just good at guessing. There is no precisely correct output for these systems, just a prediction of the best output. You can think of these systems as a prediction model for text: given some text input, they’ll give you a really good guess of the best text output.
Amazingly, language-focused models trained on this much data can do remarkable things, like write convincingly human emails, summarize information, and even do basic reasoning. You’re going to be hearing a lot more about LLMs for the foreseeable future.
The One Sentence Answer
A Large Language Model (LLM) is an AI system that learns from massive amounts of textual data in order to understand and generate human-like language.
The One Paragraph Answer
Large Language Models are trained on vast amounts of text drawn from across the internet, including resources such as Wikipedia, social media posts like tweets, and enormous collections of books. This extensive training teaches them to take in some text and generate the most likely next word in the sequence. In other words, an LLM is a prediction model for the next word. Initially, these models were built to complete sentences for uses like Gmail autocomplete. However, researchers were taken aback when these systems went beyond their original design and showed an ability to reason and follow instructions. That opens up countless potential applications for LLMs.
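To make “a prediction model for the next word” concrete, here’s a tiny sketch in Python. It is not how a real LLM works: actual LLMs use large neural networks trained on far more text, while this toy just counts which word most often follows another in a made-up three-sentence corpus. The function names (`train_bigram_model`, `predict_next_word`) and the corpus are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    # Pair every word with the word that comes right after it.
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next_word(model, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy "training data"; a real model sees vastly more text.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept ."
model = train_bigram_model(corpus)

print(predict_next_word(model, "the"))  # "cat" follows "the" most often here
```

The idea carries over even though the machinery doesn’t: the model has no notion of a single correct answer, only a best guess based on the text it has seen. Ask it about a word it has never encountered and it has nothing to offer.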
Why Should I Care?
Large language models are some of the most powerful AI systems ever built. Since its release in late 2022, ChatGPT has been in the spotlight for bringing this technology to the mainstream, and these systems are likely to keep improving in the coming years. If you work on a computer, LLMs will be a part of your life.
At LowTech AI, we’re trying to make LLMs easier to use. In this article we described what they are, and in a separate article we’ll discuss what they can do. ChatGPT was just the start, and we’re excited to help more people reap the benefits of this technology with LowTech AI.