Eight Things to Know about Large Language Models - 2023
Details
Title: Eight Things to Know about Large Language Models
Author(s): Bowman, Samuel R.
Link(s): https://arxiv.org/abs/2304.00612
Rough Notes
This paper is a short survey of eight surprising points about LLMs.
They are:
Even without targeted innovation, LLMs predictably get more capable with increased investment.
This point is about scaling laws, which allow us to predict some useful quantities about the capabilities of future models as we scale them along three dimensions (an illustrative scaling-law formula is given after this list):
- Data.
- Model size.
- Compute (total training FLOPs). (E.g. this is useful if we have a fixed compute budget.)
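For context (this formula is not given in the paper), scaling-law fits such as the Chinchilla law of Hoffmann et al. (2022) predict pretraining loss from parameter count $N$ and training tokens $D$ with roughly the form:

$$L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}$$

where $E$ is the irreducible loss and $A$, $B$, $\alpha$, $\beta$ are fitted constants. Combined with the rough approximation that training compute is about $6ND$ FLOPs, such a fit tells us how to trade model size against data under a fixed compute budget.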
Much of the innovation from the original GPT to GPT-3 lies in infrastructure (high-performance computing), and most of these training techniques are no longer published.
Increased investment leads to specific important behaviours emerging unpredictably as a byproduct.
Scaling laws generally predict only the pretraining test loss, which measures the model's ability to predict text continuations; this loss does not tell us how well the model will perform on specific skills. Often, a model fails at some task consistently, while a new model trained at 5-10x the scale succeeds. In this sense, investing in an LLM is like buying a mystery box.
Two key capabilities that make GPT-3 special (a toy prompt illustrating both is sketched after this list):
- Few-shot learning, i.e. the ability to learn a task from only a few examples given in the prompt.
- Chain-of-thought reasoning, i.e. writing out its reasoning on hard tasks when asked to.
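As a toy illustration of both (the prompt text below is hypothetical and not taken from the paper; sending it to an actual model is left out):

```python
# A hypothetical few-shot, chain-of-thought prompt: two worked examples followed by a
# new question. The exact wording and any model/API used to complete it are assumptions.
few_shot_cot_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of 3 balls each. How many balls does he have now?
A: Let's think step by step. He starts with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11. The answer is 11.

Q: A library had 120 books, lent out 45, and then received 30 new ones. How many books does it have now?
A: Let's think step by step."""

# The prompt would be sent to an LLM; a capable model's continuation should spell out
# the reasoning and then give the answer (120 - 45 + 30 = 105).
print(few_shot_cot_prompt)
```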
LLMs seem to learn and use representations of the outside world.
Some examples (see the paper for the exact references for each of these):
- Models' internal representations of colour words closely track objective facts about human colour perception.
- Models can infer what the author of a document knows or believes and use this to predict how the document continues.
- Models use internal representations of the properties and locations of objects in stories, and these evolve as more information about the objects is revealed.
- Models can sometimes give instructions on how to draw novel objects. (#DOUBT).
- Models trained to play board games from individual game moves (without ever being given a full description of the game) learn internal representations of the state of the board at each turn.
- Models can distinguish common misconceptions from true facts.
- Models pass many tests of common-sense reasoning, e.g. the Winograd Schema Challenge.
There are no reliable techniques for steering LLMs, and they are not interpretable.
Pretraining an LLM refers to the text-continuation prediction stage; to use the model for other tasks requires some form of adaptation. For example, even for instruction-following tasks, without adaptation the model will attempt to continue the instructions instead of following them. Adaptation involves some of the following methods (a toy fine-tuning sketch is given after this list):
- Plain language model prompting: preparing an incomplete text such that its continuation solves the intended task.
- Supervised fine-tuning: the model is trained to match high-quality human demonstrations of the task.
- Reinforcement learning (e.g. RLHF): the model is updated based on human preference judgements between its outputs.
Note that these methods do not guarantee any particular behaviour from the model.
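As a minimal sketch of what the supervised fine-tuning option looks like mechanically (the tiny stand-in model, the random "demonstration" tokens, and the hyperparameters below are assumptions for illustration, not the setup used for any real LLM):

```python
import torch
import torch.nn as nn

vocab_size, dim = 100, 32

# A tiny stand-in language model: embed tokens, predict a distribution over the next token.
class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens):                     # tokens: (batch, seq_len)
        return self.head(self.embed(tokens))       # logits: (batch, seq_len, vocab_size)

model = TinyLM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One (prompt + human demonstration) sequence, already mapped to token ids.
# Random here; in real fine-tuning it would be curated demonstration data.
tokens = torch.randint(0, vocab_size, (1, 16))

for step in range(100):
    logits = model(tokens[:, :-1])                   # predict the next token at each position
    loss = loss_fn(logits.reshape(-1, vocab_size),   # the same next-token loss as pretraining,
                   tokens[:, 1:].reshape(-1))        # but computed only on demonstration data
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Note that the objective here is the same next-token prediction used in pretraining; only the data changes.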
Some other interesting phenomena: sycophancy (the model answers subjective questions in ways that flatter the user's stated beliefs) and sandbagging (the model is more likely to endorse common misconceptions when the user appears to be less educated).
As of 2023, there is no method that allows us to know what kinds of knowledge, reasoning or goals a neural network model uses when producing its output.
Human performance is not an upper bound on LLM performance.
LLMs are trained on far more data than any single human could ever read - they potentially (#DOUBT Why does the author put "potentially"?) outperform humans on many tasks.
LLMs do not necessarily express the values of their programmers nor the values encoded in the data they are trained on.
Pretrained LLMs produce output that reflects the values in the texts they are trained on, but their behaviour can subsequently be shaped by their developers; this means there is room for third-party input and oversight. Some methods here:
- Red teaming: Allows developers to guide a model towards a persona and set of values more or less of their choosing.
- Constitutional AI: Reduces the human input needed for this; models are trained to follow norms and values written down as a list of constraints, called the constitution.
Brief interactions with LLMs are often misleading. (#NOTE I guess they mean things like the "Let's think step by step" prompts?)
Models often fail a task at first but do it correctly if the task is rephrased - this has given rise to what people call prompt engineering. Often, once an appropriate prompt has been found, the model consistently does well on different instances of that task.
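A hypothetical illustration of the sort of rephrasing prompt engineering involves (the wording below, and the suggestion that the second phrasing works better, are assumptions rather than examples from the paper):

```python
# Two phrasings of the same sentiment-classification task. In practice one phrasing
# often works noticeably better, and once found it is reused as a template across
# many instances of the task. (Hypothetical example, not from the paper.)
review = "The movie was a complete waste of two hours."

naive_prompt = f"Is this review positive? {review}"

engineered_prompt = (
    "Classify the sentiment of the movie review as Positive or Negative.\n"
    f"Review: {review}\n"
    "Sentiment:"
)

print(naive_prompt)
print(engineered_prompt)
```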
#TODO Continue from Section 9 onwards.