Terminology
LLMs can have billions of parameters (the numerical weights in the model whose values are learned during training).
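To make "billions of parameters" concrete, here is a back-of-envelope count for a small transformer. The configuration below (vocabulary size, model width, layer count, etc.) is an illustrative assumption modeled on a GPT-2-small-like architecture, not a definition from this text:

```python
# Rough parameter count for a hypothetical GPT-2-small-like transformer.
vocab_size, d_model, n_layers, d_ff, ctx_len = 50257, 768, 12, 3072, 1024

token_embedding = vocab_size * d_model     # one learned vector per vocabulary token
position_embedding = ctx_len * d_model     # one learned vector per position

per_layer = (
    d_model * 3 * d_model + 3 * d_model    # attention Q/K/V projection (+ bias)
    + d_model * d_model + d_model          # attention output projection (+ bias)
    + d_model * d_ff + d_ff                # feed-forward up projection (+ bias)
    + d_ff * d_model + d_model             # feed-forward down projection (+ bias)
    + 4 * d_model                          # two LayerNorms (scale + shift each)
)

total = token_embedding + position_embedding + n_layers * per_layer + 2 * d_model  # + final LayerNorm
print(f"{total:,} parameters")  # roughly 124 million
```

Even this "small" configuration lands around 124 million parameters; scaling the width and depth up by modest factors is what pushes modern LLMs into the billions.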
An LLM is trained (model parameters are determined) using a very large amount of text data.
A pre-trained LLM has already been trained on a large, general body of text. This pre-training is how the model “learns” the language.
A fine-tuned LLM is a pre-trained LLM that is then further trained on additional data for a specific task. Model parameters are updated in the fine-tuning process.
Inference is the process in which a trained (or fine-tuned) LLM makes a prediction for a given input.
The input to an LLM is a sequence of tokens (words, characters, subwords, etc.).
The batch size for an LLM is the number of sequences passed to the model at once.
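A small sketch of what a batch looks like, assuming sequences are already token IDs. The specific ID values and the pad ID of 0 are hypothetical; shorter sequences are typically padded so the batch forms a rectangular array:

```python
# A batch is several token-ID sequences passed to the model at once.
sequences = [
    [101, 7592, 2088, 102],        # sequence 1: 4 tokens (IDs are made up)
    [101, 2129, 2024, 2017, 102],  # sequence 2: 5 tokens
]

batch_size = len(sequences)                 # number of sequences in the batch
max_len = max(len(s) for s in sequences)    # longest sequence sets the width

# Pad shorter sequences (hypothetical pad ID 0) so every row has equal length.
batch = [s + [0] * (max_len - len(s)) for s in sequences]

print(batch_size, max_len)  # 2 5
```

Larger batch sizes let the model process more sequences per forward pass at the cost of more memory.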
Generally, raw text is passed through a tokenizer, which splits it into tokens and maps each token to a numerical ID; these ID sequences are what is actually sent to the LLM.
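The text-to-IDs pipeline can be sketched with a toy whitespace tokenizer. The vocabulary and `<unk>` (unknown-token) entry below are hand-built assumptions for illustration; real tokenizers typically split text into subwords (e.g. with BPE) and have vocabularies of tens of thousands of entries:

```python
# Toy tokenizer pipeline: raw text -> tokens -> numerical IDs.
# Hypothetical tiny vocabulary; <unk> catches out-of-vocabulary tokens.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}

def tokenize(text):
    """Split raw text into tokens (here: lowercased whitespace words)."""
    return text.lower().split()

def encode(text):
    """Map each token to its numerical ID, falling back to <unk>."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(text)]

print(encode("The cat sat"))  # [1, 2, 3]
print(encode("the dog sat"))  # [1, 0, 3]  ("dog" is out of vocabulary)
```

The resulting lists of integers are the sequences the LLM consumes; batching simply stacks several of them together.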