Selecting a GPU for Fine-Tuning
For fine-tuning, select a GPU based on how much GPU memory you will need.
However, determining how much GPU memory an LLM will need for training, before you actually train the model, is a hard problem.
A training iteration requires a forward and a backward pass through the model. So in addition to storing the LLM itself, training also needs GPU memory for the following (a rough accounting is sketched after the list):
- Optimizer states
- Gradients
- Activations
- Data (the amount is determined by the batch size)
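To make these components concrete, here is a minimal sketch of one common accounting, assuming full fine-tuning in fp32 with the Adam optimizer. The per-parameter byte counts are rules of thumb (of the kind covered in the EleutherAI post linked below), not exact values, and the function name and 7B example are purely illustrative:

```python
def estimate_finetuning_memory_gb(num_params: float) -> dict:
    """Rough per-component memory accounting for full fine-tuning in fp32 with Adam.

    Assumptions (a common rule of thumb, not an exact model):
      - weights stored in fp32 (4 bytes per parameter)
      - one fp32 gradient per parameter (4 bytes)
      - Adam keeps two fp32 states per parameter (8 bytes total)
    Activation and data-batch memory are omitted because they depend on
    batch size, sequence length, and architecture; measure those empirically.
    """
    gb = 1024 ** 3
    weights = num_params * 4 / gb            # the LLM itself
    gradients = num_params * 4 / gb          # produced by the backward pass
    optimizer_states = num_params * 8 / gb   # Adam momentum + variance
    return {
        "weights_gb": weights,
        "gradients_gb": gradients,
        "optimizer_states_gb": optimizer_states,
        "static_total_gb": weights + gradients + optimizer_states,
    }

# Example: a hypothetical 7B-parameter model
print(estimate_finetuning_memory_gb(7e9))
```

Under these assumptions, a 7B-parameter model works out to roughly 104 GB of static memory, i.e. 4x the ~26 GB of fp32 weights, which lines up with the rule of thumb quoted next.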
According to the Hugging Face Model Memory Calculator, for a batch size of 1, $$ \text{GPU Memory Estimate for Fine-Tuning (B)} = 4 \times (\text{LLM Memory in B})$$
While this formula can help you ballpark an estimate, I recommend tracking GPU memory using the GPU Dashboard to make a more informed GPU selection.
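If you want a programmatic check to complement the GPU Dashboard, PyTorch's CUDA memory counters can report the peak memory actually allocated during a training step. A minimal sketch, assuming a PyTorch-based fine-tuning loop (`train_step` is a placeholder for your own single-step training function):

```python
import torch

def measure_peak_memory(train_step) -> float:
    """Run one training step and report the peak GPU memory it allocated.

    `train_step` is a placeholder for your own function that performs a
    single forward/backward/optimizer-update pass; the memory counters
    are the point of this sketch.
    """
    torch.cuda.reset_peak_memory_stats()
    train_step()
    peak_gb = torch.cuda.max_memory_allocated() / 1024 ** 3
    print(f"Peak GPU memory during one training step: {peak_gb:.1f} GB")
    return peak_gb
```

Keep in mind that `max_memory_allocated` only counts tensors managed by PyTorch's caching allocator; the CUDA context and allocator fragmentation add some overhead on top, which is one more reason to leave headroom when selecting a GPU.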
For a more detailed formula, see https://blog.eleuther.ai/transformer-math/ (note that this post assumes some understanding of transformers).
Estimating LLM memory requirements for fine-tuning is an active area of research. The paper LLMem: Estimating GPU Memory Usage for Fine-Tuning Pre-Trained LLMs by Kim et al. (April 2024) presents a method for doing so within 3% of the actual memory required.