Tensor Parallelism (Intra-Layer)

Tensor parallelism (intra-layer) splits the computation on individual tensors, such as a layer's weight matrices, across GPUs. Each GPU holds one part of the tensor and computes on that part in parallel with the others. This works well for transformer layers. However, the partial results must be combined across GPUs, so it requires frequent communication and a fast interconnect (e.g. NVLink, which UVA HPC has in the BasePOD).
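
To make the split concrete, here is a minimal sketch in plain Python that simulates two common ways of partitioning a matrix product Y = XW across devices. It runs on a single CPU with no GPUs or communication library involved; the "devices" are list entries, and the function names (`column_parallel_matmul`, `row_parallel_matmul`, `split_columns`) are hypothetical, chosen only for illustration. It also assumes the split dimension divides evenly by the number of devices. The concatenation and summation steps stand in for the inter-GPU communication that a real framework would perform.

```python
def matmul(a, b):
    """Plain-Python matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def split_columns(m, parts):
    """Split matrix m into `parts` equal-width column blocks (assumes even division)."""
    width = len(m[0]) // parts
    return [[row[p * width:(p + 1) * width] for row in m] for p in range(parts)]

def column_parallel_matmul(x, w, num_devices):
    """Each simulated device holds a column block of W and computes a slice of Y."""
    w_shards = split_columns(w, num_devices)
    partials = [matmul(x, shard) for shard in w_shards]  # independent per device
    # Concatenating the column slices stands in for a gather across devices.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

def row_parallel_matmul(x, w, num_devices):
    """Each simulated device holds a row block of W and the matching columns of X."""
    x_shards = split_columns(x, num_devices)
    rows = len(w) // num_devices
    w_shards = [w[p * rows:(p + 1) * rows] for p in range(num_devices)]
    partials = [matmul(xs, ws) for xs, ws in zip(x_shards, w_shards)]
    # Summing the full-size partial outputs stands in for a reduction across devices.
    return [[sum(vals) for vals in zip(*same_row)] for same_row in zip(*partials)]

x = [[1, 2, 3, 4], [5, 6, 7, 8]]
w = [[1, 0], [0, 1], [2, 1], [1, 2]]
reference = matmul(x, w)

# Both partitioning schemes reproduce the unsplit result.
assert column_parallel_matmul(x, w, 2) == reference
assert row_parallel_matmul(x, w, 2) == reference
```

In both schemes every device does only a fraction of the arithmetic, but the final combine step touches data from all devices on every layer, which is why interconnect speed matters so much for this form of parallelism.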

Source: https://colossalai.org/docs/concepts/paradigms_of_parallelism/#tensor-parallel

Last updated on Jul 8, 2025