Transformer-specific Latency Advantages

Figure: Latency reductions per transformer module.

The diagram above highlights the key transformer components and shows how latency can be reduced in each of them by using optimized operations, such as a fused QKV projection and faster bias-add kernels.
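To make "fused QKV" concrete, the following is a minimal sketch in plain Python, not DeepSpeed's implementation. It assumes the usual formulation in which the query, key, and value projections each apply their own weight matrix and bias to the same input. All function names here (`matmul`, `add_bias`, `separate_qkv`, `fused_qkv`) are hypothetical and chosen for illustration only. The sketch shows only that the fused form is algebraically equivalent to the three separate projections. On a GPU, the benefit typically comes from launching one matrix multiply and one bias add instead of three of each.

```python
import random


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_bias(m, bias):
    """Add a bias vector to every row of matrix m."""
    return [[v + b for v, b in zip(row, bias)] for row in m]


def separate_qkv(x, wq, wk, wv, bq, bk, bv):
    """Unfused baseline: three projections, each with its own bias add."""
    q = add_bias(matmul(x, wq), bq)
    k = add_bias(matmul(x, wk), bk)
    v = add_bias(matmul(x, wv), bv)
    return q, k, v


def fused_qkv(x, wq, wk, wv, bq, bk, bv):
    """Fused form: one projection against concatenated weights, one bias add, one split."""
    d = len(wq[0])  # output width of each projection
    # Concatenate the weights column-wise (hidden x 3d) and the biases (3d).
    # In a real system this would be done once at model load time, not per call.
    w_qkv = [rq + rk + rv for rq, rk, rv in zip(wq, wk, wv)]
    b_qkv = bq + bk + bv
    qkv = add_bias(matmul(x, w_qkv), b_qkv)
    q = [row[:d] for row in qkv]
    k = [row[d:2 * d] for row in qkv]
    v = [row[2 * d:] for row in qkv]
    return q, k, v


if __name__ == "__main__":
    random.seed(0)
    hidden, d, tokens = 4, 3, 2

    def rand_matrix(rows, cols):
        return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    x = rand_matrix(tokens, hidden)
    weights = [rand_matrix(hidden, d) for _ in range(3)]
    biases = [rand_matrix(1, d)[0] for _ in range(3)]

    ref = separate_qkv(x, *weights, *biases)
    out = fused_qkv(x, *weights, *biases)
    # Compare element-wise with a small tolerance.
    same = all(
        abs(a - b) < 1e-12
        for m_ref, m_out in zip(ref, out)
        for r_ref, r_out in zip(m_ref, m_out)
        for a, b in zip(r_ref, r_out)
    )
    print("fused matches separate:", same)
```

The two functions return the same Q, K, and V matrices; the fused version simply reads the input once and performs a single wider multiply, which is the property an optimized kernel can exploit.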