Effective Use of HPC for Deep Learning
- Optimize single-node/single-GPU performance first
- Use performance analysis tools
- Tune and optimize data pipeline
- Make effective use of the hardware (e.g. mixed precision)
- Before running on multiple nodes, make sure the job can scale well to 8 GPUs on a single node. Never do one GPU per node for multinode jobs.
- Use multiple CPU cores for data loading.
- For hyperparameter tuning, consider using a Slurm job array. This allows you to run multiple jobs with one sbatch command.
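Each task in a job array can use its array index to pick one hyperparameter combination from a shared grid. A minimal sketch of that mapping, assuming a hypothetical grid of learning rates and batch sizes (Slurm itself sets SLURM_ARRAY_TASK_ID for each task):

```python
import os
from itertools import product

def select_hyperparameters(grid, task_id=None):
    """Pick one grid point per Slurm array task.

    Slurm sets SLURM_ARRAY_TASK_ID for each job in the array,
    e.g. sbatch --array=0-5 covers a 6-point grid.
    """
    if task_id is None:
        task_id = int(os.environ["SLURM_ARRAY_TASK_ID"])
    return grid[task_id]

# Hypothetical grid; the parameter names and values are placeholders.
grid = list(product([1e-2, 1e-3, 1e-4], [32, 64]))  # 6 combinations
```

Every array task builds the same grid in the same order, so task 0 trains with (1e-2, 32), task 5 with (1e-4, 64), and so on, with no coordination between jobs.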
You will likely want your data-loading code to look like this:

train_dataset = (datasets['train']
    .map(preprocess_data, num_parallel_calls=tf.data.AUTOTUNE)
    .cache()
    .shuffle(SHUFFLE_SIZE)
    .batch(BATCH_SIZE)
    .prefetch(tf.data.AUTOTUNE))
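The num_parallel_calls argument in the pipeline above is what spreads preprocessing across multiple CPU cores. Outside tf.data, the same idea can be sketched with the standard library; load_and_preprocess here is a stand-in for real file I/O and decoding:

```python
from concurrent.futures import ThreadPoolExecutor

def load_and_preprocess(path):
    # Placeholder for real work (read file, decode, augment);
    # here we just produce one "example" per input path.
    return f"example-from-{path}"

def parallel_load(paths, num_workers=4):
    # Fan the per-file work out across worker threads, mirroring
    # what num_parallel_calls does inside a tf.data pipeline.
    # pool.map preserves the input order of the results.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(load_and_preprocess, paths))
```

For I/O-bound loading, threads are usually enough; for CPU-heavy decoding, a process pool (or letting tf.data manage the parallelism) avoids contention on Python's GIL.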
Questions or need more help?
Office Hours via Zoom:
- Tuesdays: 3 pm - 5 pm
- Thursdays: 10 am - noon
Zoom links are available at https://www.rc.virginia.edu/support/#office-hours
- Website: https://rc.virginia.edu