Lower Numerical Precision Deep Learning Inference and Training

Most commercial deep learning applications today use 32-bits of floating point precision (đť‘“đť‘ť32) for training and inference workloads. Various researchers have demonstrated that both deep learning training and inference can be performed with lower numerical precision, using 16-bit multipliers for training and 8-bit multipliers or fewer for inference with minimal to no loss in accuracy (higher precision – 16-bits vs. 8-bits – is usually needed during training to accurately represent the gradients during the backpropagation phase). Using these lower numerical precisions (training with 16-bit multipliers accumulated to 32-bits or more and inference with 8-bit multipliers accumulated to 32-bits) will likely become the standard over the next year, in particular for convolutional neural networks (CNNs)…

Download File