or: Comparison Of Batch Normalization And Weight Normalization Algorithms For The Large-Scale Image Classification

A clear explanation of BatchNorm and WeightNorm as applied to ImageNet classification using ResNet50.

BatchNorm achieves higher accuracy than WeightNorm, but it accounts for 25% of the training time.
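
For orientation, here is a minimal sketch of the two techniques using their standard textbook definitions. It is not the paper's ResNet50 implementation. The function names, the scalar-feature simplification, and the default values for `gamma`, `beta`, `g`, and `eps` are all assumptions made for illustration. BatchNorm normalizes activations using statistics computed over the mini-batch, while WeightNorm only rescales a weight vector, which hints at why their runtime costs can differ.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one scalar feature across a mini-batch of activations.

    Illustrative only: real layers do this per channel over tensors.
    """
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance of the mini-batch.
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


def weight_norm(v, g=1.0):
    """Reparameterize a weight vector as w = g * v / ||v||.

    The direction comes from v, the length from the scalar g.
    """
    norm = math.sqrt(sum(c * c for c in v))
    return [g * c / norm for c in v]


# Hypothetical inputs, chosen only to make the effect easy to see.
activations = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(activations)  # roughly zero mean, unit variance

w = weight_norm([3.0, 4.0], g=2.0)    # same direction as v, length g
```

Note that `batch_norm` has to read every activation in the batch to compute the mean and variance, whereas `weight_norm` touches only the weights and is independent of the batch.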