Batch normalization normalizes the data based on the dataset’s distribution. Ideally, we would estimate the distribution from the entire dataset, but since that is infeasible during training, BatchNorm layers instead compute the statistics of each mini-batch on the fly.
The paper Rethinking “Batch” in BatchNorm by Facebook AI Research showed that these mini-batch-based statistics are sub-optimal. Instead, the population statistics (the mean and standard deviation) should be estimated across several mini-batches while keeping the trainable parameters fixed. This method, known as Precise BatchNorm, helps improve both the stability and the performance of a model.
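The core idea can be illustrated with a minimal NumPy sketch (hypothetical data, not the paper's implementation): instead of an exponential moving average of per-batch statistics, accumulate sufficient statistics (sum and sum of squares) over several mini-batches and recover the dataset-level mean and variance exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical dataset of 1D activations, split into 100 mini-batches.
data = rng.normal(loc=2.0, scale=3.0, size=10_000)
batches = np.split(data, 100)

# Accumulate sufficient statistics across mini-batches, with the
# (hypothetical) model's trainable parameters held fixed.
n = 0
sum_x = 0.0   # running sum of x
sum_x2 = 0.0  # running sum of x^2
for batch in batches:
    n += batch.size
    sum_x += batch.sum()
    sum_x2 += np.square(batch).sum()

precise_mean = sum_x / n
precise_var = sum_x2 / n - precise_mean ** 2

# The aggregated statistics match those of the full dataset.
print(np.isclose(precise_mean, data.mean()))  # True
print(np.isclose(precise_var, data.var()))    # True
```

Because sums are associative, this aggregation recovers the exact full-dataset statistics regardless of how the data is partitioned into mini-batches, which is what makes the estimate "precise" compared to a moving average that weights recent batches more heavily.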