1. Introduction
2. Internal Covariate Shift
Internal covariate shift refers to the change in the distribution of a layer's activations during training. As the parameters of earlier layers are updated, later layers receive inputs whose distributions keep shifting, which can slow down training and make the network harder to optimize. Batch Normalization mitigates this by standardizing the activations of each layer over every mini-batch, normalizing each feature to zero mean and unit variance, which leads to faster and more stable convergence.
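In the original formulation, the standardized activations are also passed through a learnable scale (gamma) and shift (beta) so the network can recover the original representation if that is what training favors. Below is a minimal NumPy sketch of this forward pass, assuming a 2-D input of shape (batch, features); the function name batch_norm_forward and the initialization of gamma to ones and beta to zeros are illustrative choices, not part of any particular library.

import numpy as np

def batch_norm_forward(X, gamma, beta, epsilon=1e-5):
    # Per-feature statistics computed over the mini-batch (axis 0)
    mean = X.mean(axis=0)
    var = X.var(axis=0)
    # Standardize each feature to zero mean and unit variance
    X_hat = (X - mean) / np.sqrt(var + epsilon)
    # Learnable scale and shift restore representational capacity
    return gamma * X_hat + beta

# Illustrative usage: a mini-batch of 4 examples with 3 features
X = np.random.randn(4, 3)
gamma = np.ones(3)   # scale, learned during training
beta = np.zeros(3)   # shift, learned during training
out = batch_norm_forward(X, gamma, beta)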
3. History
Reference: Ioffe, S., & Szegedy, C. (2015, June). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning (pp. 448-456). PMLR.
4. Code
The following NumPy implementation normalizes each feature of a batch to zero mean and unit variance:
import numpy as np

def batch_normalize(X, epsilon=1e-5):
    # Compute the per-feature mean and variance of the mini-batch
    batch_mean = np.mean(X, axis=0)
    batch_var = np.var(X, axis=0)
    # Normalize each feature to zero mean and unit variance
    X_normalized = (X - batch_mean) / np.sqrt(batch_var + epsilon)
    return X_normalized

# Sample input data (5 data points with 3 features each)
data = np.array([[ 1,  2,  3],
                 [ 4,  5,  6],
                 [ 7,  8,  9],
                 [10, 11, 12],
                 [13, 14, 15]])

# Perform batch normalization on the data
normalized_data = batch_normalize(data)

print("Original data:\n", data)
print("\nNormalized data:\n", normalized_data)
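For this sample data, each column has mean 7, 8, or 9 and variance 18, so the normalized output is approximately:

[[-1.4142, -1.4142, -1.4142],
 [-0.7071, -0.7071, -0.7071],
 [ 0.    ,  0.    ,  0.    ],
 [ 0.7071,  0.7071,  0.7071],
 [ 1.4142,  1.4142,  1.4142]]

Each column of the result has zero mean and (up to epsilon) unit variance, which is exactly the per-feature standardization described in Section 2. Note that this sketch omits the learnable scale and shift parameters (gamma and beta) from the original method; the sketch after Section 2 shows where they would enter.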