
Weight Averaging

A post-training method that takes the weights of the best checkpoints saved during training and averages them into a single model. Doing so counters the optimizer's tendency to oscillate between adjacent local minima in the later stages of training.

This trick doesn't affect the training itself, other than keeping a few additional checkpoints on disk, and it can yield a substantial boost in performance and stability.
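As a rough illustration of the averaging step, the sketch below takes the element-wise mean of a few checkpoints. It is a minimal, framework-agnostic example and not the implementation of any particular library: the checkpoint format (a dict mapping parameter names to flat lists of floats), the function name `average_checkpoints`, and the toy values are all assumptions made for illustration. It also assumes every checkpoint has the same parameter names and shapes.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of weights.
Checkpoint = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[Checkpoint]) -> Checkpoint:
    """Return the element-wise mean of checkpoints with identical keys and shapes."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged: Checkpoint = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(col) / len(checkpoints) for col in columns]
    return averaged


# Toy stand-ins for the best few checkpoints saved during training.
best_checkpoints = [
    {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]},
    {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]},
    {"layer.weight": [2.0, 6.0], "layer.bias": [2.0]},
]
averaged = average_checkpoints(best_checkpoints)
```

With real models the same loop would run over each framework's tensors rather than Python lists. Note that a plain mean is only one possible choice; which checkpoints to pick and how to weight them is left open here.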


				
from transformers import AutoFeatureExtractor, AutoModelForImageClassification

# Load the image preprocessor and the pretrained ResNet-50 classifier.
extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")
