Weight Averaging


A post-training method that takes the best-performing model weights saved over the course of training and averages them into a single model. Averaging counters the optimizer's tendency to alternate between adjacent local minima in the later stages of training.

This trick does not affect training at all, other than requiring a few additional sets of weights to be kept on disk, and it can yield a substantial boost in performance and stability.
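The averaging step itself is simple enough to sketch in a few lines. The example below is a minimal illustration rather than a reference implementation: it assumes each saved checkpoint is a pair of a validation score (higher is better) and a dictionary mapping parameter names to flat lists of floats. The function name `average_best_checkpoints`, the `top_k` parameter, and the sample values are all hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Tuple

# Assumed representation: parameter name -> flattened parameter values.
Weights = Dict[str, List[float]]


def average_best_checkpoints(
    checkpoints: List[Tuple[float, Weights]], top_k: int = 3
) -> Weights:
    """Average the top_k checkpoints, ranked by validation score (higher is better)."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    # Keep only the best-scoring checkpoints.
    best = sorted(checkpoints, key=lambda c: c[0], reverse=True)[:top_k]
    averaged: Weights = {}
    for name in best[0][1]:
        # zip(*...) groups the same parameter position across all kept checkpoints.
        columns = zip(*(weights[name] for _, weights in best))
        averaged[name] = [sum(col) / len(best) for col in columns]
    return averaged


# Hypothetical checkpoints: (validation accuracy, weights).
saved = [
    (0.80, {"w": [1.0, 2.0], "b": [0.0]}),
    (0.90, {"w": [3.0, 4.0], "b": [1.0]}),
    (0.85, {"w": [5.0, 6.0], "b": [2.0]}),
    (0.60, {"w": [100.0, 100.0], "b": [50.0]}),  # poor checkpoint, excluded by top_k
]
final_weights = average_best_checkpoints(saved, top_k=3)
```

In a real framework the same idea would be applied to the tensors in each saved checkpoint instead of Python lists. Note that layers with running statistics, such as batch normalization, may need those statistics recomputed after averaging.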
