Weight Averaging

Share

A post-training method that takes the best model weights across the training and averages them into a single model. By doing so, we overcome the optimization tendency to alternate between adjacent local minimas in the later stages of the training.

This trick doesn’t affect the training whatsoever, other than keeping a few additional weights on the disk, and can yield a substantial boost in performance and stability.

Filter terms by

Glossary Alphabetical filter

Related resources

sg-w&b-integration
Training
featured image for how to measure inference time
Deployment
resnet50-how-to-achieve-SOTA-accuracy-on-imagenet
Computer Vision
Share
Add Your Heading Text Here
				
					from transformers import AutoFeatureExtractor, AutoModelForImageClassification

extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")

model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")