Batch Accumulation

When you use a model ‘off the shelf,’ it usually comes with a suggested training recipe. These recipes are typically developed on very powerful GPUs, so the recommended batch size may not fit in the memory of your target hardware. Reducing the batch size to fit your hardware will likely require retuning other hyperparameters, such as the learning rate, and you won’t always reproduce the original training results.

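To overcome this issue, you can run several consecutive forward and backward passes over the model, accumulate the resulting gradients, and update the weights only once every few batches. This mechanism is known as batch accumulation (also called gradient accumulation). The accumulated update approximates the one you would get from a single larger batch, while memory usage stays at the level of the small batch.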
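As a rough illustration of the mechanism, here is a minimal sketch in PyTorch. The helper name `train_with_accumulation`, the toy linear model, the random data, and the batch sizes are all assumptions made for this example and are not part of any particular training recipe. The sketch accumulates gradients over four micro-batches of 8 samples and compares the resulting weights against a single update on the full batch of 32.

```python
import copy

import torch
from torch import nn


def train_with_accumulation(model, optimizer, loss_fn, batches, accumulation_steps):
    """Run one pass over `batches`, updating weights every `accumulation_steps` micro-batches.

    Hypothetical helper for illustration. Returns the number of optimizer updates.
    Gradients from trailing micro-batches that do not fill a full group are not applied.
    """
    model.train()
    optimizer.zero_grad()
    updates = 0
    for i, (inputs, targets) in enumerate(batches, start=1):
        loss = loss_fn(model(inputs), targets)
        # Scale the loss so the summed gradients match the mean over the
        # larger effective batch (assumes a mean-reduced loss).
        (loss / accumulation_steps).backward()
        if i % accumulation_steps == 0:
            optimizer.step()       # one weight update per group of micro-batches
            optimizer.zero_grad()  # start accumulating from zero again
            updates += 1
    return updates


# Toy data and model (assumed values, for illustration only).
torch.manual_seed(0)
features = torch.randn(32, 4)
labels = torch.randn(32, 1)

micro_batch_size = 8
accumulation_steps = 4  # effective batch size = 8 * 4 = 32

model = nn.Linear(4, 1)
reference = copy.deepcopy(model)  # same initial weights, trained on the full batch

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
micro_batches = list(zip(features.split(micro_batch_size), labels.split(micro_batch_size)))
num_updates = train_with_accumulation(
    model, optimizer, nn.MSELoss(), micro_batches, accumulation_steps
)

# Reference: a single update computed on all 32 samples at once.
reference_optimizer = torch.optim.SGD(reference.parameters(), lr=0.1)
reference_optimizer.zero_grad()
nn.MSELoss()(reference(features), labels).backward()
reference_optimizer.step()

weights_match = torch.allclose(model.weight, reference.weight, atol=1e-6)
```

For this simple model the two approaches produce the same update up to floating-point rounding. That equivalence is not guaranteed in general: layers such as batch normalization compute statistics per micro-batch, so results can differ from true large-batch training.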

from transformers import AutoFeatureExtractor, AutoModelForImageClassification

extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")
