Quantization Aware Training


Quantization Aware Training (QAT) is a technique that introduces additional steps during training to prepare a model for deployment in 8-bit integer (INT8) precision. If you do not plan to deploy in 8-bit, QAT is an unnecessary complication; otherwise, it can be a very effective approach.

The name speaks for itself: training is performed with the awareness that inference will run in INT8. The forward pass simulates INT8 rounding ("fake quantization"), so the weights learn to compensate for quantization error. The result is typically a much faster model with little to no loss of accuracy.
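As a concrete illustration, here is a minimal QAT sketch using PyTorch's eager-mode quantization API (an assumed toolchain; the article does not prescribe a specific framework, and `TinyNet` is a hypothetical stand-in for a real model):

```python
import torch
import torch.nn as nn
from torch.ao import quantization as tq

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # marks where fp32 -> INT8 conversion happens
        self.fc1 = nn.Linear(16, 32)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(32, 4)
        self.dequant = tq.DeQuantStub()  # marks where INT8 -> fp32 conversion happens

    def forward(self, x):
        x = self.quant(x)
        x = self.fc2(self.relu(self.fc1(x)))
        return self.dequant(x)

model = TinyNet().train()

# Attach fake-quantization observers that simulate INT8 rounding during
# the forward pass, so the weights adapt to quantization error.
model.qconfig = tq.get_default_qat_qconfig("fbgemm")
qat_model = tq.prepare_qat(model)

# Ordinary training loop (here: a few steps on random data).
opt = torch.optim.SGD(qat_model.parameters(), lr=1e-2)
for _ in range(5):
    loss = qat_model(torch.randn(8, 16)).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, convert the fake-quantized model into a real INT8 model.
int8_model = tq.convert(qat_model.eval())
out = int8_model(torch.randn(8, 16))  # inference now uses INT8 linear kernels
```

The `QuantStub`/`DeQuantStub` pair marks the boundary of the quantized region; everything between them runs in INT8 after conversion, while inputs and outputs stay in fp32. The `"fbgemm"` backend targets x86 CPUs; other backends exist for other hardware.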

For example, loading a pretrained ResNet-50 image-classification model and its preprocessing pipeline from the Hugging Face Hub:

# Load ResNet-50 and its preprocessor from the Hugging Face Hub.
# (Newer transformers versions prefer AutoImageProcessor over AutoFeatureExtractor.)
from transformers import AutoFeatureExtractor, AutoModelForImageClassification

extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")