Quantization Aware Training

A technique that introduces additional steps during training to prepare your model for deployment in 8-bit. If you don't plan to deploy in 8-bit, QAT is an unnecessary complication; otherwise, it can be a very effective approach.

The name speaks for itself: training is performed with awareness that inference will be done in INT8. During training, "fake quantization" operations simulate the rounding and clipping of 8-bit arithmetic, so the model learns weights that are robust to it. The result is a much faster model with little or no loss in accuracy.
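As a concrete illustration, here is a minimal sketch of eager-mode QAT using PyTorch's `torch.ao.quantization` API. The tiny model, training loop, and data are all placeholders; the essential steps are attaching a QAT `qconfig`, calling `prepare_qat` to insert fake-quantization observers, fine-tuning, and finally calling `convert` to produce a true INT8 model:

```python
import torch
import torch.nn as nn
import torch.ao.quantization as tq


class TinyNet(nn.Module):
    """Placeholder model; QuantStub/DeQuantStub mark the INT8 region."""

    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # converts float input to quantized
        self.fc = nn.Linear(16, 4)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()  # converts quantized output back to float

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.fc(x))
        return self.dequant(x)


model = TinyNet()
model.train()

# Attach a QAT config and insert fake-quantization modules.
model.qconfig = tq.get_default_qat_qconfig("fbgemm")  # x86 backend
tq.prepare_qat(model, inplace=True)

# Dummy fine-tuning loop: the model now trains "aware" of INT8 rounding.
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
for _ in range(10):
    x, y = torch.randn(8, 16), torch.randn(8, 4)
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# Convert the trained model to a real INT8 model for deployment.
model.eval()
int8_model = tq.convert(model)
out = int8_model(torch.randn(2, 16))
```

After `convert`, the linear layer stores genuine `qint8` weights, so inference runs with 8-bit kernels rather than simulated quantization.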
