Quantization

In deep learning, quantization is the process of replacing floating-point weights and/or activations with compact, low-precision representations such as 8-bit integers. This reduces a network's memory footprint and the computational cost of inference, which is especially valuable for edge deployments. Quantization is one of several optimization techniques for shrinking neural networks while preserving as much accuracy as possible.
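
As a minimal sketch of the idea, the example below applies PyTorch's dynamic post-training quantization to a small, hypothetical fully connected model; the model definition, layer sizes, and input shape are assumptions chosen only for illustration.

import torch
import torch.nn as nn

# A small full-precision (FP32) model, used only for illustration.
model_fp32 = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Dynamic post-training quantization: weights of the selected layer types
# are stored as 8-bit integers, and activations are quantized on the fly
# at inference time.
model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32,
    {nn.Linear},        # layer types to quantize
    dtype=torch.qint8,  # 8-bit signed integer weights
)

# Inference works as before, but the linear layers now use int8 kernels.
x = torch.randn(1, 128)
print(model_int8(x).shape)  # torch.Size([1, 10])

Dynamic quantization is only one flavor: static post-training quantization and quantization-aware training trade extra calibration or training effort for better accuracy at low precision.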

For reference, the snippet below loads a pretrained, full-precision ResNet-50 image classifier and its feature extractor from the Hugging Face Hub; a full-precision model like this is a typical candidate for post-training quantization.

from transformers import AutoFeatureExtractor, AutoModelForImageClassification

# Load the preprocessing pipeline (resizing, normalization) for ResNet-50.
extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")

# Load the pretrained FP32 ResNet-50 image classification model.
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")