Quantization

In deep learning, quantization is the process of replacing floating-point weights and/or activations with compact, low-precision representations such as 8-bit integers. This reduces a network's memory footprint and the computational cost of inference, which is especially valuable for edge deployments. Quantization is one of several optimization techniques for shrinking neural networks while preserving as much accuracy as possible.
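
As a minimal sketch of the idea, the example below applies PyTorch's dynamic post-training quantization to a small, hypothetical fully connected model; the model definition, layer sizes, and input shape are assumptions chosen only for illustration.

import torch
import torch.nn as nn

# A small full-precision (FP32) model, used only for illustration.
model_fp32 = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Dynamic post-training quantization: weights of the selected layer types
# are stored as 8-bit integers, and activations are quantized on the fly
# at inference time.
model_int8 = torch.ao.quantization.quantize_dynamic(
    model_fp32,
    {nn.Linear},        # layer types to quantize
    dtype=torch.qint8,  # 8-bit signed integer weights
)

# Inference works as before, but the linear layers now use int8 kernels.
x = torch.randn(1, 128)
print(model_int8(x).shape)  # torch.Size([1, 10])

Dynamic quantization is only one flavor: static post-training quantization and quantization-aware training trade extra calibration or training effort for better accuracy at low precision.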

For reference, the snippet below loads a pretrained, full-precision ResNet-50 image classifier and its feature extractor from the Hugging Face Hub; a full-precision model like this is a typical candidate for post-training quantization.

from transformers import AutoFeatureExtractor, AutoModelForImageClassification

# Load the preprocessing pipeline (resizing, normalization) for ResNet-50.
extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")

# Load the pretrained FP32 ResNet-50 image classification model.
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")