Production-Aware Optimization

In deep learning, production-aware optimization means accounting for production constraints and requirements, such as latency, model size, and accuracy, throughout the development process rather than only at deployment time. Designing the neural network for the target inference hardware and production environment from the start makes it more likely that the model can be deployed successfully.
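As a rough illustration of treating production constraints as selection criteria, the sketch below filters candidate models by a latency and size budget and then picks the most accurate one that fits. The `Candidate` and `Budget` types, the candidate names, and all numbers are hypothetical placeholders, not measurements.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A hypothetical model candidate with metrics measured on the target hardware."""
    name: str
    latency_ms: float  # inference latency per request
    size_mb: float     # serialized model size
    accuracy: float    # validation accuracy in [0, 1]


@dataclass(frozen=True)
class Budget:
    """Hypothetical production limits."""
    max_latency_ms: float
    max_size_mb: float


def pick_model(candidates: Sequence[Candidate], budget: Budget) -> Optional[Candidate]:
    """Return the most accurate candidate that fits the budget, or None if none fits."""
    feasible = [
        c for c in candidates
        if c.latency_ms <= budget.max_latency_ms and c.size_mb <= budget.max_size_mb
    ]
    return max(feasible, key=lambda c: c.accuracy, default=None)


# Illustrative numbers only, not real benchmark results.
candidates = [
    Candidate("large", latency_ms=42.0, size_mb=310.0, accuracy=0.82),
    Candidate("medium", latency_ms=18.0, size_mb=98.0, accuracy=0.79),
    Candidate("small", latency_ms=6.0, size_mb=21.0, accuracy=0.71),
]
budget = Budget(max_latency_ms=20.0, max_size_mb=100.0)
chosen = pick_model(candidates, budget)
print(chosen.name if chosen else "no candidate fits the budget")
```

Here the most accurate candidate is rejected because it exceeds both limits, which is the kind of trade-off that is cheaper to discover during development than after deployment.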


from transformers import AutoFeatureExtractor, AutoModelForImageClassification

# Load the image preprocessor and the pretrained ResNet-50 classifier from the Hugging Face Hub
extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")
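Since latency is one of the production constraints named above, a minimal timing helper might look like the following sketch. It uses only the Python standard library; `measure_latency_ms`, its `warmup` and `runs` parameters, and the stand-in workload are assumptions made for illustration, not part of any library.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency_ms(fn: Callable[[], object], warmup: int = 5, runs: int = 30) -> Dict[str, float]:
    """Time repeated calls to fn and return summary statistics in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        # Nearest-rank style estimate of the 95th percentile
        "p95": samples[min(len(samples) - 1, int(0.95 * len(samples)))],
    }


# Stand-in workload; replace it with a call that runs the model on a preprocessed input.
stats = measure_latency_ms(lambda: sum(i * i for i in range(10_000)))
print(stats)
```

With the model loaded above, `fn` could be something like `lambda: model(**inputs)`, where `inputs` comes from `extractor(images=image, return_tensors="pt")` and the call runs under `torch.no_grad()`. On a GPU, execution is asynchronous, so the device generally needs to be synchronized before the clock is read or the numbers will be misleading.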
