Vision Transformer

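A vision transformer (ViT) is a deep learning model that splits an input image into a sequence of fixed-size patches and processes that sequence with a transformer, the architecture originally developed for natural language processing. Vision transformers are often used for image recognition and other image processing tasks, including object detection, image segmentation, cluster analysis, and anomaly detection.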
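The patch-splitting step can be sketched in a few lines of NumPy. This is a minimal illustration rather than any particular library's implementation: the function name `image_to_patches`, the 224x224 input, and the 16-pixel patch size are assumptions chosen for the example. In a full model, each flattened patch would then be linearly projected and combined with a position embedding before entering the transformer.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into a sequence of flattened patches.

    Hypothetical helper for illustration; patches are returned in
    raster order (left to right, top to bottom).
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    grid = image.reshape(rows, patch_size, cols, patch_size, c)
    grid = grid.transpose(0, 2, 1, 3, 4)
    # One row per patch, each flattened to patch_size * patch_size * C values.
    return grid.reshape(rows * cols, patch_size * patch_size * c)


# Assumed example input: a blank 224x224 RGB image and 16x16 patches.
image = np.zeros((224, 224, 3), dtype=np.float32)
patches = image_to_patches(image, 16)
print(patches.shape)  # (196, 768): 14 x 14 patches, each 16 * 16 * 3 values
```

The sequence length grows with the square of the image side divided by the patch size, so smaller patches give the transformer a longer sequence to process.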

Vision transformers offer a simple, scalable architecture with strong modeling capacity for image classification. Compared with convolutional neural networks, they can achieve better performance when trained on large datasets.

from transformers import AutoImageProcessor, AutoModelForImageClassification

# Load a pretrained vision transformer and its matching image preprocessor.
checkpoint = "google/vit-base-patch16-224"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForImageClassification.from_pretrained(checkpoint)