Description

YOLO-NAS-Sat is a small object detection model, pre-trained on COCO and fine-tuned on DOTA 2.0.

Publishers
Deci AI Team

Submitted Version
February 22, 2024

Latest Version
N/A 

Size
N/A 

Small Object Detection

Overview


YOLO-NAS-Sat is a small object detection model.

Model Highlights

  • Task: Small Object Detection
  • Model type: Deep Neural Network
  • Framework: PyTorch
  • Dataset: Pre-trained on COCO, fine-tuned on DOTA 2.0

Model Architecture

Building on the solid foundation of YOLO-NAS, renowned for its performance on standard object detection, YOLO-NAS-Sat tackles the specific challenge of pinpointing small objects. While retaining the YOLO-NAS core, we’ve implemented key changes to sharpen its focus on small objects:

  • Backbone Modifications: The number of layers in the backbone has been adjusted to optimize the processing of small objects, enhancing the model’s ability to discern minute details.
  • Revamped Neck Design: A newly designed neck, inspired by the U-Net-style decoder, retains more fine-grained detail. This adaptation is crucial for preserving the high-resolution feature maps that are vital for detecting small objects.
  • Context Module Adjustment: The original “context” module in YOLO-NAS, intended to capture global context, has been replaced. We discovered that for tasks like processing large satellite images, a local receptive window is more beneficial, improving both accuracy and network latency.

These architectural innovations ensure that YOLO-NAS-Sat is uniquely equipped to handle the intricacies of small object detection, offering an unparalleled accuracy-speed trade-off.
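To make the neck idea concrete, below is a minimal, illustrative PyTorch sketch of a U-Net-style fusion step: a coarse feature map is upsampled and concatenated with a finer one so that high-resolution detail is preserved. The module name, channel sizes, and layer choices are assumptions for illustration only, not the actual YOLO-NAS-Sat neck.

```python
import torch
import torch.nn as nn

class UNetStyleFusion(nn.Module):
    """Illustrative U-Net-style neck block: upsample a coarse feature map
    and fuse it with a finer one to retain small-object detail."""

    def __init__(self, coarse_ch: int, fine_ch: int, out_ch: int):
        super().__init__()
        self.reduce = nn.Conv2d(coarse_ch, out_ch, kernel_size=1)
        self.upsample = nn.Upsample(scale_factor=2, mode="nearest")
        self.fuse = nn.Conv2d(out_ch + fine_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, coarse: torch.Tensor, fine: torch.Tensor) -> torch.Tensor:
        x = self.upsample(self.reduce(coarse))  # bring the coarse map to the fine resolution
        x = torch.cat([x, fine], dim=1)         # keep high-resolution detail from the fine map
        return self.fuse(x)

# Example: fuse a 20x20 coarse map with a 40x40 fine map (shapes are illustrative).
fused = UNetStyleFusion(512, 256, 256)(torch.randn(1, 512, 20, 20), torch.randn(1, 256, 40, 40))
```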

YOLO-NAS-Sat offers four distinct size variants, each tailored to different computational budgets and performance requirements:

 

| Model | Number of Parameters (millions) | mAP@0.50 | Latency, AGX Orin (ms, excluding IO) | Latency, NX Orin (ms, excluding IO) |
|---|---|---|---|---|
| YOLO-NAS-Sat-S | 15.2 | 56.4 | 4.48 | 15.99 |
| YOLO-NAS-Sat-M | 17.7 | 58.21 | 5.7 | 21.01 |
| YOLO-NAS-Sat-L | 39.8 | 62.14 | 10.08 | 38.40 |
| YOLO-NAS-Sat-X | 40.3 | 63.38 | 14.3 | 49.34 |

Expected Input

The expected input of the YOLO-NAS-Sat model is an RGB image of fixed size. The image is typically preprocessed by resizing it to the model’s input resolution and normalizing its pixel values to the range [0, 1].
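A minimal preprocessing sketch is shown below. The 640×640 input size and the helper name are assumptions for illustration; the actual input resolution depends on the chosen variant and training configuration.

```python
import numpy as np
import torch
from PIL import Image

def preprocess(image_path: str, input_size: int = 640) -> torch.Tensor:
    """Resize an RGB image to the model's fixed input size and scale pixels to [0, 1]."""
    image = Image.open(image_path).convert("RGB")
    image = image.resize((input_size, input_size))
    array = np.asarray(image, dtype=np.float32) / 255.0  # normalize to [0, 1]
    tensor = torch.from_numpy(array).permute(2, 0, 1)    # HWC -> CHW
    return tensor.unsqueeze(0)                           # add batch dimension
```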

Expected Output

The expected output of the YOLO-NAS-Sat model is bounding boxes and confidence scores for detected objects.
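The exact output container depends on the inference library used to run the model. As an illustration only, the sketch below assumes the detections have already been unpacked into parallel lists of boxes and scores, and applies a simple confidence-threshold filter; the function name and default threshold are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

def filter_detections(
    boxes: List[Box],
    scores: List[float],
    conf_threshold: float = 0.25,
) -> List[Tuple[Box, float]]:
    """Keep only the detections whose confidence score exceeds the threshold."""
    return [(box, score) for box, score in zip(boxes, scores) if score >= conf_threshold]

# Example: a single detection that passes the threshold.
kept = filter_detections([(10.0, 12.0, 48.0, 55.0)], [0.87])
```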

History and Applications

While YOLO-NAS-Sat excels at satellite imagery analysis, its specialized architecture is also well suited to a wide range of applications involving other types of imagery:

  • Satellite Images: Used for environmental monitoring, urban development tracking, agricultural assessment, and military surveillance.
  • Microscopic Images: Essential in medical research for detecting cells, bacteria, and other microorganisms, as well as in material science.
  • Radar Images: Applied in meteorology for weather prediction, in aviation for aircraft navigation, and in maritime for ship detection.
  • Thermal Images: Thermal imaging finds applications in a variety of fields, including security surveillance, wildlife monitoring, industrial maintenance, and building and energy audits. The unique information provided by thermal images, especially at night or in low-visibility conditions, underscores their importance and widespread use.