Course Content

Lesson 4.2: How to Use DataGradients

Are you interested in leveraging DataGradients to profile your computer vision datasets and improve your object detection and semantic segmentation model training and design? In this lesson, we’ll show you how easy it is to use. We’ll first walk you through the steps you’ll need to take to run and customize the tool. We’ll then introduce you to DataGradients’ output: a PDF report with a detailed analysis of the key features of your dataset.

Installation

To get started with DataGradients, simply pip-install the package with the following command:

pip install data-gradients

 

Running DataGradients

To run DataGradients on an object detection dataset, use the following code:

from data_gradients.managers.detection_manager import DetectionAnalysisManager

da = DetectionAnalysisManager(
    report_title="Testing Data-Gradients Detection",
    train_data=train_set,
    val_data=val_set,
    class_names=train_set.class_names,
)

da.run()

 

To run it on a semantic segmentation dataset, use this code:

from data_gradients.managers.segmentation_manager import SegmentationAnalysisManager

da = SegmentationAnalysisManager(
    report_title="Testing Data-Gradients",
    train_data=train_set,
    val_data=val_set,
    class_names=train_set.class_names,
)

da.run()

 

By default, DataGradients analyzes all features of your dataset. However, you can choose which features to focus on. The features are all defined in a configuration file, so you can exclude specific features simply by commenting them out, without digging into the Python code.

For object detection:

				
report_sections:
  - name: Image Features
    features:
      - SummaryStats
      - ImagesResolution
      - ImageColorDistribution
      - ImagesAverageBrightness
  - name: Object Detection Features
    features:
      - DetectionSampleVisualization:
          n_rows: 3
          n_cols: 4
          stack_splits_vertically: True
      - DetectionClassHeatmap:
          n_rows: 6
          n_cols: 2
          heatmap_shape: [200, 200]
      - DetectionBoundingBoxArea:
          topk: 30
          prioritization_mode: train_val_diff
      - DetectionBoundingBoxPerImageCount
      - DetectionBoundingBoxSize
      - DetectionClassFrequency:
          topk: 30
          prioritization_mode: train_val_diff
      - DetectionClassesPerImageCount:
          topk: 30
          prioritization_mode: train_val_diff
      - DetectionBoundingBoxIoU:
          num_bins: 10
          class_agnostic: true

				
			

 

For semantic segmentation:

				
report_sections:
  - name: Image Features
    features:
      - SummaryStats
      - ImagesResolution
      - ImageColorDistribution
      - ImagesAverageBrightness
  - name: Segmentation Features
    features:
      - SegmentationSampleVisualization:
          n_rows: 3
          n_cols: 3
          stack_splits_vertically: True
          stack_mask_vertically: True
      - SegmentationClassHeatmap:
          n_rows: 6
          n_cols: 2
          heatmap_shape: [200, 200]
      - SegmentationClassFrequency:
          topk: 30
          prioritization_mode: train_val_diff
      - SegmentationClassesPerImageCount:
          topk: 30
          prioritization_mode: train_val_diff
      - SegmentationComponentsPerImageCount
      - SegmentationBoundingBoxResolution
      - SegmentationBoundingBoxArea:
          topk: 30
          prioritization_mode: train_val_diff
      - SegmentationComponentsConvexity
      - SegmentationComponentsErosion

				
			

 

DataGradients supports a variety of datasets. Its dataset adapter works with any dataset whose __getitem__() returns data in one of the following types:

numpy.ndarray

PIL.Image

Python dictionary (Mapping)

Python list / tuple (Sequence)

 

The adapter then outputs a tuple of (images, labels). If your dataset returns a dictionary, you’ll be asked to indicate which entries correspond to the image and the label.
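
For illustration, here’s a minimal sketch of a dictionary-returning dataset. The class and its field names (image, annotation, bbox, and so on) are hypothetical and simply mirror the structure used in the interactive example below:

import numpy as np
from torch.utils.data import Dataset


class DictDetectionDataset(Dataset):
    """Toy dataset whose __getitem__() returns a dictionary (hypothetical structure)."""

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        # Random image in (H, W, C); DataGradients will ask which entry is the image.
        image = np.random.randint(0, 255, size=(480, 640, 3), dtype=np.uint8)
        # One bounding box per image: [class_id, x1, y1, x2, y2]
        bbox = np.array([[6.0, 156.0, 97.0, 351.0, 270.0]])
        return {
            "image": image,
            "annotation": {"bbox": bbox, "image_id": idx, "segmentation": np.zeros((480, 640))},
        }

Because DataGradients cannot tell on its own which dictionary entry is the image and which is the label, it prompts you interactively.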

Here’s an example of the full flow of questions:

				
------------------------------------------------------------------------
Which tensor represents your Image(s) ?
------------------------------------------------------------------------
This is how your data is structured: 
data = {
    "image": "ndarray",
    "annotation": {
        "bbox": "ndarray",
        "image_id": "int",
        "segmentation": "ndarray"
    }
}

Options:
[0] | data.image: ndarray
[1] | data.annotation.bbox: ndarray
[2] | data.annotation.image_id: int
[3] | data.annotation.segmentation: ndarray

Your selection (Enter the corresponding number) >>> 0
Great! You chose: data.image: ndarray


------------------------------------------------------------------------
Which tensor represents your Label(s) ?
------------------------------------------------------------------------
This is how your data is structured: 
data = {
    "image": "ndarray",
    "annotation": {
        "bbox": "ndarray",
        "image_id": "int",
        "segmentation": "ndarray"
    }
}

Options:
[0] | data.image: ndarray
[1] | data.annotation.bbox: ndarray
[2] | data.annotation.image_id: int
[3] | data.annotation.segmentation: ndarray

Your selection (Enter the corresponding number) >>> 1
Great! You chose: data.annotation.bbox: ndarray


------------------------------------------------------------------------
Which comes first in your annotations, the class id or the bounding box?
------------------------------------------------------------------------
Here's a sample of how your labels look like:
Each line corresponds to a bounding box.
tensor([[  6., 156.,  97., 351., 270.]], dtype=torch.float64)

Options:
[0] | Label comes first (e.g. [class_id, x1, y1, x2, y2])
[1] | Bounding box comes first (e.g. [x1, y1, x2, y2, class_id])

Your selection (Enter the corresponding number) >>> 0
Great! You chose: Label comes first (e.g. [class_id, x1, y1, x2, y2])


------------------------------------------------------------------------
What is the bounding box format?
------------------------------------------------------------------------
Here's a sample of how your labels look like:
Each line corresponds to a bounding box.
tensor([[  6., 156.,  97., 351., 270.]], dtype=torch.float64)

Options:
[0] | xyxy: x-left, y-top, x-right, y-bottom		(Pascal-VOC format)
[1] | xywh: x-left, y-top, width, height			(COCO format)
[2] | cxcywh: x-center, y-center, width, height		(YOLO format)

Your selection (Enter the corresponding number) >>> 2
Great! You chose: cxcywh: x-center, y-center, width, height		(YOLO format)

				
			

If you’re using DataGradients to analyze a dataset for semantic segmentation, it will automatically convert your images into (3, H, W) and your labels into (C, H, W). However, if you’re using it for object detection, a couple of formatting choices will require your input:

Label first vs. Label last

Bbox format (xyxy, xywh, cxcywh)

Other actions are handled automatically:

  • Images will be automatically formatted into (3, H, W)
  • Labels will be converted to absolute (non-normalized) pixel coordinates if necessary
  • Labels will be automatically formatted into (label, x1, y1, x2, y2)
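
To make the three bounding-box formats concrete, here is a small, self-contained sketch (plain NumPy, not part of the DataGradients API) that converts cxcywh and xywh boxes into the xyxy layout:

import numpy as np


def cxcywh_to_xyxy(boxes):
    """Convert boxes from (x-center, y-center, width, height) to (x1, y1, x2, y2)."""
    cx, cy, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)


def xywh_to_xyxy(boxes):
    """Convert boxes from (x-left, y-top, width, height) to (x1, y1, x2, y2)."""
    x, y, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    return np.stack([x, y, x + w, y + h], axis=1)


# A YOLO-style (cxcywh) box becomes a Pascal-VOC-style (xyxy) box:
print(cxcywh_to_xyxy(np.array([[320.0, 240.0, 100.0, 80.0]])))  # [[270. 200. 370. 280.]]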

How long DataGradients takes to run depends on the size of your dataset. You can expect a completion time of between 1 and 10 minutes for smaller datasets, while the process may take several hours for larger, more extensive datasets. DataGradients also lets you limit the number of samples used, so you can manage the running time as needed.
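
DataGradients exposes its own option for capping the number of samples (see the library’s documentation for the exact parameter). As an alternative illustration, here is a minimal sketch that subsamples the data before handing it to the analysis manager, assuming train_set and val_set are PyTorch-style datasets; the Subset wrapper and the limit of 1,000 samples are just example choices:

from torch.utils.data import Subset

from data_gradients.managers.detection_manager import DetectionAnalysisManager

# Keep only the first 1,000 samples of each split to shorten the analysis run.
max_samples = 1000
small_train_set = Subset(train_set, range(min(max_samples, len(train_set))))
small_val_set = Subset(val_set, range(min(max_samples, len(val_set))))

da = DetectionAnalysisManager(
    report_title="Quick Data-Gradients pass on a subsample",
    train_data=small_train_set,
    val_data=small_val_set,
    class_names=train_set.class_names,
)

da.run()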

The DataGradients Report 

Upon completion of its analysis, DataGradients generates two types of outputs:

  1. A comprehensive PDF report that includes statistics, graphs, and visualizations of the dataset features you’ve chosen for examination.
  2. A JSON file devoid of any original data from your dataset but filled with all the extracted raw metadata (a quick loading sketch follows below).
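
As a quick illustration, the JSON metadata can be loaded with Python’s standard library and inspected programmatically. The path below is an assumption; the exact filename and location depend on where your DataGradients run writes its outputs:

import json
from pathlib import Path

# Hypothetical path: point this at the JSON file produced by your DataGradients run.
metadata_path = Path("logs/Testing Data-Gradients Detection/summary.json")

with metadata_path.open() as f:
    metadata = json.load(f)

# Print the top-level keys to see which features were extracted.
print(list(metadata.keys()))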


The standard report examines all the essential dataset attributes for your specific task and offers a comprehensive view of your data. It gives you actionable insights, highlighting potential issues and distinctive characteristics of your dataset.

Are you interested in delving deeper into DataGradients’ reports? We invite you to examine the reports created by DataGradients for two widely used datasets: COCO for object detection and Cityscapes for semantic segmentation.

 
