
Computer Vision Dataset Profiling & Analysis: Course Overview


Welcome to the Course!

Welcome to “Computer Vision Dataset Profiling,” a comprehensive online course for data scientists and computer vision practitioners. Our primary objective is to empower you to analyze and understand your datasets effectively for better model design and training. We go beyond basic Exploratory Data Analysis (EDA) strategies and focus on teaching you to identify potential issues within your datasets, avoid common pitfalls, and improve your overall model design and training process. 

In this course, you will gain insight into the close relationship between your datasets and the models you train, and you will learn how careful dataset profiling can substantially improve the outcomes of your computer vision tasks.

You will also receive an introduction to two cutting-edge tools. The first is DataGradients, a free, open-source computer vision dataset profiler. The second is AutoNAC, an algorithmic engine that uses Neural Architecture Search to generate an optimal architecture for your dataset profile, your target hardware, and your accuracy and latency requirements.

What to Expect

This course is divided into five core units. The first three units focus on three sets of features: image features, object detection dataset features, and semantic segmentation dataset features, respectively. Each lesson provides an in-depth examination of a particular feature, illuminating its significance in model training, presenting the associated computations, and delving into potential issues and how to solve them. 
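As a rough illustration of the kind of computation such a lesson might present, the sketch below profiles a toy object detection dataset: it counts objects per class, averages the number of boxes per image, and measures box area relative to image area. The annotation format, the function name `profile_detection_dataset`, and the sample values are assumptions made up for this example; they are not taken from the course material or from DataGradients.

```python
from collections import Counter
from statistics import mean


def profile_detection_dataset(samples):
    """Compute simple dataset-level statistics from detection annotations.

    `samples` uses a hypothetical format: a list of dicts with the image
    `width` and `height`, and `boxes` given as
    (class_name, x_min, y_min, x_max, y_max) tuples in pixel coordinates.
    """
    class_counts = Counter()
    relative_areas = []    # each box area as a fraction of its image area
    boxes_per_image = []

    for sample in samples:
        image_area = sample["width"] * sample["height"]
        boxes_per_image.append(len(sample["boxes"]))
        for class_name, x_min, y_min, x_max, y_max in sample["boxes"]:
            class_counts[class_name] += 1
            # Clamp at zero so malformed boxes do not yield negative areas.
            box_area = max(0, x_max - x_min) * max(0, y_max - y_min)
            relative_areas.append(box_area / image_area)

    return {
        "class_counts": dict(class_counts),
        "mean_boxes_per_image": mean(boxes_per_image) if boxes_per_image else 0.0,
        "mean_relative_box_area": mean(relative_areas) if relative_areas else 0.0,
    }


# Toy annotations, invented for illustration only.
toy_samples = [
    {"width": 100, "height": 100,
     "boxes": [("car", 0, 0, 50, 50), ("person", 10, 10, 20, 30)]},
    {"width": 200, "height": 100,
     "boxes": [("car", 0, 0, 100, 100)]},
    {"width": 100, "height": 100, "boxes": []},
]

report = profile_detection_dataset(toy_samples)
print(report)
```

Statistics like these can hint at issues such as class imbalance, images with no annotations, or a dataset dominated by very small objects, each of which may call for a different response during model design and training.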

In Unit 4, we will introduce DataGradients, a free, open-source dataset profiler recently released by Deci. DataGradients automatically profiles object detection and semantic segmentation datasets and outputs a report with actionable insights. We’ll discuss the types of dataset issues DataGradients helps identify, explain how to use the tool, and look at a sample report.

In the final unit, Unit 5, we will focus on the relationship between dataset characteristics and model design. We’ll discuss specific dataset characteristics and how they should guide your choice of model architecture. We’ll also briefly introduce AutoNAC and explain how it uses the features of your data, along with other parameters, to generate an optimal custom model architecture.

Meet the Course Authors

Ofri Masad

Ofri Masad, our course guide, is Head of AI at Deci. With extensive experience in computer vision, machine learning, and deep learning, he specializes in designing algorithms, structuring systems, and improving software. Ofri played a key role in developing both DataGradients and AutoNAC. He holds a Master's in Computer Science from Reichman University in Israel.

Louis Dupont

Alongside Ofri, we have Louis Dupont, a deep learning engineer at Deci. Louis contributes to DataGradients and is also part of the team working on SuperGradients, Deci’s open-source training library for PyTorch-based computer vision models. With a passion for deep learning and a hands-on approach, Louis brings practical insights to our course. He holds a Master's in Machine Learning from Tsinghua University in Beijing.

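from transformers import AutoFeatureExtractor, AutoModelForImageClassification

# Load the preprocessing configuration that matches the ResNet-50 checkpoint.
extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")

# Load the pretrained ResNet-50 image classification model.
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")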