DEKR is a pose estimation model pretrained on the COCO 2017 and CrowdPose datasets.
It was introduced by Zigang Geng, Ke Sun, Bin Xiao, Zhaoxiang Zhang, and Jingdong Wang in the paper “Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression”.
April 6, 2021
In disentangled keypoint regression (DEKR), the feature maps output by the backbone are divided into partitions, one per keypoint. Each branch takes one partition, learns the representation for its keypoint through two adaptive convolutions, and regresses that keypoint's 2D offset with its own 1×1 convolution. The figure below illustrates the case of three keypoints: the feature maps are divided into three partitions, each fed into one branch. In the experiments on COCO pose estimation, the feature maps are divided into 17 partitions, with 17 branches regressing the 17 keypoints.
DEKR adopts multi-branch parallel adaptive convolutions to learn disentangled representations for keypoint regression, so that each representation focuses on its corresponding keypoint region.
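The partition-and-branch structure described above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions, not the authors' implementation: the two adaptive convolutions in each branch are left out, so each branch is reduced to its final 1×1 convolution, and the names split_channels, conv1x1, and regress_offsets are hypothetical helpers invented for this sketch. Feature maps are nested lists indexed as [channel][row][column].

```python
import random

def split_channels(features, num_keypoints):
    """Split a list of C channel maps into num_keypoints equal partitions."""
    channels = len(features)
    if channels % num_keypoints != 0:
        raise ValueError("channel count must be divisible by the number of keypoints")
    size = channels // num_keypoints
    return [features[i * size:(i + 1) * size] for i in range(num_keypoints)]

def conv1x1(partition, weights):
    """1x1 convolution: each output channel is a per-pixel weighted sum of the input channels.

    partition: list of C_in maps, each H x W.
    weights:   list of C_out rows, each holding C_in floats.
    """
    height, width = len(partition[0]), len(partition[0][0])
    return [
        [[sum(w * ch[y][x] for w, ch in zip(row, partition)) for x in range(width)]
         for y in range(height)]
        for row in weights
    ]

def regress_offsets(features, branch_weights):
    """One branch per keypoint; each branch sees only its own partition
    and outputs a 2-channel (dx, dy) offset map."""
    partitions = split_channels(features, len(branch_weights))
    return [conv1x1(p, w) for p, w in zip(partitions, branch_weights)]

# Toy setup (assumed sizes): 3 keypoints, 6 backbone channels, a 4x5 feature map.
random.seed(0)
K, C, H, W = 3, 6, 4, 5
features = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
branch_weights = [
    [[random.uniform(-1, 1) for _ in range(C // K)] for _ in range(2)]
    for _ in range(K)
]
offsets = regress_offsets(features, branch_weights)
```

Here offsets holds one entry per keypoint, each a pair of H×W maps for the horizontal and vertical offset. Because every branch reads only its own slice of channels, changing the features in one partition leaves the other keypoints' offsets untouched, which is the sense in which the representations are disentangled.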
The DEKR model takes an image as input and predicts the pose of every person in the image, where each pose consists of K keypoints, such as the shoulders and elbows.
DEKR outputs a map holding a regressed pose at each position, along with the keypoint and center heatmaps.
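To make these outputs concrete, the following sketch shows one way a single pose could be read out of them. It assumes that the regressed pose at a position is stored as per-keypoint (dx, dy) offsets measured from that position, and that a person is located at the strongest response in the center heatmap. The function decode_pose is a hypothetical helper; a full pipeline would also handle multiple people and could use the keypoint heatmaps to score candidate poses, both of which are omitted here.

```python
def decode_pose(center_heatmap, offset_maps):
    """Return (center, score, keypoints) for the strongest center response.

    center_heatmap: H x W nested list of scores.
    offset_maps:    K entries, each a (dx_map, dy_map) pair of H x W nested lists.
    """
    # Find the pixel with the highest center score.
    best_score, cx, cy = max(
        (score, x, y)
        for y, row in enumerate(center_heatmap)
        for x, score in enumerate(row)
    )
    # Each keypoint is the center position plus the offset regressed at that position.
    keypoints = [(cx + dx[cy][cx], cy + dy[cy][cx]) for dx, dy in offset_maps]
    return (cx, cy), best_score, keypoints

# Toy example: a 3x4 center heatmap with its peak at x=2, y=1.
center_heatmap = [
    [0.0, 0.1, 0.2, 0.0],
    [0.1, 0.3, 0.9, 0.2],
    [0.0, 0.1, 0.2, 0.0],
]

def constant_map(value, height=3, width=4):
    """Build an H x W map filled with one value (toy stand-in for a regressed offset map)."""
    return [[value] * width for _ in range(height)]

# Two keypoints with constant offsets, purely for illustration.
offset_maps = [
    (constant_map(-1.0), constant_map(-0.5)),
    (constant_map(1.5), constant_map(0.5)),
]

center, score, keypoints = decode_pose(center_heatmap, offset_maps)
print(center, score, keypoints)  # (2, 1) 0.9 [(1.0, 0.5), (3.5, 1.5)]
```

Taking only the single best center keeps the sketch short; extending it to several people would mean keeping every sufficiently strong local maximum of the center heatmap instead of one global maximum.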
Human pose estimation identifies and classifies the poses of human body parts and joints in images or videos. Approaches fall into two primary categories: bottom-up and top-down. Bottom-up methods detect each body joint first and then group the joints into individual poses. Top-down methods run a person detector first and then locate the body joints within each detected bounding box. Several libraries are available for human pose estimation, including OpenPose, DensePose, AlphaPose, and HRNet.
Some real-world applications of pose estimation include:
from transformers import AutoFeatureExtractor, AutoModelForImageClassification

# Load the preprocessor and the image-classification model for the microsoft/resnet-50 checkpoint
extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")