YOLO-NAS Pose offers a superior latency-accuracy balance compared to YOLOv8 Pose. Specifically, the medium-sized version, YOLO-NAS Pose M, outperforms the large YOLOv8 variant with a 38.85% reduction in latency on a 4th Gen Intel Xeon CPU, all while achieving a 0.27 boost in AP@0.5 score.

DeciDiffusion 1.0 is an 820 million parameter text-to-image latent diffusion model trained on the LAION-v2 dataset and fine-tuned on the LAION-ART dataset.

DeciLM 6B is a 5.7 billion parameter decoder-only text generation model. It outpaces pretrained models in its class, with a throughput up to 15 times that of Llama 2 7B.

DeciCoder 1B is a 1 billion parameter decoder-only code completion model trained on the Python, Java, and JavaScript subsets of the StarCoder training dataset.

YOLO-NAS is a groundbreaking object detection foundation model pre-trained on prominent datasets such as COCO and Objects365, and evaluated on the COCO and RF100 datasets.


Dive into Google’s T5, a powerful Text-to-Text Transformer model. Understand its capabilities, applications, and how to use it efficiently.

DEKR is a pose estimation model pretrained on the COCO 2017 and CrowdPose datasets. It was introduced on April 6, 2021, in the paper titled “Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression” by Zigang Geng, Ke Sun, Bin Xiao, Zhaoxiang Zhang, and Jingdong Wang.

YOLOX is an object detection model that was introduced on August 6, 2021, in the paper titled “YOLOX: Exceeding YOLO Series in 2021.”


Vision Transformer (ViT) is an approach to image classification that splits an image into patches and uses self-attention to capture long-range dependencies between them.
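To make the patch idea concrete, here is a minimal sketch in plain Python. It is not the ViT implementation: the helper names `split_into_patches` and `attention_weights` are hypothetical, the 4×4 "image" is toy data, and the learned query/key projections of a real model are omitted (queries and keys are simply the raw patch vectors).

```python
import math

def split_into_patches(image, patch):
    """Split a 2D grid (list of rows) into flattened, non-overlapping tiles.

    Assumes the height and width are both divisible by `patch`.
    """
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches

def attention_weights(tokens):
    """Scaled dot-product attention weights with Q = K = tokens.

    Simplification: a real model applies learned projections first.
    """
    d = len(tokens[0])
    weights = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)                      # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy 4x4 single-channel image, split into four 2x2 patches
image = [[(r * 4 + c) / 16 for c in range(4)] for r in range(4)]
patches = split_into_patches(image, 2)
weights = attention_weights(patches)
```

Each row of `weights` is a probability distribution over all patches, so every patch can draw on every other patch regardless of how far apart they sit in the image.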

EfficientNet is a convolutional neural network (CNN) architecture pre-trained on the CIFAR-10, CIFAR-100, Birdsnap, Stanford Cars, Flowers, FGVC Aircraft, Oxford-IIIT Pets, and Food-101 datasets.

ResNet is an image classification model pre-trained on the ImageNet-1k dataset at a resolution of 224×224.

PP-LiteSeg is a lightweight real-time semantic segmentation model. Its modified encoder-decoder architecture incorporates three novel modules: the Flexible and Lightweight Decoder (FLD), the Unified Attention Fusion Module (UAFM), and the Simple Pyramid Pooling Module (SPPM).

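from transformers import AutoFeatureExtractor, AutoModelForImageClassification

# Preprocessor that resizes and normalizes images for this checkpoint
extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50")

# ResNet-50 with an image classification head
model = AutoModelForImageClassification.from_pretrained("microsoft/resnet-50")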
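
Once the extractor and model are loaded, classifying an image could look roughly like the sketch below. It assumes PyTorch and Pillow are installed alongside transformers; the helper names `top_label` and `classify_image` and the image path are hypothetical and are not part of any library.

```python
def top_label(logits, id2label):
    """Return the label whose logit is largest."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best]

def classify_image(path, checkpoint="microsoft/resnet-50"):
    """Load the checkpoint and predict a label for the image at `path`."""
    # Heavy dependencies are imported here so they are only needed when called
    import torch
    from PIL import Image
    from transformers import AutoFeatureExtractor, AutoModelForImageClassification

    extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
    model = AutoModelForImageClassification.from_pretrained(checkpoint)
    model.eval()

    image = Image.open(path).convert("RGB")
    inputs = extractor(images=image, return_tensors="pt")
    with torch.no_grad():                    # inference only, no gradients
        logits = model(**inputs).logits      # shape: (1, num_classes)
    return top_label(logits[0].tolist(), model.config.id2label)

# Example call (hypothetical path):
# print(classify_image("cat.jpg"))
```

The first call downloads the checkpoint, so it needs network access; later calls reuse the local cache.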