Video conferencing (Semantic segmentation on Qualcomm hardware)
The Challenge
The customer sought to reduce the latency of a person segmentation model (“Stacked-Hourglass”) trained on Face Synthetics, a synthetic dataset of rendered facial images. The model powered a conference room application that was not meeting its real-time latency target on the customer’s chosen hardware, a Qualcomm® Snapdragon™ 888 board. The customer wanted to cut the model’s latency to the required level while preserving its accuracy.
The Solution
By leveraging the Deci platform, the customer generated a custom model architecture tailored to the use case and the Qualcomm board. The new segmentation model cut latency by 2.35x, from 11.6 ms to 4.94 ms. In addition, the model file size was reduced by 4.47x and the memory footprint by 22%, all while preserving the original model’s accuracy.
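As a quick sanity check, the speedup factor follows directly from the two latency measurements stated above (11.6 ms before, 4.94 ms after); the snippet below is only an arithmetic illustration, not part of the Deci platform:

```python
# Derive the latency speedup from the reported per-inference latencies.
baseline_ms = 11.6    # original model latency on the Snapdragon 888 board
optimized_ms = 4.94   # optimized model latency on the same board

speedup = baseline_ms / optimized_ms
print(f"Latency acceleration: {speedup:.2f}x")  # → Latency acceleration: 2.35x
```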
The Results
2.35x
Latency Acceleration
4.47x
Model Size Reduction
22%
Lower Memory Footprint
Use Deci’s Development Platform to:
Enable Real-Time Inference at the Edge
Improve latency and throughput, and reduce model size by up to 5x, while maintaining the model’s accuracy.
Process More Video Streams on Fewer Devices
Maximize hardware utilization and cost-efficiently scale your solution at the edge.
Deploy Your Models on Any Edge Device
Eliminate inference cloud compute cost and avoid data privacy issues by running your models directly on edge devices.
Talk to Our Experts
Tell us about your use case, needs, goals, and the obstacles in your way. We’ll show you how you can use the Deci platform to overcome them.
The Ultimate Guide to Inference Acceleration
of Deep Learning-Based Applications
Learn 12 inference acceleration techniques that you can immediately implement to improve the speed, efficiency, and accuracy of your existing AI models.