DeciNets for Edge: Enabling Real-time Latency on Constrained Hardware
- Today, more and more deep learning-based applications run inference at the edge. The main drivers are real-time latency for mission-critical tasks and a better user experience.
- Deep learning inference on edge devices poses several challenges. First, the AI computing capabilities of these devices are limited and often cannot keep up with the rapid evolution of models. Moreover, companies serve clients with a wide variety of devices, and a 'one size fits all' model strategy doesn't always work. Models need to be hardware-aware.
- DeciNets for Edge is an example of how a more efficient model can solve the deep learning inference problem on edge devices. In this example, DeciNets was optimized for the NVIDIA Jetson Xavier NX edge device, offering the best accuracy-latency tradeoff against known state-of-the-art classification models.
DeciNets for Cloud: Scale Deep Learning at a Fraction of the Cost
- Deep learning inference on the cloud, whether in a private or public cloud, offers companies almost unlimited scaling of their workloads. Many companies prefer running their deep learning models in the cloud because of the high flexibility and more standardized infrastructure it offers.
- One of the main challenges of deep learning inference in the cloud is cost. Public cloud usage can add up to a huge monthly bill, and maintaining a private data center, while keeping its hardware up to date enough to run deep learning workloads, can be very expensive.
- DeciNets is an example of how running a more efficient model for your task can substantially reduce your cloud or hardware infrastructure bill. Here, we present DeciNets optimized for the NVIDIA T4, a popular cloud GPU with excellent value for money. By using DeciNets, you can save up to 50% of your cloud compute costs!
DeciNets was Automatically Generated Using AutoNAC Technology
- DeciNets was discovered using Automated Neural Architecture Construction (AutoNAC), a groundbreaking technology that redesigns your deep learning model to squeeze the maximum utilization out of the hardware targeted for inference in production.
- AutoNAC is a powerful neural architecture search and design algorithm with full awareness of the data, the hardware, and the deep inference stack, including quantization and graph compilers (e.g., TensorRT for NVIDIA GPUs and OpenVINO for Intel hardware).
- AutoNAC operates approximately two orders of magnitude faster than other known NAS technologies and is the only commercially available NAS technology offering affordable neural architecture design that can handle any deep learning task and hardware.
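To make the idea of hardware-aware architecture search concrete, here is a minimal conceptual sketch of its core selection step: among candidate architectures, pick the most accurate one that meets a latency budget measured on the target inference hardware. All names and numbers are illustrative; this is not Deci's actual API or the AutoNAC algorithm itself.

```python
# Conceptual sketch: hardware-aware model selection under a latency budget.
# The candidate list and all metrics below are hypothetical, for illustration.
# Latencies would be measured on the actual target hardware (e.g., a T4 or
# Jetson Xavier NX), since the same model runs very differently per device.
candidates = [
    {"name": "arch_a", "accuracy": 0.792, "latency_ms": 14.1},
    {"name": "arch_b", "accuracy": 0.803, "latency_ms": 22.7},
    {"name": "arch_c", "accuracy": 0.781, "latency_ms": 9.8},
]

def select_architecture(candidates, latency_budget_ms):
    """Return the most accurate candidate whose measured latency fits the budget."""
    feasible = [c for c in candidates if c["latency_ms"] <= latency_budget_ms]
    if not feasible:
        raise ValueError("No candidate architecture meets the latency budget")
    return max(feasible, key=lambda c: c["accuracy"])

# With a 15 ms budget, arch_b is too slow, so the search settles on arch_a:
# slightly lower accuracy than arch_b, but it actually meets the constraint.
best = select_architecture(candidates, latency_budget_ms=15.0)
print(best["name"])
```

A real NAS engine explores an enormous space of architectures rather than a fixed list, but the optimization goal is the same shape: maximize accuracy subject to a performance constraint on the specific production hardware.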
Get Your Version of DeciNets Now
- DeciNets is only one example of the powerful deep neural networks AutoNAC can produce for any given computer vision task - whether in the cloud, at the edge, or on mobile.
- AutoNAC can be applied to your use case to generate the best model in terms of the accuracy-performance tradeoff, optimized for your production hardware.
- As input, the AutoNAC engine takes the deep learning task, the dataset, and the target hardware. It then produces the architecture that best meets the optimization goal on the specific inference hardware. Contact us today to learn more.