Speed Up Inference With a Click of a Button

Deci’s runtime optimization tool simplifies inference acceleration. In a matter of minutes, your model is automatically compiled and quantized for your target inference hardware and is ready for production.
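To illustrate what "quantized" means here, below is a minimal, hedged sketch of the core arithmetic behind symmetric INT8 post-training quantization — mapping float weights onto 8-bit integers with a scale factor. This is a pure-Python illustration of the general technique, not Deci's implementation; production toolchains apply it per tensor or per channel with calibration data.

```python
# Illustrative symmetric INT8 quantization (not Deci's API):
# floats are mapped to [-127, 127] via a single scale factor.

def quantize_int8(values):
    """Map float values to int8 range [-127, 127] with one scale factor."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the int8 representation."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)   # q -> [50, -127, 3, 100]
approx = dequantize(q, scale)       # close to the original weights
```

The speedup in practice comes from running the integer representation on hardware with fast INT8 paths; the trade-off is the small rounding error visible in `approx`, which is why quantization tools validate accuracy after conversion.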

Boost Performance with Simplified Runtime Optimization

Automatically compile and quantize your models with best-of-breed compilers, and quickly evaluate different production settings.

Give Your Models The Performance They Deserve

What Compilers Are Supported?

Benchmark Before You Optimize

Easily Find The Best Hardware For The Job

Benchmark your models’ expected inference performance across multiple hardware types on Deci’s online hardware fleet. Get actionable insights into the ideal hardware and production settings.

Measure Performance Before You Train

Easily compare the performance of various models on your target inference hardware using Deci’s online hardware fleet, with a single click. Confirm that a model runs well on your target hardware before you train it.
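The measurement behind this kind of comparison can be sketched in a few lines: time repeated inference calls after a warmup phase, then report median latency and throughput. The snippet below is a generic, stdlib-only illustration of that methodology — `dummy` is a hypothetical stand-in for a real model call, and none of this reflects Deci's internal benchmarking code.

```python
# Generic latency/throughput benchmark loop (illustrative, not Deci's API).
import statistics
import time

def benchmark(model, inputs, warmup=10, runs=100):
    """Return (median latency in ms, throughput in inferences/sec)."""
    for _ in range(warmup):          # warm caches/JIT before timing
        model(inputs)
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        model(inputs)
        latencies.append((time.perf_counter() - start) * 1000.0)
    median_ms = statistics.median(latencies)
    return median_ms, 1000.0 / median_ms

# Hypothetical stand-in "model": a cheap numeric reduction.
dummy = lambda xs: sum(x * x for x in xs)
median_ms, throughput = benchmark(dummy, [0.1] * 1000)
```

Using the median rather than the mean makes the number robust to scheduler jitter, and the warmup runs keep one-time startup cost out of the measurement — both standard practice when comparing models or hardware targets.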

“At Adobe, we are committed to redefining the possibilities of digital experiences and delivering excellent AI-based solutions across a wide range of cloud and edge environments. By using Deci, we significantly shortened our time to market and transitioned inference workloads from cloud to edge devices. As a result, we improved the user experience and dramatically reduced our cloud inference costs.”

Pallav Vyas
Senior Engineering Manager, Document AI & Innovation at Adobe

“Our advanced text-to-video solution is powered by proprietary and complex generative AI algorithms. Deci allows us to reduce our cloud computing costs and improve our user experience with faster time to video by accelerating our models’ inference performance and maximizing GPU utilization on the cloud.”

Lior Hakim
Co-Founder & CTO at HourOne

“Applied Materials is at the forefront of materials engineering solutions and leverages AI to deliver best-in-class products. We have been working with Deci on optimizing the performance of our AI model, and managed to reduce its GPU inference time by 33%. This was done on an architecture that was already optimized. We will continue using the Deci platform to build more powerful AI models to increase our inspection and production capacity with better accuracy and higher throughput.”

Amir Bar
Head of SW and Algorithm, Applied Materials

“Deci’s platform is suitable for both training and inference modes. Deci has advanced innovation in the search for optimal neural network architectures. The solution excels in every area of our assessment.”

Michael Azoff
Chief Analyst, Kisaco Research

“With Deci, we increased the inference throughput of one of our multiclass classification models running on V100 machines by 2.6x – without losing accuracy. Deci can cut 50% off the inference costs in our cloud deployments worldwide.”

Chaim Linhart
CTO and Co-Founder, IBEX Medical Analysis

“The classification model I uploaded and integrated using Infery achieved a 33% performance boost, which is very cool for 10 minutes of work!”

Amir Zait
Algorithm Developer, Sight Diagnostics

“Deci delivers optimized deep learning inference on Intel processors as highlighted in MLPerf, allowing our customers to meet performance SLAs, reduce cost, decrease time to deployment, and scale effectively.”

Monica Livingston
AI Solutions and Sales Director, Intel

“At RingCentral, we strive to provide our customers with the best AI-based experiences. With Deci’s platform, we were able to exceed our deep learning performance goals while shortening our development cycles. Working with Deci allows us to launch superior products faster.”

Vadim Zhuk
Senior Vice President R&D, RingCentral

“By collaborating with Deci, we aim to help our customers accelerate AI innovation and deploy AI solutions everywhere using our industry-leading platforms, from data centers to edge systems that accelerate high-throughput inference.”

Arti Garg
Head of Advanced AI Solutions & Technologies, HPE

Frequently asked questions

Absolutely. The basic plan allows you to quickly start optimizing your deep learning models. You can upgrade your plan at any time by contacting us.

 

The Ultimate Guide to Inference Acceleration of Deep Learning-Based Applications

Learn 12 inference acceleration techniques that you can immediately implement to improve the speed, efficiency, and accuracy of your existing AI models.