Deci’s Runtime Inference Container (RTiC) is a containerized deep-learning runtime engine that enables easy deployment of models as microservices and maximizes GPU/CPU utilization for top performance.
In today’s versatile cloud environments, with so many types of hardware, frameworks, and model types, DevOps engineers and data scientists constantly struggle to tune and deliver AI models in a microservice production environment. One of the main obstacles inhibiting the effective use of ML models is the challenge of serving a model for inference within its target cloud environment. While container technology has transformed the face of cloud-based IT operations, dedicated containerization of AI inference tasks has been left behind. Simply placing a model on a general-purpose server, or even on an inference-dedicated server, results in inefficient inference performance and unnecessary challenges for continuous optimization and tuning. As with general application containers, using containers for deep learning inference should allow faster deployment and portability of AI models, improved developer productivity, the agility to scale on demand, and more efficient utilization of compute resources.
As a standard Docker container, RTiC bundles its own file system together with dedicated inference server software and packages. RTiC maximizes the utilization of the underlying hardware while enabling inference of multiple models on the same machine, and lets you leverage best-of-breed optimization compilers and toolkits, such as NVIDIA TensorRT and Intel OpenVINO. Because RTiC serves models as microservices, you can use standard container orchestration tools such as Kubernetes to deploy, manage, and scale them up or down.
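To make the orchestration point concrete, here is a minimal sketch of rolling out an RTiC-style inference container as a Kubernetes Deployment using the official Kubernetes Python client. The image name, port, labels, and replica count are hypothetical placeholders for illustration, not Deci’s published values:

```python
# Minimal sketch: deploying a containerized inference server as a
# Kubernetes Deployment via the official Kubernetes Python client.
# The image name, labels, and port below are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()  # authenticate using your local kubeconfig

container = client.V1Container(
    name="rtic-server",
    image="example-registry/rtic-server:latest",  # hypothetical image
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1"},  # request one GPU per replica
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="rtic-inference"),
    spec=client.V1DeploymentSpec(
        replicas=2,  # scale up or down like any other microservice
        selector=client.V1LabelSelector(match_labels={"app": "rtic"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "rtic"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(
    namespace="default", body=deployment
)
```

Because the inference engine ships as an ordinary container image, the same Deployment pattern applies whether the target is a data center, a cloud cluster, or an edge server; scaling is a matter of changing the replica count.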
A containerized deep learning runtime engine that easily turns any model into a blazing fast server.
Supported environments:
- Data Center: CPU and GPU
- Cloud: CPU and GPU
- Edge Server: CPU and GPU
“Using Deci’s platform we achieved a 2.6x increase in inference throughput of one of our heavy multiclass classification models running on V100 machines, without losing accuracy. Deci can cut 50% off the deep learning inference compute costs in our cloud deployments worldwide. We are very impressed by Deci’s technology!”