CASE STUDY

Reducing Cloud Costs and Improving UX for a Text Summarization Application


Customer

AI platform company

Industry

Computer software

Use case

NLP (Text summarization on NVIDIA hardware)

The Challenge

A customer developing an AI platform for text summarization was struggling to achieve satisfactory latency from the model powering their application, resulting in a poor user experience and high cloud costs. The model was deployed on an NVIDIA T4 GPU.

The Solution

The customer used Deci’s compilation and quantization tools to optimize the model’s performance, significantly reducing cloud costs and improving the user experience.
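Deci’s compilation and quantization tools are proprietary, so the exact workflow isn’t shown here. As an illustration of the underlying idea, the sketch below applies PyTorch’s built-in post-training dynamic quantization to a small stand-in network (the layer sizes and model are hypothetical, not the customer’s summarization model): weights are converted from fp32 to int8, shrinking the model and typically speeding up inference.

```python
import io

import torch
import torch.nn as nn

# Hypothetical stand-in for the feed-forward portion of a
# transformer-based summarization model.
model = nn.Sequential(
    nn.Linear(512, 2048),
    nn.ReLU(),
    nn.Linear(2048, 512),
)
model.eval()

# Post-training dynamic quantization: Linear weights are stored as
# int8; activations are quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    """Serialized size of a model's state dict, in megabytes."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

# The quantized model accepts the same inputs and is markedly smaller.
x = torch.randn(1, 512)
assert quantized(x).shape == (1, 512)
print(f"fp32: {size_mb(model):.1f} MB, int8: {size_mb(quantized):.1f} MB")
```

Dynamic quantization is only one of several approaches (static quantization and compiler-level optimization such as TensorRT are common alternatives on T4-class GPUs); which one Deci’s platform selects depends on the model and target hardware.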

The Results

Cloud Cost Reduction
Latency Acceleration
Model Size Reduction

Use Deci’s Development Platform to:

Enable Real-Time Inference at the Edge

Improve latency and throughput, and reduce model size by up to 5X, while maintaining the model’s accuracy.

Process More Video Streams on Fewer Devices

Maximize hardware utilization and cost-efficiently scale your solution at the edge.

Deploy Your Models on Any Edge Device

Eliminate inference cloud compute cost and avoid data privacy issues by running your models directly on edge devices.

Talk to Our Experts

Tell us about your use case, needs, goals, and the obstacles in your way. We’ll show you how you can use the Deci platform to overcome them.

Book a Demo