DigitalOcean Powers Workato’s Agentic Enterprise with Production-scale AI

Workato’s AI Research Lab achieves 67% higher throughput, 77% faster time-to-first-token, and 67% lower inference costs on DigitalOcean’s Agentic Inference Cloud, powered by NVIDIA

BROOMFIELD, Colo.--(BUSINESS WIRE)-- DigitalOcean (NYSE: DOCN), the Agentic Inference Cloud built for production AI, today announced that Workato’s AI Research Lab is using its vertically integrated, inference-optimized platform, accelerated by NVIDIA Hopper GPUs, to advance the development of its next-generation enterprise AI agents while materially improving performance, cost efficiency, and deployment speed.

After moving its AI Labs workloads to DigitalOcean, Workato achieved immediate gains for frontier models, including Llama-3.3-70B:

  • Inference cost: $0.77 per 1M tokens – 67% lower
  • Throughput: 13,561 tokens per second per GPU – 67% higher
  • Time-to-First-Token (TTFT): 1,455 ms at high load – 77% faster
  • Time-to-Value: Reduced from weeks to days – 2X+ acceleration
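The percentage improvements above imply what the pre-migration figures must have been. A quick back-of-the-envelope check (the "after" numbers are from the release; the baselines are derived, not stated by DigitalOcean):

```python
# Derive the implied pre-migration baselines from the reported gains.
# "After" figures come from the press release; baselines are back-calculated.

cost_after = 0.77            # USD per 1M tokens, 67% lower than before
throughput_after = 13_561    # tokens/sec/GPU, 67% higher than before
ttft_after = 1_455           # ms at high load, 77% faster than before

cost_before = cost_after / (1 - 0.67)              # ≈ $2.33 per 1M tokens
throughput_before = throughput_after / (1 + 0.67)  # ≈ 8,120 tokens/sec/GPU
ttft_before = ttft_after / (1 - 0.77)              # ≈ 6,326 ms

print(f"Implied baselines: ${cost_before:.2f}/1M tokens, "
      f"{throughput_before:,.0f} tok/s/GPU, {ttft_before:,.0f} ms TTFT")
```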

With thousands of customers globally deploying over 1 trillion tasks since 2013, the Workato ONE platform enables customers to build, deploy, and govern AI agents at enterprise scale. Built on a decade of integration expertise spanning 14,000+ applications, Workato's platform enables organizations to move from simple automation to agentic AI that can reason, act, and orchestrate work across the entire business.

To support that vision, Workato AI Research Lab required infrastructure capable of handling distributed training and sustained, reasoning-heavy inference under real production load. Not only was DigitalOcean able to provision high-performance NVIDIA Hopper GPUs faster than any other provider, but the Workato AI Research Lab team also quickly saw significant performance gains and total-cost-of-ownership (TCO) improvements as a result of DigitalOcean’s inference-optimized architecture and simplified experience.

“Before DigitalOcean, we didn’t have a dedicated solution for in-house training and multi-node serving, and that was a major blocker for AI research,” said Oscar Wu, AI Research Scientist at Workato. “DigitalOcean was the fastest provider to get us up and running, enabling us to advance our AI programs. The collaboration on performance optimization, coupled with the support from the DigitalOcean team of solutions architects, accelerated our progress by roughly two to three times.”

Working closely with Workato, DigitalOcean helped design and tune a distributed inference architecture on DigitalOcean Kubernetes (DOKS). As part of this collaboration, DigitalOcean configured NVIDIA Dynamo to intelligently coordinate workloads across interconnected GPU clusters. This ensured requests were routed to the most efficient compute resources in real time, reducing redundant processing, lowering costs, and improving responsiveness under heavy demand.
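The core idea behind that coordination, routing each request to the replica best positioned to serve it, can be sketched in a few lines. NVIDIA Dynamo's actual scheduler is far more sophisticated (KV-cache-aware routing, disaggregated prefill/decode); the names and load heuristic below are illustrative only:

```python
# Minimal sketch of load-aware request routing across GPU inference
# replicas. This is a toy heuristic, not Dynamo's implementation.

from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    in_flight: int = 0        # requests currently being served
    queued_tokens: int = 0    # rough proxy for pending work

    def load(self) -> float:
        # Simple load score; a production router would also weigh
        # KV-cache hit likelihood and hardware differences.
        return self.in_flight + self.queued_tokens / 10_000

def route(replicas: list[Replica], prompt_tokens: int) -> Replica:
    """Send the request to the currently least-loaded replica."""
    target = min(replicas, key=Replica.load)
    target.in_flight += 1
    target.queued_tokens += prompt_tokens
    return target

pool = [Replica("gpu-0"), Replica("gpu-1"), Replica("gpu-2")]
for tokens in (4_000, 32_000, 2_000, 8_000):
    chosen = route(pool, tokens)
    print(f"{tokens:>6} prompt tokens -> {chosen.name}")
```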

By optimizing orchestration around the model and intelligently routing requests across interconnected NVIDIA Hopper clusters, the platform eliminated redundant computation, a primary cost driver for long-context AI workloads. The result was sustained throughput, significantly faster time-to-first-token, and materially improved price-performance under high concurrency.
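The "redundant computation" at issue in long-context serving is typically repeated prefill over a shared prompt prefix (for example, a long system prompt sent with every request). A toy model of the accounting behind prefix reuse, the general class of optimization described above (real engines cache GPU KV tensors, not token counts, and this is not DigitalOcean's or Dynamo's implementation):

```python
# Toy illustration of prefix reuse in long-context serving.
# Only the token accounting is modeled; numbers are illustrative.

def prefill_cost(prompt: list[str], cache: list) -> int:
    """Return how many tokens must actually be computed for a prompt,
    reusing the longest prefix shared with any previously cached prompt."""
    best = 0
    for cached in cache:
        n = 0
        for x, y in zip(cached, prompt):
            if x != y:
                break
            n += 1
        best = max(best, n)
    cache.append(tuple(prompt))
    return len(prompt) - best

cache: list = []
system = ["<sys>"] * 1000                       # shared 1,000-token context
first = prefill_cost(system + ["q1"], cache)    # full prompt computed
second = prefill_cost(system + ["q2"], cache)   # only the new token computed
print(first, second)  # 1001 1
```

Without reuse, every request would re-pay the full 1,001-token prefill, which is exactly the cost driver the release says was eliminated.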

“As AI adoption accelerates, inference at scale is becoming the defining challenge for the industry,” said Dave Salvator, Director of Accelerated Computing Solutions at NVIDIA. “The integration of the NVIDIA accelerated computing platform with DigitalOcean’s inference-optimized platform unlocks the full potential of production-scale AI. The significant performance gains achieved by Workato highlight the impact of this collaboration.”

For AI companies scaling production, inference economics directly impact margins, making cost efficiency and predictability critical. By moving to DigitalOcean’s optimized environment, Workato reduced inference costs by 67% to $0.77 per million tokens while achieving a 33% hardware price-performance advantage.

Equally important, DigitalOcean’s managed Kubernetes environment abstracted control-plane complexity and GPU scheduling, allowing Workato’s lean AI Labs team to focus on research and product development rather than infrastructure management. This vertically integrated approach, spanning hardware, orchestration, and networking, is critical to reduce operational overhead while improving performance consistency and cost predictability.
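For a sense of what "abstracted GPU scheduling" looks like in practice, a managed Kubernetes cluster lets a team request a GPU declaratively and leave node placement to the scheduler. The manifest below is a generic illustration, not a Workato or DigitalOcean artifact; image and workload names are hypothetical, while `nvidia.com/gpu` is the standard device-plugin resource name:

```yaml
# Illustrative only: a minimal pod requesting one NVIDIA GPU on a
# managed Kubernetes cluster such as DOKS.
apiVersion: v1
kind: Pod
metadata:
  name: llama-inference                 # hypothetical workload name
spec:
  containers:
    - name: server
      image: example.registry/inference-server:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1             # scheduler places the pod on a GPU node
```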

“DigitalOcean lets us focus on research and advancing our models instead of managing infrastructure,” said Kevin Huang, Infrastructure Engineer at Workato AI Labs. “We can provision GPUs quickly, deploy inference workloads in production, and iterate on real customer traffic without getting bogged down in platform complexity. The speed and performance have been critical to maintaining our momentum.”

“Workato is pushing the frontier of agentic enterprise software,” said Paddy Srinivasan, CEO of DigitalOcean. “As AI companies move from experimentation into production, the winners will be those who can iterate quickly on real customer workloads. We’re proud to support Workato’s momentum by providing an inference-optimized environment that lets their team focus on shipping — not managing infrastructure.”

DigitalOcean is building the Agentic Inference Cloud for production AI, partnering with ambitious AI-native companies and enabling them to operate inference reliably at scale with predictable economics. Learn more about the DigitalOcean Agentic Inference Cloud.

About DigitalOcean

DigitalOcean is the Agentic Inference Cloud built for AI-native and Digital-native enterprises scaling production workloads. The platform combines production-ready GPU infrastructure with a full-stack cloud to deliver operational simplicity and predictable economics at scale. By integrating inference capabilities with core cloud services, DigitalOcean’s Agentic Inference Cloud enables customers to expand as they grow — driving durable, compounding usage over time. More than 640,000 customers trust DigitalOcean to power their cloud and AI infrastructure. To learn more, visit www.digitalocean.com.

Media Relations

Meghan Grady: press@digitalocean.com

Investor Relations

Melanie Strate: investors@digitalocean.com

Source: DigitalOcean
