Artificial intelligence has crossed a major threshold. Nearly three-quarters of organizations report using AI in at least one business function, a sharp rise from just half in 2020. As adoption picks up, industry conversations have shifted from performance at the model level to how companies can reliably and efficiently deliver these intelligent services and platforms. Without a solid infrastructure foundation, even the most advanced AI models struggle when faced with the unpredictable reality of production environments.
To better understand these demands, we turn to Karthik Puthraya, a Senior Software Engineer at Netflix who specializes in high-availability infrastructure for AI and machine learning systems.
Puthraya is responsible for the scale and performance initiatives behind Netflix’s next-generation discovery platform, the engine that curates personalized content recommendations for millions of users worldwide. His career also spans senior engineering roles at Microsoft and Stripe, where he built large-scale distributed systems for enterprise data services and regulatory compliance automation. As a Senior Member of IEEE and a member of the editorial board for ESP-IJACT, he combines practical engineering leadership with a research-informed perspective on the future of intelligent infrastructure.
“It’s one thing to develop a strong model,” Puthraya observes. “It’s another to make sure it works well for every user, on any device, under unpredictable network conditions.”
The Industry Pivot to Infrastructure-Centric AI
In the early days of the internet, engineering revolved around two goals: scale and availability. Today, integrating AI into shopping, streaming, and discovery platforms adds another layer of complexity. Customers increasingly expect services to anticipate their needs, and to do so without perceptible delays. IBM reports that three in five consumers now express interest in AI-driven shopping experiences, signaling a permanent shift in expectations.
At Netflix, delivering on that expectation requires more than a cutting-edge algorithm. The discovery engine must personalize content across a fragmented global landscape of devices, networks, and cultural preferences, where even small inefficiencies can accumulate into noticeable performance degradation. Anticipating these challenges, Puthraya led a series of infrastructure improvements that removed concurrency bottlenecks from critical execution paths. His optimizations sharply reduced serving latencies and streamlined the deployment pipeline, allowing more frequent updates to the services that deliver recommendations. This made it possible for Netflix to adapt to users’ browsing patterns almost minute to minute, without sacrificing platform stability.
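The article does not describe Netflix's internal implementation, but a common pattern for removing a concurrency bottleneck from a hot request path is lock sharding: replacing one contended lock on a shared structure with several locks, so that threads working on unrelated keys rarely block each other. The class below is a hypothetical minimal sketch of that idea, not actual Netflix code:

```python
import threading


class ShardedCounterCache:
    """Hypothetical sketch: shard a hot shared structure across
    several locks so concurrent request threads rarely contend."""

    def __init__(self, num_shards=16):
        self._shards = [{} for _ in range(num_shards)]
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def _shard_for(self, key):
        # Route each key to a fixed shard; independent keys
        # usually land on different locks.
        return hash(key) % len(self._shards)

    def increment(self, key):
        i = self._shard_for(key)
        with self._locks[i]:  # only threads hitting this shard contend
            self._shards[i][key] = self._shards[i].get(key, 0) + 1

    def get(self, key):
        i = self._shard_for(key)
        with self._locks[i]:
            return self._shards[i].get(key, 0)
```

Under concurrent load, contention on any single lock drops roughly in proportion to the number of shards, which is one way "removing a bottleneck" translates directly into lower tail latency.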
“It’s easy to underestimate how much pressure these systems are under,” Puthraya explains. “You’re not just handling platform performance; you’re doing it simultaneously across millions of experiences, and under wildly different conditions. Infrastructure has to be built with those pressures in mind.”
Companies that cannot guarantee the availability and responsiveness of their AI platforms face real-world consequences: loss of user trust, escalating operational costs, and growing exposure to regulatory risk as AI-specific regulations take effect.
Engineering for Reliability Across Millions of Users
The infrastructure challenges of scaling AI systems are increasingly visible in spending patterns. According to IDC, network infrastructure now ranks as the largest cost item for generative AI training. Deployment costs are growing even faster, as enterprises move from lab experiments to production-scale AI systems.
Puthraya’s tenure at Microsoft offers a clear example of what engineering for scale requires. As part of the Microsoft Graph Data Connect team, he redesigned the system’s shared authorization service, decomposing it into horizontally scalable microservices. The rearchitecture improved the platform’s scalability, sustained over 99.99% uptime, reduced operational costs, and significantly lowered the volume of customer escalations.
“The goal of infrastructure is to build for bad days,” Puthraya notes. “The costs are obvious, but the real priority is the user experience, especially in sensitive sectors like security and business intelligence.”
At Stripe, he approached resilience from another angle: compliance automation. His work involved building systems capable of automatically collecting evidence and validating regulatory controls across complex, multi-tenant environments. In financial services, where compliance failures have direct legal and financial consequences, designs for data integrity and durability leave no margin for error.
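The core loop of compliance automation, as described here, is evaluating each control against each resource and recording a timestamped evidence entry for every check, pass or fail. The sketch below illustrates that shape with a single invented control (`encryption-at-rest`); the control names and resource fields are assumptions for the example, not Stripe's actual system:

```python
import datetime


def check_encryption_at_rest(resource):
    # Hypothetical control: storage resources must be encrypted.
    return resource.get("encrypted", False)


# Registry mapping control names to their check functions.
CONTROLS = {
    "encryption-at-rest": check_encryption_at_rest,
}


def run_controls(resources):
    """Evaluate every registered control against every resource,
    recording a timestamped evidence entry for each check."""
    evidence = []
    for control_name, check in CONTROLS.items():
        for resource in resources:
            evidence.append({
                "control": control_name,
                "resource": resource["id"],
                "passed": check(resource),
                "checked_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            })
    return evidence
```

The key design point is that failing checks are recorded rather than skipped: an auditor needs evidence that a control was evaluated and failed, not just a list of passes.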
Today’s AI deployments, particularly in fields like healthcare, finance, and national security, are inheriting similar infrastructure demands. According to Gartner, 70% of enterprises now run AI workloads across hybrid or multi-cloud environments, exposing them to greater operational instability, coordination challenges, and security vulnerabilities.
Intelligent Infrastructure for an Intelligent Future
As AI increasingly shapes critical decision-making and service delivery, the standards for infrastructure are getting more unforgiving. Downtime today can cost even small businesses up to $400 per minute. For global platforms like Netflix or Stripe, the stakes are exponentially higher.
“The principles of distributed systems engineering, like fault tolerance, aren’t outdated by AI,” Puthraya says. “They’re only becoming more relevant, and more urgent.”
Meeting the demands of AI production systems requires infrastructure that can absorb failure, recover quickly, and maintain responsiveness under unpredictable, volatile workloads. While these aren’t new engineering goals, AI’s combination of high computational demands and sensitivity to context makes achieving them substantially harder.
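One classic mechanism for "absorbing failure and recovering quickly" is the circuit breaker: after repeated errors from a dependency, callers fail fast instead of piling up requests, then probe again after a cooldown. The minimal sketch below is a generic illustration of the pattern, with thresholds and naming chosen for the example:

```python
import time


class CircuitBreaker:
    """Minimal sketch: after `max_failures` consecutive errors the
    breaker opens and fails fast, giving the downstream service
    `reset_timeout` seconds to recover before a trial call is allowed."""

    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed
        self.clock = clock     # injectable for testing

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast: don't add load to a struggling dependency.
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit again
        return result
```

Failing fast is what keeps a localized outage from cascading: callers spend milliseconds on a rejected call instead of seconds on a timeout, so upstream thread pools and queues stay healthy while the dependency recovers.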
Where model accuracy once commanded center stage, today it’s the reliability and adaptability of the infrastructure behind AI that may determine who wins—and who falls behind—in the race toward an intelligent future.