In the first blog of this series, we covered the challenges of training LLMs. The previous blog reviewed the LLM consumption models for enterprises investing in AI (Makers, Takers, Shapers, and RAG). In this blog, we will review the deployment models for your AI applications and the cost considerations for each.
The adoption of AI is driving a strong uptick in AI data center growth and investment. Once a market cornered almost exclusively by cloud providers, AI data centers are now being built by enterprises, which are expanding their footprint with private facilities to take more control over their AI workloads and applications.
Making the most of your AI data center investment
According to IDC, enterprise investment in AI data center switching equipment will grow to $1B USD by 2027 with a CAGR of 158%. While cloud provider switching growth is predicted to continue at a strong CAGR of 91.5% in the same period, some enterprises intend to peel off some of their training and inferencing workloads into private data centers as part of a hybrid cloud strategy.
To achieve meaningful results with GenAI, most enterprises are discovering that they must use their own corporate data in the models. Digital transformation via AI is not as easy as mandating that employees use public LLMs for their work.
Build vs. Buy: That is the question
The decision of whether to build your own private data centers, buy AI services from public cloud providers, or use a combined hybrid cloud model boils down to a few big rock considerations:
- Data sensitivity: Are you dealing with sensitive or proprietary data that needs to stay local to a private cloud or is subject to data sovereignty rules where data is restricted to a geographic boundary? Technology, financial, government, and medical use cases are more likely to require private data centers to protect intellectual property or avoid litigation uncertainties.
- Expertise: How deep is your in-house data science or networking expertise? If you have the appropriate personnel, deploying private data centers is a strong option. If you don’t, expertise must be developed or outsourced.
- Geography: Do you have adequate facilities in desired locations to support your data center needs? With GPUs drawing roughly 700W each, large training clusters may require costly power upgrades to existing facilities. Alternatively, enterprises may choose to distribute their AI clusters across multiple data center locations to stay within the power budget. Inferencing performance, including RAG, may drive AI data centers even further toward the edge, where small AI clusters sit physically closer to users, such as IoT applications deployed on the manufacturing floor. A hybrid architecture allows enterprises to build where they can and buy where they must, locating AI functions for training, inferencing, and RAG in optimized locations.
- Time-to-market: What is your time-to-market pressure? If it’s immediate, public cloud services will speed up time-to-market, providing valuable time to properly plan your private data center deployments. And where are you on your AI transformation journey? Are you at the beginning and still need heavy experimentation to see what works for your business? If so, go with public cloud. But if you’re committed to AI and have a plan for how to use it in different parts of your business, the economic analysis usually dictates that you make the investment in private cloud infrastructure.
- Corporate strategy: Like many cloud transformation projects, AI initiatives often begin at the departmental level, creating islands of AI clusters to solve specific customer or operational challenges. As enterprises develop overarching corporate strategies with more unified and shared AI infrastructures, AI investment costs can be amortized more efficiently, allowing private data center AI investments to fit into existing corporate budgets.
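The power-budget consideration above can be sketched with rough numbers. This is an illustrative back-of-the-envelope estimate only: the 700W per-GPU figure comes from the text, while the GPUs-per-server density, per-server overhead, and PUE factor are assumptions, not vendor specifications.

```python
# Rough facility power estimate for an AI training cluster.
# Only the 700 W per-GPU figure comes from the text above; the
# server density, overhead, and PUE values are assumptions.

GPU_WATTS = 700           # per-GPU draw cited in the text
GPUS_PER_SERVER = 8       # assumption: typical GPU server density
SERVER_OVERHEAD_W = 1600  # assumption: CPUs, memory, fans, NICs per server
PUE = 1.4                 # assumption: facility power usage effectiveness

def cluster_power_kw(num_gpus: int) -> float:
    """Estimated facility power draw in kW for a cluster of num_gpus GPUs."""
    servers = num_gpus / GPUS_PER_SERVER
    it_load_w = num_gpus * GPU_WATTS + servers * SERVER_OVERHEAD_W
    return it_load_w * PUE / 1000

# A 256-GPU cluster under these assumptions draws roughly 323 kW,
# which shows why existing facilities often need power upgrades.
print(f"{cluster_power_kw(256):.0f} kW")
```

Even with conservative assumptions, a modest cluster lands in the hundreds of kilowatts, which is the arithmetic behind both the facility-upgrade cost and the option of distributing clusters across sites.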
Initially, public cloud was the only alternative for innovators and early adopters of AI technology. While public cloud remains an important part of most AI strategies, concerns over data security and cost are putting private data center and hybrid cloud architectures into the mainstream. At Juniper’s recent virtual event, Seize the AI Moment, customers, partners, and industry experts discussed their own hybrid cloud use cases and strategies, including data security concerns for a financial institution and the strategy of using hybrid clouds to balance cost and performance.
Cost: Maximizing ROI in an expensive AI world
Regardless of the deployment model, it’s no secret that deploying AI is expensive. The cost of AI is measured in terms of budget, expertise, and time, all of which are constrained resources. While expertise and time are variable costs unique to each enterprise, the hard-dollar investments in AI are market-driven and limited only by the allocated budget.
At roughly $400,000 per GPU server, the infrastructure costs alone for even a small AI data center can run into the millions of dollars. However, there may be some relief in sight. AI frameworks like PyTorch 2.0 have loosened the tight integration with, and dependency on, NVIDIA chipsets. This opens the door for competitive GPU offerings from Intel, AMD, and others to disrupt market dynamics and normalize costs.
At current price levels, it’s easy to assume that buying AI services from a public cloud provider would be more cost-effective than building a private AI data center. But a recent total cost of ownership (TCO) analysis from ACG Research shows otherwise. Comparing the three-year TCO of a private AI data center with that of a comparable public cloud-hosted AI service, ACG found the private data center model delivers a 46% TCO savings, primarily because of the high recurring costs associated with public cloud services.
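The structure of that comparison can be shown with a toy model. The dollar figures below are made-up placeholders (not ACG’s numbers), chosen only to illustrate the mechanic: a large upfront CapEx plus modest OpEx for the private build, versus recurring monthly fees for an equivalent cloud service that never stop accruing.

```python
# Toy three-year TCO comparison in the spirit of the ACG analysis.
# All dollar amounts are illustrative placeholders, not ACG's data.

def private_tco(capex: float, annual_opex: float, years: int = 3) -> float:
    """Upfront infrastructure cost plus recurring operations."""
    return capex + annual_opex * years

def cloud_tco(monthly_fee: float, years: int = 3) -> float:
    """Pure recurring cost: fees accrue for as long as the service runs."""
    return monthly_fee * 12 * years

private = private_tco(capex=12_000_000, annual_opex=2_000_000)  # $18M
cloud = cloud_tco(monthly_fee=925_000)                          # $33.3M
savings = 1 - private / cloud
print(f"private saves {savings:.0%} over 3 years")
```

With these placeholder inputs the private build comes out roughly 46% cheaper over three years, mirroring the headline result: the longer the horizon, the more the recurring cloud fees dominate.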
The ACG report further analyzed the cost of building an AI data center, comparing the costs of InfiniBand networking to that of Ethernet. In their findings, ACG shows that deployments of Juniper Ethernet with RoCE v2 and Juniper Apstra result in 55% TCO savings, including 56% OpEx savings and 55% CapEx savings over the first three years vs. InfiniBand networks.
The net result is that building AI data centers is a cost-effective way to offset the premium of the public cloud, and that building Ethernet-based AI fabrics further reduces hard-dollar costs while leveraging existing in-house expertise for fast deployments.
Simplifying private enterprise and hybrid AI deployments
Application needs, deployment models, and cost are big rock considerations when investing in private or hybrid cloud models. But enterprises don’t have to do it alone. AI is new to most enterprises and AI infrastructure does present complex challenges, but it’s not sorcery: much of your current data center networking knowledge applies. Juniper has invested in AI technology to drive innovation while simplifying the path to AI. Juniper’s Ops4AI Lab is open to enterprises to qualify custom AI cluster designs prior to deployment, using a multivendor architecture of GPU compute, advanced storage platforms, and rail-optimized Juniper Ethernet fabrics built from QFX Series switches, PTX Series routers, and Juniper Apstra network automation. Supporting both open-source models and Bring Your Own Models (BYOM), the lab is there to help customers eliminate uncertainty.
Juniper’s Ops4AI Lab and you
Juniper’s Ops4AI Lab is also the engine that drives our Juniper Validated Design (JVD) pipeline of pre-validated, multivendor AI data centers. Through JVDs, enterprises can mix and match supported compute and storage infrastructures with Juniper’s advanced AI data center designs to eliminate guesswork and simplify what is often a complex design and deployment.
Using Juniper Apstra network automation, enterprises can design blueprints to be used for custom deployments or use Juniper’s Terraform provider to download GitLab-hosted AI blueprints for back-end compute, back-end storage, and front-end management.
To hear more about Juniper’s Ops4AI Lab, customer AI use cases, and our ecosystem of AI partners, including NVIDIA, AMD, WEKA, Intel, and more, listen to the replay of Seize the AI Moment, Juniper’s AI virtual event, or follow this link to our AI Data Center landing page.