Cloud NVLink H200 pricing: Runpod, Lambda, CoreWeave for LLM training

Back in March 2026, we saw a research paper claim that a 1.8-trillion-parameter model had been fine-tuned in under 48 hours, and our first question wasn’t ‘how?’ but ‘what did their cloud bill look like?’. The answer almost certainly involved a lot of H200s, all talking to each other at absurd speeds. These aren’t typical single-GPU experiments; they are workloads that demand high-bandwidth, low-latency interconnects to keep massive tensor operations flowing. So we dug into the vendor pages, pricing sheets, and sales collateral to see what the landscape for multi-GPU NVLink H200 rentals actually looks like as of June 2026.

Why NVLink H200s are crucial for large LLM training

When you’re pushing past the 70B parameter mark for LLM training, especially with larger batch sizes or sequence lengths, a single H200, even with its 141GB of HBM3e memory and 4.8 TB/s of memory bandwidth, often isn’t enough. Once a model is spread across several GPUs, the bottleneck isn’t just raw compute but how quickly those GPUs can exchange data. This is where NVLink comes in. NVLink provides a dedicated, high-speed, low-latency pathway between GPUs, bypassing the slower PCIe bus. For multi-GPU training, this means gradients can be exchanged and model weights synchronized far more efficiently, so GPUs spend less time sitting idle while they wait for data from their neighbors.
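To put rough numbers on that, here is a back-of-the-envelope sketch of how long one gradient synchronization might take over NVLink versus PCIe. It assumes a ring all-reduce, bf16 gradients for a 70B-parameter model, and peak link figures that we are assuming rather than measuring (about 450 GB/s per direction for NVLink on Hopper-class GPUs, about 64 GB/s per direction for PCIe Gen5 x16). Real throughput will be lower, and the function and constant names are ours.

```python
def ring_allreduce_seconds(payload_bytes: float, num_gpus: int,
                           link_bytes_per_s: float) -> float:
    """Bandwidth-only estimate for a ring all-reduce; latency is ignored."""
    # In a ring all-reduce each GPU sends (and receives) 2 * (N - 1) / N
    # times the payload size over its link.
    traffic = 2 * (num_gpus - 1) / num_gpus * payload_bytes
    return traffic / link_bytes_per_s


GRAD_BYTES = 70e9 * 2   # assumption: 70B parameters, 2-byte (bf16) gradients
NVLINK_BPS = 450e9      # assumption: peak NVLink bandwidth per direction
PCIE_BPS = 64e9         # assumption: peak PCIe Gen5 x16 bandwidth per direction

nvlink_s = ring_allreduce_seconds(GRAD_BYTES, 8, NVLINK_BPS)
pcie_s = ring_allreduce_seconds(GRAD_BYTES, 8, PCIE_BPS)

print(f"NVLink: {nvlink_s:.2f} s per full gradient sync")
print(f"PCIe:   {pcie_s:.2f} s per full gradient sync")
print(f"Ratio:  {pcie_s / nvlink_s:.1f}x")
```

Under those assumptions a full sync takes roughly 0.54 seconds over NVLink and about 3.8 seconds over PCIe, a gap of around 7x per step before any overlap with compute.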

While the H100 was already a beast, the H200 offers a significant bump in memory capacity (141GB of HBM3e vs the H100’s 80GB of HBM3) and memory bandwidth (4.8 TB/s vs 3.35 TB/s), which translates directly into larger models or longer sequences per GPU and faster data movement on each card. For truly massive LLM training projects, 8x H200 NVLink configurations are becoming the de facto standard. Trying to run workloads at this scale without NVLink is like trying to fill a swimming pool with a garden hose – you’ll get there eventually, but you’ll pay a lot more for the privilege, and wait a lot longer. We’ve talked before about H200 cloud pricing and even done an NVLink H100 comparison, but the H200’s increased VRAM makes NVLink even more critical for keeping those behemoth models fed.
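A quick sketch shows why the memory bump matters. It assumes mixed-precision training with Adam at roughly 16 bytes of state per parameter (bf16 weights and gradients plus fp32 master weights and two optimizer moments), perfectly sharded across GPUs, and it ignores activations entirely, so treat the result as a floor rather than a sizing guide. The helper names are ours.

```python
import math

# Assumption: 2 (bf16 weights) + 2 (bf16 grads) + 4 (fp32 master weights)
# + 4 + 4 (Adam first and second moments) = 16 bytes per parameter.
BYTES_PER_PARAM = 16


def training_state_gb(num_params: float) -> float:
    """Weights + gradients + optimizer state, in GB (activations excluded)."""
    return num_params * BYTES_PER_PARAM / 1e9


def min_gpus(num_params: float, hbm_gb: float) -> int:
    """Fewest GPUs whose combined HBM holds the fully sharded training state."""
    return math.ceil(training_state_gb(num_params) / hbm_gb)


state_gb = training_state_gb(70e9)
print(f"70B training state: {state_gb:.0f} GB")
print(f"H200 (141 GB): at least {min_gpus(70e9, 141)} GPUs")
print(f"H100 (80 GB):  at least {min_gpus(70e9, 80)} GPUs")
```

With these assumptions a 70B model carries about 1,120 GB of training state, which just squeezes into the 1,128 GB of an 8x H200 node but would need at least 14 H100s.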

Runpod’s NVLink H200 offerings and on-demand pricing

Runpod has positioned itself as a flexible, on-demand option for cutting-edge GPUs, and their H200 offerings are no exception. As of June 2026, their public pricing page indicates that multi-GPU H200 pods with NVLink are available, typically in 8x GPU configurations. This is where the elasticity of on-demand really shines: you spin up a pod, you train, you tear it down. No long-term commitments, no dealing with resource allocation queues beyond immediate availability.

For an 8x H200 NVLink pod, Runpod’s on-demand pricing is approximately $280.00/hour, per their pricing page as of 2026-06-13 [https://www.runpod.io/gpu-cloud/h200]. This translates to roughly $35.00 per GPU per hour. The infrastructure includes dedicated NVMe storage (often 1.6TB or more) and substantial system RAM, which matters when datasets and checkpoints are far larger than what fits in HBM. While Runpod also offers a serverless option, for multi-GPU training we’re almost always looking at their bare-metal pods, where you have full control over the environment. If you’re new to Runpod, our general Runpod review covers some of the quirks, but for raw GPU access, it’s often a solid pick.
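As a rough illustration of what on-demand billing means in practice, the sketch below turns the pod rate quoted above into a per-GPU rate and a total for one training run. The 48-hour run length is a hypothetical we picked for the example, and the calculation ignores storage, egress, and any idle time before teardown.

```python
POD_RATE_USD_PER_HOUR = 280.00   # 8x H200 NVLink pod rate quoted above
GPUS_PER_POD = 8


def per_gpu_rate(pod_rate: float = POD_RATE_USD_PER_HOUR,
                 gpus: int = GPUS_PER_POD) -> float:
    """Hourly cost of one GPU inside the pod."""
    return pod_rate / gpus


def run_cost(hours: float, pod_rate: float = POD_RATE_USD_PER_HOUR) -> float:
    """Total pod cost for a run billed by the hour."""
    return hours * pod_rate


print(f"Per GPU: ${per_gpu_rate():.2f}/hour")
print(f"48-hour run (hypothetical): ${run_cost(48):,.2f}")
```

At the quoted rate, that hypothetical 48-hour run comes to $13,440 for the whole pod.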

Lambda Labs’ NVLink H200 options and reservation models

Lambda Labs has long been a go-to for predictable GPU access, largely through their emphasis on reserved instances. While they do offer some on-demand capacity, their H200 NVLink offerings are primarily structured around longer-term commitments. This is often the preferred model for research labs or companies with consistent, large-scale training workloads, as it provides cost predictability and guaranteed resource availability – something you don’t always get with pure spot or on-demand markets.

Lambda Labs offers H200 instances with NVLink, typically through reservations, with pricing starting around $180,000/month for an 8x H200 cluster, per their website as of 2026-06-13 [https://lambdalabs.com/service/gpu-cloud/h200]. Assuming 720 hours in a month, this reservation works out to $250 per hour for the cluster, or an effective rate of $31.25 per GPU per hour if you’re utilizing the cluster constantly. These reservations usually come with high-performance networking, ample local NVMe storage, and dedicated CPU cores. The trade-off, of course, is the upfront commitment and the potential for queues if you’re trying to secure a new, large reservation. Our overall Lambda Labs review highlighted their predictable billing and solid hardware, often worth the wait for committed projects.
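The conversion from a monthly reservation to an effective hourly rate is simple enough to sketch. The helper below is ours, and it takes the hours you actually use as an input, since the $31.25 figure only holds at full utilization of a 720-hour month.

```python
def effective_gpu_hour_rate(monthly_usd: float, gpus: int,
                            hours_used: float) -> float:
    """Monthly reservation price spread over the GPU-hours actually used."""
    if hours_used <= 0:
        raise ValueError("hours_used must be positive")
    return monthly_usd / (gpus * hours_used)


full_month = effective_gpu_hour_rate(180_000, 8, 720)
half_month = effective_gpu_hour_rate(180_000, 8, 360)

print(f"720 hours used: ${full_month:.2f} per GPU-hour")
print(f"360 hours used: ${half_month:.2f} per GPU-hour")
```

Running the cluster only half the month doubles the effective rate to $62.50 per GPU per hour, which is why utilization matters so much for reserved capacity.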

CoreWeave: enterprise-grade NVLink H200 clusters

CoreWeave operates at a slightly different scale and market segment compared to Runpod and Lambda Labs. They focus heavily on large-scale, enterprise clients and specialized workloads, often securing significant allocations of the latest NVIDIA hardware. For NVLink H200 clusters, this means they’re primarily targeting customers who need large, stable environments for months or even years, often with bespoke configurations and service level agreements (SLAs).

CoreWeave primarily offers NVLink H200 clusters to enterprise clients via custom quotes and long-term commitments, as stated on their GPU cloud page as of 2026-06-13 [https://www.coreweave.com/gpu-cloud/nvidia-h200]. You won’t find public hourly pricing for H200s on their website in the same way you would for Runpod or even Lambda’s on-demand. Instead, their sales team works directly with clients to build custom solutions. While this typically means higher minimum commitments and less flexibility for small, burstable projects, it can result in highly competitive pricing for very large, stable workloads where economies of scale and direct hardware access are paramount. If you’re a startup with a few million in funding, you’re likely talking to CoreWeave. If you’re a solo researcher, probably not.

Comparing NVLink H200 costs for multi-GPU LLM projects

Pulling these numbers together for a direct comparison requires some approximation, especially with CoreWeave’s opaque pricing. We’re looking at an 8x H200 NVLink cluster as the baseline, which is a common configuration for serious LLM training. Remember, these are vendor-published rates as of June 2026 and can change rapidly, particularly for high-demand hardware like the H200.

Provider	Configuration	Pricing Model	Est. $/GPU/hour (8x NVLink)	Availability Notes
Runpod	8x H200, 1.6TB NVMe	On-demand	~$35.00	Often available, but can be limited; hourly billing.
Lambda Labs	8x H200, 1.6TB NVMe	Reserved (Monthly)	~$31.25 (effective)	Requires commitments (e.g., 1 month+); queues.
CoreWeave	Custom (8x H200+)	Custom Quote/Commit	(Competitive for enterprise)	Not publicly listed; enterprise contracts only; long-term.

The ‘effective’ hourly rate for Lambda Labs assumes you’re using the instance for all 720 hours in a month. If your workload is bursty or you only need the cluster for a week, that effective rate skyrockets: 168 hours of use against the $180,000 monthly price works out to roughly $134 per GPU per hour, making Runpod’s on-demand far more attractive. Conversely, if you have a continuous 3-month training run, Lambda’s reservation model offers better price predictability and a lower amortized cost. CoreWeave remains a wildcard for most, but if you’re moving petabytes of data and committing to a year-long project, their custom rates are likely to be aggressive. Storage performance and egress are also factors, but for multi-GPU training, the raw GPU and NVLink cost dominates the bill.
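To see where the crossover sits, the sketch below compares the two published figures directly. It assumes the Runpod pod and the Lambda cluster are interchangeable for your workload and that a reservation is billed in full regardless of use; the names and the one-month horizon are ours.

```python
RESERVED_MONTHLY_USD = 180_000.0   # Lambda 8x H200 reservation quoted above
ON_DEMAND_USD_PER_HOUR = 280.0     # Runpod 8x H200 pod rate quoted above
HOURS_PER_MONTH = 720


def breakeven_hours(monthly_usd: float, hourly_usd: float) -> float:
    """Hours per month at which on-demand spend equals the reservation."""
    return monthly_usd / hourly_usd


def cheaper_option(hours_needed: float) -> str:
    """Which model costs less for a given number of cluster-hours in a month."""
    on_demand_total = hours_needed * ON_DEMAND_USD_PER_HOUR
    return "on-demand" if on_demand_total < RESERVED_MONTHLY_USD else "reserved"


hours = breakeven_hours(RESERVED_MONTHLY_USD, ON_DEMAND_USD_PER_HOUR)
utilization = hours / HOURS_PER_MONTH

print(f"Break-even: {hours:.0f} hours/month ({utilization:.0%} utilization)")
print(f"One week (168 h): {cheaper_option(168)}")
print(f"Full month (720 h): {cheaper_option(720)}")
```

On these numbers the reservation only wins if the cluster is busy for more than about 643 hours a month, or roughly 89% of the time.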

Which cloud NVLink H200 setup is right for your LLM training?

Choosing the right cloud NVLink H200 setup boils down to the elasticity of your project and your comfort with commitment. If you’re an independent researcher, a startup doing iterative experiments, or need to burst compute for a few days, Runpod’s on-demand pricing is likely your best bet. You pay for what you use, when you use it, and you can scale up and down without penalty beyond the hourly rate.

For established teams with consistent, long-running training jobs and a clear budget, Lambda Labs’ reservation model offers a more predictable and often lower effective cost. The trade-off is the upfront commitment and potential waiting times, but for a stable project, that certainty is invaluable. CoreWeave, on the other hand, is firmly in the enterprise camp. If you’re planning multi-year projects, demanding specific hardware topologies, or require white-glove support, they’re the ones to call – but don’t expect to spin up a single instance for an afternoon’s work. Ultimately, for most teams, the choice is between Runpod’s immediate flexibility and Lambda’s long-term cost efficiency.
