Microsoft at NVIDIA GPU Technology Conference (GTC) 2025
I attended NVIDIA GTC 2025 this year with the goal of keeping up to date with the latest technologies, meeting members of the community, and sharing what I learned.
Together, Microsoft and NVIDIA are accelerating some of the most groundbreaking innovations in AI. This long-standing collaboration has been at the core of the AI revolution over the past few years, from bringing industry-leading supercomputing performance to the cloud to supporting breakthrough frontier models and solutions such as ChatGPT in Microsoft Azure OpenAI Service and Microsoft Copilot.
Today, there are several new announcements from Microsoft and NVIDIA that further enhance this full-stack collaboration to help shape the future of AI. These include integrating the newest NVIDIA Blackwell platform with Azure AI services infrastructure, incorporating NVIDIA NIM microservices into Azure AI Foundry, and empowering developers, startups, and organizations of all sizes, such as the NBA, BMW, Dentsu, Harvey, and OriGen, to accelerate their innovations and solve the most challenging problems across domains. — Azure Blog
Empowering all developers and innovators with agentic AI
Microsoft and NVIDIA collaborate deeply across the entire technology stack, and with the rise of agentic AI, several new offerings are now available in Azure AI Foundry. First, Azure AI Foundry now offers NVIDIA NIM microservices. NIM provides optimized containers for more than two dozen popular foundation models, allowing developers to deploy generative AI applications and agents quickly. These integrations accelerate inferencing workloads for models available on Azure, delivering significant performance improvements that support the growing use of AI agents. Key features include optimized model throughput on NVIDIA accelerated computing platforms, prebuilt microservices deployable anywhere, and enhanced accuracy for specific use cases. In addition, Microsoft will soon integrate the NVIDIA Llama Nemotron Reason open reasoning model, a powerful AI model family designed for advanced reasoning.
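NIM microservices expose an OpenAI-compatible HTTP API, so an application or agent can target a deployed endpoint with a standard chat-completions request. A minimal sketch of what that looks like; the endpoint URL, API key, and model ID below are placeholders, and the exact values depend on your own Azure AI Foundry deployment:

```python
# Sketch of calling a NIM microservice through its OpenAI-compatible API.
# NIM_ENDPOINT, API_KEY, and the model ID are hypothetical placeholders.
NIM_ENDPOINT = "https://<your-endpoint>/v1/chat/completions"
API_KEY = "<your-api-key>"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request(
    "meta/llama-3.1-8b-instruct",  # example NIM model ID; yours may differ
    "Summarize the GB200 NVL72 announcement in one sentence.",
)

# With a live endpoint and the `requests` package installed, sending it looks like:
# import requests
# resp = requests.post(
#     NIM_ENDPOINT,
#     headers={"Authorization": f"Bearer {API_KEY}"},
#     json=payload,
# )
# print(resp.json()["choices"][0]["message"]["content"])
```

Because the wire format matches the OpenAI API, existing client libraries and agent frameworks can usually be pointed at a NIM endpoint by changing only the base URL and key.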
At the same time, Microsoft is expanding the model catalog in Azure AI Foundry even further with Mistral Small 3.1, coming soon: an enhanced version of Mistral Small 3 featuring multimodal capabilities and an extended context length of up to 128k tokens.
Microsoft is also announcing the general availability of Azure Container Apps serverless graphics processing units (GPUs) with support for NVIDIA NIM. Serverless GPUs let enterprises, startups, and software development companies run AI workloads on demand with automatic scaling, optimized cold starts, and per-second billing that scales down to zero when not in use, reducing operational overhead. With support for NVIDIA NIM, development teams can easily build and deploy generative AI applications alongside existing applications within the same networking, security, and isolation boundary.
Expanding Azure AI Infrastructure with NVIDIA
The evolution of reasoning models and agentic AI systems is transforming the artificial intelligence landscape, and robust, purpose-built infrastructure is key to their success. Today, Microsoft is excited to announce the general availability of the Azure ND GB200 V6 virtual machine (VM) series, accelerated by NVIDIA GB200 NVL72 and NVIDIA Quantum InfiniBand networking. This addition to the Azure AI infrastructure portfolio, alongside existing virtual machines that use NVIDIA H200 and NVIDIA H100 GPUs, highlights Microsoft’s commitment to optimizing infrastructure for the next wave of complex AI tasks like planning, reasoning, and adapting in real time.
As we push the boundaries of AI, our partnership with Azure and the introduction of the NVIDIA Blackwell platform represent a significant leap forward. The NVIDIA GB200 NVL72, with its unparalleled performance and connectivity, tackles the most complex AI workloads, enabling businesses to innovate faster and more securely. By integrating this technology with Azure’s secure infrastructure, we are unlocking the potential of reasoning AI.
— Ian Buck, Vice President of Hyperscale and HPC, NVIDIA
The combination of high-performance NVIDIA GPUs, low-latency NVIDIA InfiniBand networking, and Azure’s scalable architectures is essential to handling the massive data throughput and intensive processing these workloads demand. Furthermore, comprehensive integration of Azure’s security, governance, and monitoring tools supports powerful, trustworthy AI applications that comply with regulatory standards.
Built with Microsoft’s custom infrastructure system and the NVIDIA Blackwell platform, each blade features two NVIDIA GB200 Grace™ Blackwell Superchips and NVIDIA NVLink™ Switch scale-up networking, which supports up to 72 NVIDIA Blackwell GPUs in a single NVLink domain. At the datacenter level, it also incorporates the latest NVIDIA Quantum InfiniBand, allowing scale-out to tens of thousands of Blackwell GPUs on Azure and providing twice the AI supercomputing performance of previous GPU generations, based on GEMM benchmark analysis.
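For readers unfamiliar with how a GEMM-based performance figure is derived: a general matrix multiply of an (m × k) matrix by a (k × n) matrix performs 2·m·n·k floating-point operations (one multiply and one add per output term), so a measured runtime converts directly into achieved TFLOPS. A minimal sketch with hypothetical values, not Azure’s actual benchmark data:

```python
def gemm_flops(m: int, n: int, k: int) -> int:
    """Floating-point operations in one (m x k) @ (k x n) matrix multiply:
    one multiply and one add per output term."""
    return 2 * m * n * k

def achieved_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Convert a measured GEMM runtime into achieved TFLOPS."""
    return gemm_flops(m, n, k) / seconds / 1e12

# Hypothetical numbers, purely for illustration:
# an 8192 x 8192 x 8192 GEMM completing in 1.1 milliseconds.
m = n = k = 8192
print(f"Achieved: {achieved_tflops(m, n, k, 1.1e-3):.1f} TFLOPS")
```

Comparing this achieved-TFLOPS figure across GPU generations on the same GEMM shapes is how a “two times the performance” claim is typically quantified.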
As Microsoft’s work with NVIDIA continues to grow and shape the future of AI, the company also looks forward to bringing the performance of NVIDIA Blackwell Ultra GPUs and the NVIDIA RTX PRO 6000 Blackwell Server Edition to Azure. Microsoft is set to launch the NVIDIA Blackwell Ultra GPU-based VMs later in 2025. These VMs promise to deliver exceptional performance and efficiency for the next wave of agentic and generative AI workloads.
Azure AI’s infrastructure, advanced by NVIDIA accelerated computing, consistently delivers high performance at scale for AI workloads, as evidenced by leading industry benchmarks such as Top500 supercomputing and MLPerf results.1,2 Recently, Azure virtual machines using NVIDIA H200 GPUs achieved exceptional performance in the MLPerf Training v4.1 benchmarks across various AI tasks. Azure demonstrated leading cloud performance by scaling a cluster to 512 H200 GPUs, achieving a 28% speedup over H100 GPUs in the latest MLPerf training runs by MLCommons.3 This highlights Azure’s ability to efficiently scale large GPU clusters. Microsoft is excited that customers are using this performance on Azure to train advanced models and gain efficiency in generative AI inferencing.
Microsoft hosted many talks and panels showcasing Azure’s end-to-end AI platform, with models and pre-trained AI services, comprehensive GenAIOps tools, and scalable, high-performance infrastructure.
Featured sessions
S74600: Unlock AI Potential with Azure Infrastructure: Innovations, Best Practices
Tuesday, March 18, on demand
Kanchan Mehrotra, Group Technical Manager, Azure HPC & AI
Kyle Esecson, Head of Commercial Strategy, Wayve
S74606: Build and run next generation AI on Azure’s proven AI Infrastructure platform
Wednesday, March 19, 10:00 AM–10:40 AM
Nidhi Chappell, Vice President and Head of AI Infrastructure for Azure; Omar Khan, VP, Azure Infrastructure Marketing
S74611: Accelerate AI Innovation with NVIDIA and Azure AI Foundry
Wednesday, March 19, 2:00 PM–2:40 PM
Mike Hulme, General Manager, Azure Digital and App Innovation
S74603: AI Inferencing using NIM with Serverless GPUs
Thursday, March 20, 9:00 AM–9:40 AM
Devanshi Joshi, Sr. Product Marketing Manager, Azure Digital Applications
Harry Li, Software Engineer, Microsoft
Cary Chai, Product Manager, Azure Digital Applications
Quantum day
S74495: Quantum Computing: Where We Are and Where We’re Headed
Thursday, March 20, 10:00 AM–12:00 PM
Jensen Huang, Founder and CEO, NVIDIA
Krysta Svore, Technical Fellow, Microsoft and more…
Talks and panels
S72436: Building a 3D image-based search system for medical images
Monday, March 17, 1:00 PM–1:40 PM
S72905: GPU DiskANN: Accelerating Microsoft Vector Search with NVIDIA cuVS
Monday, March 17, 1:00 PM–1:40 PM
S71676: Accelerating AI Pipelines: How NVIDIA Tools Boost Bing Visual Search Efficiency
Monday, March 17, 4:00 PM–4:40 PM
S73232: Physical AI for the Next Frontier of Industrial Digitalization
Tuesday, March 18, 2:00 PM–3:00 PM
S71145: Wired for AI: Lessons from Networking 100K+ GPU AI Data Centers and Clouds
Tuesday, March 18, 4:00 PM–5:00 PM
S72355: Harnessing AI Agents for Enterprise Success: Insights from AI Experts
Wednesday, March 19, 9:00 AM–10:00 AM
S71521: Build Secure and Scalable GenAI Applications with Databases and NVIDIA AI
Thursday, March 20, 4:00 PM–4:40 PM
S72435: Explore AI-Assisted Developer Tools for Accelerated Computing App Development
Friday, March 21, 9:00 AM–10:00 AM
S72435: Connecting Industrial Data to Digital Twins with Microsoft Power BI, NVIDIA Omniverse, and OpenUSD
Thursday, March 20, 1:00 PM–2:45 PM
Demo stations
The demo stations showcased the latest AI solutions from Microsoft Azure and partners.
Azure AI Foundry
Azure AI infrastructure
NVIDIA on Azure
Innovation with Azure startups
Learn more about Azure AI
Azure AI Platform — Cloud AI Platform
On Thursday evening, I also participated in the world’s shortest hackathon, hosted by the brev.dev team.