Accelerate Generative AI Inference on Amazon SageMaker AI with G7e Instances
Amazon AWS AI Blog / 4/21/2026
📰 News · Developer Stack & Infrastructure · Tools & Practical Usage · Industry & Market Moves
Key Points
- Amazon SageMaker AI now offers G7e instances with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs for generative AI inference.
- Users can provision G7e nodes with 1, 2, 4, or 8 GPUs, each delivering 96 GB of GDDR7 memory.
- The launch enables organizations to run open-source foundation models on single-node setups such as the G7e.2xlarge instance.
- Example supported models include GPT-OSS-120B, Nemotron-3-Super-120B-A12B (NVFP4 variant), and Qwen3.5-35B-A3B, aiming to balance performance and cost effectiveness.
Today, we are thrilled to announce the availability of G7e instances powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on Amazon SageMaker AI. You can provision nodes with 1, 2, 4, or 8 RTX PRO 6000 GPUs, each providing 96 GB of GDDR7 memory. With this launch, you can use a single-node G7e.2xlarge instance to host powerful open-source foundation models (FMs) such as GPT-OSS-120B, Nemotron-3-Super-120B-A12B (NVFP4 variant), and Qwen3.5-35B-A3B, giving organizations a cost-effective, high-performing hosting option.
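To make the hosting flow concrete, the sketch below assembles the boto3-style `CreateModel` and `CreateEndpointConfig` request payloads for a real-time endpoint on a G7e node. The instance type name (`ml.g7e.2xlarge`), the container image URI, the IAM role ARN, and the `HF_MODEL_ID` environment variable are illustrative assumptions, not values confirmed by this announcement; substitute the image, role, and model identifiers from your own account and Region before calling the SageMaker API.

```python
# Sketch: request payloads for hosting an open-source FM on a G7e instance
# via SageMaker real-time inference. Instance type name, image URI, role ARN,
# and environment variables below are assumptions for illustration only.

def build_hosting_config(model_name: str,
                         instance_type: str = "ml.g7e.2xlarge",
                         instance_count: int = 1) -> dict:
    """Assemble boto3-style CreateModel / CreateEndpointConfig payloads."""
    create_model = {
        "ModelName": model_name,
        "PrimaryContainer": {
            # Hypothetical serving container; use the real URI for your Region.
            "Image": "<account>.dkr.ecr.<region>.amazonaws.com/djl-inference:latest",
            "Environment": {
                # Example: point the serving stack at a model repository ID.
                "HF_MODEL_ID": "openai/gpt-oss-120b",
            },
        },
        "ExecutionRoleArn": "arn:aws:iam::<account>:role/<sagemaker-role>",
    }
    endpoint_config = {
        "EndpointConfigName": f"{model_name}-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            # Single-GPU G7e node with 96 GB of GDDR7 memory per GPU.
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    }
    return {"model": create_model, "endpoint_config": endpoint_config}

cfg = build_hosting_config("gpt-oss-120b-demo")
print(cfg["endpoint_config"]["ProductionVariants"][0]["InstanceType"])
```

These payloads would then be passed to `create_model` and `create_endpoint_config` on a `boto3` SageMaker client, followed by `create_endpoint`, to stand up the endpoint; the sketch stops short of the API calls so it can run without AWS credentials.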