Accelerate Generative AI Inference on Amazon SageMaker AI with G7e Instances

Amazon AWS AI Blog / 4/21/2026

Key Points

  • Amazon SageMaker AI now offers G7e instances with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs for generative AI inference.
  • Users can provision G7e nodes with 1, 2, 4, or 8 GPUs, each providing 96 GB of GDDR7 memory.
  • The launch enables organizations to run open-source foundation models on single-node setups such as the G7e.2xlarge instance.
  • Example supported models include GPT-OSS-120B, Nemotron-3-Super-120B-A12B (NVFP4 variant), and Qwen3.5-35B-A3B; the instances are positioned as a cost-effective, high-performing hosting option.

Today, we are thrilled to announce the availability of G7e instances powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on Amazon SageMaker AI. You can provision nodes with 1, 2, 4, or 8 RTX PRO 6000 GPUs, each providing 96 GB of GDDR7 memory. With this launch, you can use a single-node G7e.2xlarge instance to host powerful open source foundation models (FMs) such as GPT-OSS-120B, Nemotron-3-Super-120B-A12B (NVFP4 variant), and Qwen3.5-35B-A3B, giving organizations a cost-effective and high-performing option.
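
To make the memory figures above concrete, here is a rough back-of-the-envelope sketch in Python. The 96 GB per GPU and the 1, 2, 4, and 8 GPU node sizes come from the announcement; everything else is an assumption for illustration. The helper names are hypothetical, the 120-billion parameter count is a nominal figure read off the model names, the 4 bits per parameter approximates a 4-bit format such as NVFP4 while ignoring scale-factor overhead, and the 20% headroom reserved for KV cache and activations is an arbitrary placeholder rather than a measured value.

```python
GPU_MEMORY_GB = 96          # per-GPU GDDR7 memory, from the announcement
GPU_COUNTS = (1, 2, 4, 8)   # available GPUs per node, from the announcement


def node_memory_gb(gpu_count: int) -> int:
    """Aggregate GPU memory for a node with the given number of GPUs."""
    if gpu_count not in GPU_COUNTS:
        raise ValueError(f"unsupported GPU count: {gpu_count}")
    return gpu_count * GPU_MEMORY_GB


def weight_footprint_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return params_billions * bits_per_param / 8


def fits_on_one_gpu(params_billions: float, bits_per_param: float,
                    headroom_fraction: float = 0.2) -> bool:
    """Check weights against one GPU, reserving an assumed headroom fraction."""
    budget_gb = GPU_MEMORY_GB * (1 - headroom_fraction)
    return weight_footprint_gb(params_billions, bits_per_param) <= budget_gb


if __name__ == "__main__":
    for count in GPU_COUNTS:
        print(f"{count} GPU(s): {node_memory_gb(count)} GB total GPU memory")
    # Assumed nominal 120B parameters at 4-bit versus 16-bit weights.
    for bits in (4, 16):
        size = weight_footprint_gb(120, bits)
        print(f"120B params at {bits}-bit: ~{size:.0f} GB, "
              f"fits on one GPU: {fits_on_one_gpu(120, bits)}")
```

Under these assumptions, a nominal 120B-parameter model needs about 60 GB for weights at 4 bits per parameter, which is within a single 96 GB GPU, whereas the same model at 16 bits would need about 240 GB and therefore a multi-GPU node. Real deployments also depend on context length, batch size, and the serving runtime, so treat this only as a first-pass sizing check.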