Cost effective deployment of vision-language models for pet behavior detection on AWS Inferentia2

Amazon AWS AI Blog / 5/7/2026


Key Points

  • Tomofun, a pet-tech startup behind the Furbo Pet Camera, is improving remote pet interactions by deploying vision-language model capabilities for pet behavior detection.
  • The company reduced inference costs and preserved accuracy by running workloads on EC2 Inf2 instances backed by AWS Inferentia2, Amazon’s purpose-built AI chips.
  • The article walks through the deployment step by step with a focus on cost, emphasizing practical implementation details over theoretical discussion.
  • By choosing specialized hardware for inference, Tomofun demonstrates a path for keeping real-time or near-real-time pet analytics affordable for end users.

Tomofun, the Taiwan-headquartered pet-tech startup behind the Furbo Pet Camera, is redefining how pet owners interact with their pets remotely. To reduce costs and maintain accuracy, Tomofun turned to EC2 Inf2 instances powered by AWS Inferentia2, Amazon's purpose-built AI chips. In this post, we walk through the following sections in detail.