Getting Started with Adversarial Attacks on VLMs/VLAs for Humanoid Robots (Master’s Thesis Advice Needed)

Reddit r/LocalLLaMA / 4/19/2026


Key Points

  • The post asks for guidance on how to begin researching AI security, specifically adversarial attacks, targeting VLMs/VLAs used in humanoid robots.
  • The author has some experience with jailbreaking LLMs but is new to adversarial techniques for vision-language (VLM) and vision-language-action (VLA) systems.
  • They have access to an NVIDIA Jetson Thor and are considering starting with an “unaligned” model for red-teaming before moving toward building defenses.
  • The author is also contemplating using NVIDIA Cosmos Reason 2 as a potential starting point and requests recommendations for papers, tools, and methodology.

Hey everyone,

I’m currently working on my master’s thesis on AI security for humanoid robots, with a focus on adversarial attacks on VLMs/VLAs. I’ve had some initial exposure to jailbreaking LLMs, but when it comes to VLMs and VLAs I’m pretty new, and honestly a bit unsure how to get started properly.

Right now I have access to an NVIDIA Jetson Thor, and I was thinking about starting with an unaligned model for red teaming purposes, then later moving on to building defenses. I’m also considering using NVIDIA Cosmos Reason 2 as a starting point.
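For context on the kind of attack involved: the classic entry point for adversarial attacks on vision models is FGSM (Fast Gradient Sign Method), which perturbs each input pixel by a small epsilon in the direction that increases the model's loss. Below is a minimal sketch against a toy classifier standing in for a VLM's vision tower; the model, input, and epsilon are all illustrative assumptions, not anything specific to Cosmos Reason 2 or Jetson Thor.

```python
import torch
import torch.nn as nn

def fgsm_perturb(model, image, label, epsilon):
    """One-step FGSM: nudge each pixel by epsilon in the sign
    direction of the loss gradient, then clamp back to [0, 1]."""
    image = image.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(image), label)
    loss.backward()
    adv = image + epsilon * image.grad.sign()
    return adv.clamp(0.0, 1.0).detach()

# Toy stand-in for a vision encoder + classifier head; a real
# experiment would target the vision tower of a VLM/VLA instead.
torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
image = torch.rand(1, 3, 32, 32)   # fake 32x32 RGB input
label = torch.tensor([3])

# 8/255 is a common L-infinity budget in the adversarial literature.
adv_image = fgsm_perturb(model, image, label, epsilon=8 / 255)
```

The perturbation stays within the epsilon ball in L-infinity norm, which is what makes it hard to spot visually; iterative variants (PGD) repeat this step with projection and are the usual stronger baseline.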

At this stage, I feel like I have a few rough ideas but not a clear direction yet. If anyone has experience in this area or can suggest good starting points, papers, tools, or general methodology, I’d really appreciate it.

Thanks in advance!

submitted by /u/spacegeekOps