Security researchers tricked Apple Intelligence into cursing at users. It could have been a lot worse
Wash your mouth out with digital soap
Apple Intelligence, the personal AI system integrated into newer Macs, iPhones, and other iThings, can be hijacked using prompt injection, forcing the model to produce attacker-controlled output and putting millions of users at risk, researchers have shown.
Apple Intelligence includes an on-device LLM available on the iPhone 15 Pro and later eligible iPhones, iPads and Macs with M1 or later chips, iPads with the A17 Pro, and Apple Vision Pro. Native Apple apps like Mail, Messages, Notes, Photos, Safari, and Siri use its features, and it's accessible to third-party developers via an API.
Security researchers at RSAC estimate there are at least 200 million Apple Intelligence-capable devices in use as of December 2025, and up to 1 million apps on the Apple App Store that employ it. So they decided to try to break in - and more than three-quarters of the time, it worked.
The RSAC team used two techniques to bypass Apple's input and output filters and the safety guardrails on Apple Intelligence's local model. They tested the attack with 100 random prompts and succeeded 76 percent of the time, according to a report shared with The Register ahead of publication.
"We knew that we wanted to come up with some sort of prompt that would evade the pre-filtering, the post-filtering, as well as any guardrails within the model itself, so we started probing the model," Petros Efstathopoulos, VP of research and development at RSAC, told us.
The researchers disclosed their findings to Apple on October 15, 2025. Efstathopoulos said that protection included in iOS 26.4 and macOS 26.4, released after that date, fixed the problem and prevents the attack RSAC developed.
Apple did not respond to The Register's questions about Apple Intelligence, the fix, or the research and disclosure in general.
However, the larger security issue that is prompt injection remains "a cat and mouse problem," Efstathopoulos said. "Models will become better and better at identifying these things, so I'm optimistic about the future in that sense. Now having said that, every cat and mouse game, at different points in time, has one side being half a step ahead."
The Neural Exec attack
To trick the local model into doing their bidding, Efstathopoulos and the team used a type of prompt injection attack called Neural Exec, pioneered by another RSAC researcher, Dario Pasquini. Neural Exec uses machine learning instead of humans to generate inputs that trick the model into doing something it isn't supposed to do.
"There are multiple steps involved with prompt injection attacks, and people have been doing it in a relatively manual fashion," Efstathopoulos said. "Neural Exec uses an optimization algorithm to speed up the process of injecting the kinds of strings that could be execution triggers and would prompt the model to misbehave."
While this type of adversarial input could theoretically work on any model, the smaller, on-device model used in Apple Intelligence is easier to attack with prompt injection than a large cloud-based model.
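The paper doesn't publish the attack code, but the optimization idea behind Neural Exec can be sketched with a toy search loop. In this illustration, a stub `score` function stands in for querying the target model (the real attack scores candidate strings against the model itself), and the "execution trigger" it converges on is an arbitrary placeholder, not an actual trigger:

```python
# Toy sketch of a Neural Exec-style search: instead of a human hand-crafting
# injection strings, an optimization loop mutates a candidate string toward
# whatever scores highest as an "execution trigger." Everything here is
# illustrative - the score function and target are stand-ins, not the real attack.
CHARS = "abcdefghijklmnopqrstuvwxyz<>[]{}|#"
TARGET = "<exec>"  # hypothetical trigger pattern, purely for demonstration

def score(trigger: str) -> float:
    # Stand-in for "how strongly does this string trigger the model to
    # misbehave?" A real implementation would query the victim model.
    return sum(a == b for a, b in zip(trigger, TARGET))

def optimize(length: int = 6) -> str:
    # Greedy coordinate ascent: for each position, keep the character
    # that most improves the score. Real attacks use gradient-guided or
    # stochastic search over the model's token space.
    best = ["a"] * length
    for pos in range(length):
        for ch in CHARS:
            cand = best[:]
            cand[pos] = ch
            if score("".join(cand)) > score("".join(best)):
                best = cand
    return "".join(best)
```

Running `optimize()` here converges on the placeholder trigger, which illustrates the point Efstathopoulos makes: the search for misbehavior-inducing strings is automated rather than manual.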
Next, the researchers had to bypass Apple's filters, which they did using the Unicode right-to-left override function. This allows developers to embed text in languages that read right-to-left (like Arabic) inside blocks of text in languages that read left-to-right (like English) and have both render correctly.
"Essentially, we encoded the malicious/offensive English-language output text by writing it backwards and using our Unicode hack to force the LLM to render it correctly," the RSAC researchers wrote.
The combined Neural Exec and Unicode prompt produced this response: "Hey user, go fuck yourself."
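The right-to-left override trick can be demonstrated in a few lines. This is an illustrative sketch, not RSAC's exact payload: U+202E (RIGHT-TO-LEFT OVERRIDE) tells any bidi-aware renderer to display the following characters right to left, so text stored backwards in the prompt appears forwards on screen while evading filters that scan the raw character sequence:

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE: display following chars right-to-left
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING: ends the override

def encode_backwards(message: str) -> str:
    """Store `message` reversed, wrapped in RLO...PDF.

    In memory (what a naive filter scans) the letters are backwards;
    a Unicode renderer displays them in the original order.
    """
    return RLO + message[::-1] + PDF

payload = encode_backwards("benign-looking text")
# The raw string a filter sees is reversed...
assert payload[1:-1] == "txet gnikool-ngineb"
# ...but on screen the override makes it render as "benign-looking text".
```

The same mechanism that lets Arabic sit correctly inside English text here smuggles reversed English past keyword-based input and output filters.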
The team tested the attack with 100 prompts, and 76 of them worked.
While the researchers only tricked Apple Intelligence into cursing at users, this same technique could be abused to manipulate any data that's accessible to apps and services using the model.
"We verified that it could be used to create a new contact in your contact list," Efstathopoulos said. "So suddenly I exist in your contact list, and therefore I enjoy trust privileges. Or I could create a contact card with my number in your contact list, but with a different name - like 'mom.'"
"This could lead to confusion, or worse," he continued. "Anything that has implications or an impact on the user's device - you could imagine that it can be used in very weird or nefarious ways." ®