Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications: Introduces Morris-II, a self-replicating “AI worm” that exploits RAG/GenAI pipelines by embedding adversarial, self-replicating prompts which cause GenAI apps to both execute malicious payloads and propagate the prompt to other agents. The paper demonstrates feasibility in controlled environments and proposes detection/mitigation (the “Virtual Donkey”) to detect propagation. (Stav Cohen, Ron Bitton, Ben Nassi and collaborators).
Ransomware 3.0: Self-Composing and LLM-Orchestrated: A proof-of-concept study showing how LLMs can autonomously orchestrate full ransomware campaigns: reconnaissance, synthesis of payloads (code), environment-specific adaptation, exfiltration/encryption, and personalized extortion. The work demonstrates the economic feasibility of LLM-driven ransomware and argues for new behavioral/telemetry defenses. (Md Raz, Meet Udeshi, P. V. Sai Charan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri)
Multimodal Prompt Injection Attacks: Risks and Defenses: Systematic study of prompt-injection threats when inputs are multimodal (text + images + other modalities). Identifies new attack vectors that bypass text-only defenses (for example, embedding malicious instructions in images or mixed content) and evaluates mitigation strategies — useful reading for defenders building multimodal LLM apps
Prompt Injection 2.0: Hybrid AI Threats: Extends prompt-injection analysis to hybrid attacks that combine classical web/vulnerability techniques (XSS, CSRF, etc.) with prompt-injection to escape sandboxing and exfiltrate data. The paper analyzes attack chains, demonstrates proof-of-concepts, and evaluates defensive measures that bridge web security and LLM guardrails.
Revealing a Hidden Class of Task-in-Prompt Adversarial Attacks (PDF): Presents and characterizes Task-in-Prompt (TIP) attacks — adversarial inputs that appear as innocuous tasks but cause LLMs to perform unintended or harmful actions. The paper provides taxonomy, attack generation techniques, responsible disclosure details, and recommended mitigation guidance for model builders and integrators. This paper was presented at ACL and has sparked active discussion in the NLP/AI safety community. (S. Berezin et al.)
A Survey on Model Extraction / Model-Stealing Attacks and Defenses for Large Language Models: A comprehensive survey and taxonomy of model extraction attacks against deployed LLMs (functionality extraction, training-data extraction, prompt-targeted attacks), plus an overview of defensive techniques (rate-limiting, watermarking, API-level defenses). This survey is gaining traction as practitioners scramble to protect proprietary models and user privacy. (K. Zhao et al.)