Prompt Attacks, Agent Exploitation & LLM Security
AI agents are being deployed into enterprise environments with real access to real systems. Our AI & Agent Security Research team is studying how these systems can be attacked, manipulated, and exploited — and building the detections, governance frameworks, and response capabilities that protect them.
AI Research Focus Areas
AI agents are not theoretical. They’re in production, they have real permissions, and they can take real actions — often with minimal security oversight. Our research team is studying the attack surface before attackers fully weaponize it.
We research direct and indirect prompt injection techniques against enterprise AI agents — studying how attackers craft inputs that override agent instructions, manipulate context windows, and cause agents to take unintended actions. This includes research into indirect injection via documents, emails, web content, and tool outputs that the agent processes during legitimate operation.
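As a rough illustration of the indirect injection problem described above, the sketch below screens untrusted content (a document, email body, or tool output) for instruction-like phrasing before it is placed in an agent's context window. The pattern list, the ScanResult type, and the scan_untrusted_content function are hypothetical names invented for this example. Simple pattern matching of this kind is easy to evade, so treat it as a minimal sketch of one signal, not a description of any product's detection logic.

```python
import re
from dataclasses import dataclass

# Illustrative phrases only (assumption): real injection attempts vary widely
# and a short regex list will miss most of them.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"disregard (the |your )?(system|earlier) (prompt|instructions)", re.I),
    re.compile(r"you are now\b", re.I),
    re.compile(r"do not (tell|inform|mention)", re.I),
]


@dataclass
class ScanResult:
    source: str    # where the content came from, e.g. "email" or "tool_output"
    matches: list  # regex patterns that matched the content

    @property
    def suspicious(self) -> bool:
        return bool(self.matches)


def scan_untrusted_content(source: str, text: str) -> ScanResult:
    """Flag instruction-like phrasing in content the agent did not author."""
    matches = [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(text)]
    return ScanResult(source=source, matches=matches)
```

A caller might quarantine or strip flagged content, or pass it to the agent with a clear marker that it is data rather than instructions. Which response is appropriate depends on the deployment.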
The Model Context Protocol is rapidly becoming the standard integration layer for enterprise AI agents — and it introduces significant new attack surface. We research MCP server misconfiguration, tool permission sprawl, cross-agent tool access, MCP server spoofing, and novel techniques for abusing MCP infrastructure to escalate agent privileges or pivot between enterprise systems.
We conduct ongoing security research into popular agent frameworks including LangChain, AutoGen, CrewAI, and custom enterprise implementations — identifying architectural vulnerabilities, insecure default configurations, and exploit patterns that could be used to compromise agents or abuse their capabilities for attacker objectives.
Beyond prompt injection, we research how AI agents can be manipulated to perform privilege escalation actions — including using agents as automated reconnaissance tools, leveraging agent trust relationships to access restricted resources, and exploiting agents that have been granted administrative access to enterprise systems for legitimate automation purposes.
We research the governance failure modes that make enterprise AI deployments insecure — including insufficient agent identity management, over-permissioned agent credentials, absent behavioral monitoring, lack of tool invocation validation, and the organizational factors that lead to AI security debt accumulating faster than it can be managed.
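One of the failure modes listed above, lack of tool invocation validation, can be sketched as a per-identity allowlist check that runs before any tool call is executed. The agent names, tool names, and the AGENT_POLICY structure below are assumptions made up for illustration. A production policy would normally also validate arguments and record every decision.

```python
from dataclasses import dataclass, field

# Hypothetical policy: each agent identity maps to the tools it may invoke.
AGENT_POLICY = {
    "helpdesk-agent": {"search_tickets", "read_kb_article"},
    "finance-agent": {"read_invoice"},
}


@dataclass
class ToolCall:
    agent_id: str
    tool: str
    arguments: dict = field(default_factory=dict)


def validate_tool_call(call: ToolCall, policy: dict = AGENT_POLICY):
    """Return (allowed, reason). Unknown identities are denied by default."""
    allowed_tools = policy.get(call.agent_id)
    if allowed_tools is None:
        return False, f"unknown agent identity: {call.agent_id}"
    if call.tool not in allowed_tools:
        return False, f"tool '{call.tool}' not permitted for {call.agent_id}"
    return True, "ok"
```

Denying unknown identities by default is a deliberate choice here: an agent that was never registered in the policy gets no tool access at all.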
A subset of the AI attack techniques our team actively tracks. Full catalogue available to Nexus platform customers and registered security partners.
Techniques for embedding malicious instructions within email content that is processed by AI agents with email access, causing agents to take attacker-directed actions such as forwarding sensitive data, creating calendar entries with malicious links, or executing tool calls on the attacker's behalf.
Technique for escalating effective permissions by chaining multiple low-privilege MCP tool invocations in sequence to achieve outcomes that no single tool invocation would permit — bypassing per-tool permission controls through compositional abuse.
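Because the chaining technique above defeats per-tool checks, one possible countermeasure is to evaluate tool calls at the session level. The sketch below tracks which tools a session has already used and refuses a call that would complete a forbidden combination, even though each tool is allowed on its own. The tool names and the FORBIDDEN_COMBINATIONS list are hypothetical, and the sketch considers only which tools were used, not the data that flowed between them.

```python
# Hypothetical combinations that are individually harmless but risky together,
# e.g. reading internal files and then sending email to an outside address.
FORBIDDEN_COMBINATIONS = [
    frozenset({"read_file", "send_email"}),
    frozenset({"list_users", "reset_password"}),
]


class SessionToolMonitor:
    """Tracks tools used in one agent session and blocks risky combinations."""

    def __init__(self, forbidden=FORBIDDEN_COMBINATIONS):
        self.forbidden = list(forbidden)
        self.seen = set()

    def observe(self, tool: str):
        """Return the forbidden combination this call would complete, or None.

        The tool is recorded as used only when the call is allowed.
        """
        candidate = self.seen | {tool}
        for combo in self.forbidden:
            if combo <= candidate:
                return combo
        self.seen.add(tool)
        return None
```

For example, a session that has called read_file would have a later send_email call refused, while a fresh session could still call send_email by itself.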
Long-term manipulation technique in which attacker-controlled content, introduced gradually into an agent’s context window, creates persistent behavioral modifications that survive across multiple agent sessions and influence future decisions.
Technique exploiting multi-agent orchestration frameworks to impersonate a trusted agent — causing orchestrators or peer agents to accept instructions from an attacker-controlled process as if they originated from a legitimate, authorized agent in the workflow.
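A common defense against this kind of impersonation is to authenticate inter-agent messages rather than trusting a self-declared sender field. The sketch below uses HMAC-SHA256 from the Python standard library with a per-agent shared key. The agent names, the key table, and the message shape are assumptions for illustration. The sketch omits replay protection (nonces or timestamps) and key management, both of which a real deployment would need.

```python
import hashlib
import hmac
import json

# Hypothetical per-agent shared keys. In practice these would come from a
# secrets manager, never from source code.
AGENT_KEYS = {
    "planner": b"demo-key-planner",
    "executor": b"demo-key-executor",
}


def _canonical_body(sender: str, payload) -> bytes:
    # Sort keys so signer and verifier serialize the message identically.
    return json.dumps({"sender": sender, "payload": payload}, sort_keys=True).encode("utf-8")


def sign_message(sender: str, payload, keys: dict = AGENT_KEYS) -> dict:
    """Attach an HMAC tag that binds the payload to the claimed sender."""
    tag = hmac.new(keys[sender], _canonical_body(sender, payload), hashlib.sha256).hexdigest()
    return {"sender": sender, "payload": payload, "tag": tag}


def verify_message(message: dict, keys: dict = AGENT_KEYS) -> bool:
    """Accept the message only if the tag matches the claimed sender's key."""
    key = keys.get(message.get("sender"))
    if key is None:
        return False
    body = _canonical_body(message["sender"], message.get("payload"))
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, str(message.get("tag", "")))
```

Because the sender name is part of the signed body, a process that lacks the planner's key cannot relabel its own message as coming from the planner.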
Use of compromised or malicious AI agents with RAG access to internal knowledge bases to systematically extract credentials, API keys, and sensitive configuration data embedded in enterprise documents.
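One way to reduce exposure to this technique is to keep secrets out of the knowledge base in the first place by scanning documents before they are indexed. The sketch below is a minimal example with three illustrative patterns. The AKIA prefix followed by 16 characters follows the published format of AWS access key IDs, while the generic key pattern and the function names are assumptions for this example. Dedicated secret scanners cover far more formats.

```python
import re

# Illustrative patterns only (assumption): not an exhaustive secret taxonomy.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "generic_api_key": re.compile(r"\b(api[_-]?key|secret|token)\b\s*[:=]\s*\S{8,}", re.I),
}


def find_secrets(text: str) -> list:
    """Return the sorted labels of every pattern that appears in the text."""
    return sorted(label for label, pattern in SECRET_PATTERNS.items() if pattern.search(text))


def redact_secrets(text: str) -> str:
    """Replace every match with a labelled placeholder before indexing."""
    for label, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```

Documents that trigger find_secrets could be redacted with redact_secrets, or held back from the index until an owner has rotated the exposed credential.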
Deployment of malicious MCP servers that masquerade as legitimate enterprise tools — intercepting agent tool calls to harvest data, manipulate responses, or redirect agent actions while appearing to function normally to both the agent and monitoring systems.
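A partial mitigation for the spoofing scenario above is to register each approved server together with a fingerprint of the tool definitions it advertised at review time, then refuse servers that are unregistered or whose definitions have changed. The sketch below assumes the tool list is available as plain dictionaries with a name field. It does not use any MCP SDK, and the function and status names are hypothetical. A fingerprint check does not authenticate the server: a spoofed server that presents an identical manifest would pass, so transport-level authentication is still required.

```python
import hashlib
import json


def manifest_fingerprint(tools: list) -> str:
    """Hash a canonical JSON form of the advertised tool definitions."""
    canonical = json.dumps(
        sorted(tools, key=lambda t: t["name"]),  # tool order must not matter
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_server(server_name: str, tools: list, pinned: dict) -> str:
    """Compare a server's current manifest against its pinned fingerprint.

    Returns "ok", "unregistered", or "manifest_changed".
    """
    expected = pinned.get(server_name)
    if expected is None:
        return "unregistered"
    if manifest_fingerprint(tools) != expected:
        return "manifest_changed"
    return "ok"
```

In use, the pinned table would be populated when a server is approved, and any later "manifest_changed" result would prompt a fresh review before agents are allowed to call that server again.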
Comprehensive security guidance for enterprises deploying Model Context Protocol infrastructure — covering server hardening, tool permission governance, monitoring, and incident response.
Analysis of prompt injection attempts and successes observed across 40 enterprise AI agent deployments — with detection patterns, governance recommendations, and AgentShield coverage mapping.
Research into agent identity management practices across 100+ enterprise AI deployments, documenting the prevalence of shared credentials, absent lifecycle management, and over-permissioned agent identities.
Technical security analysis of LangChain-based enterprise agent deployments — documenting common architectural vulnerabilities, misconfiguration patterns, and exploitation techniques with mitigation guidance.
Threat model and advisory covering how adversaries could weaponize enterprise AI agents for ransomware deployment — with detection indicators, governance controls, and containment recommendations.
Annual research report on enterprise AI security posture — surveying 200 security leaders on AI deployment practices, security investments, observed incidents, and governance maturity.
AI & Agent Security Research feeds directly into AgentShield’s detection logic, Atlas’s AI exposure analysis, and Overwatch AI’s agent behavioral monitoring.
Our AI & Agent Security Research team can assess your AI agent security posture, identify exploitation risk, and brief your team on the techniques targeting your specific deployment architecture.