Question 1

What is Agent Firewall?

Accepted Answer

A deterministic security layer positioned at the execution boundary between an autonomous agent and its environment. It intercepts every tool call before execution, evaluates it against a policy set, and returns ALLOW, DENY, or REQUIRE_APPROVAL. Unlike content filters, an agent firewall operates on structured tool call arguments — not LLM text output — making it effective even when the model is compromised via prompt injection.

Question 2

What is Agent Runtime Security (ARS)?

Accepted Answer

The discipline of securing autonomous AI agents during their actual execution — after deployment and while actively processing tasks. ARS encompasses tool call interception, policy enforcement, audit logging, loop detection, budget controls, and human-in-the-loop mechanisms. It is distinct from model safety (training-time) and application security (network-layer), addressing the unique threat surface created by agents that execute real-world actions.

Question 3

What is Agent Scope?

Accepted Answer

The explicit set of tools and capabilities an agent is permitted to access for a given task or deployment context. Defined per-agent-role, a scope is the operational implementation of the least-privilege principle. A well-defined scope prevents privilege escalation, limits blast radius on compromise, and creates a documented record of intended agent behavior for compliance purposes.
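
As a rough illustration of a per-role scope, the sketch below models a scope as an explicit allowlist and treats anything unlisted (including an unknown role) as out of scope. The `AgentScope` class, the `in_scope` helper, and the role and tool names are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentScope:
    """Explicit allowlist of tools for one agent role (hypothetical structure)."""
    role: str
    allowed_tools: frozenset

    def permits(self, tool_name: str) -> bool:
        # Anything not listed is out of scope: least privilege by construction.
        return tool_name in self.allowed_tools


# Hypothetical roles and tool names, for illustration only.
SCOPES = {
    "support_agent": AgentScope("support_agent", frozenset({"search_kb", "create_ticket"})),
    "billing_agent": AgentScope("billing_agent", frozenset({"lookup_invoice"})),
}


def in_scope(role: str, tool_name: str) -> bool:
    scope = SCOPES.get(role)
    # An unknown role has no scope, so nothing is permitted.
    return scope is not None and scope.permits(tool_name)
```

Because the mapping is plain data, it can double as the documented record of intended behavior that the definition mentions.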

Question 4

What is Audit Trail?

Accepted Answer

A complete, tamper-evident, chronological record of every tool call an agent attempted during a session, including the call arguments, the policy decision (ALLOW/DENY/REQUIRE_APPROVAL), the policy rule matched, outcome, latency, and timestamp. For high-risk AI systems, EU AI Act Article 12 requires automatic event logging and Article 17 requires record-keeping procedures as part of the quality management system; a comprehensive audit trail supports both and is the primary evidence artifact for incident investigation.

Question 5

What is Budget Cap (Agent)?

Accepted Answer

A stateful control that limits the total resource consumption (LLM API token cost, number of tool calls, external API credits) an agent may incur within a single session or time window. Budget caps are enforced by a stateful firewall that accumulates usage across all calls in the session. When the cap is reached, the firewall terminates the session and logs the event. Essential for preventing runaway cost from infinite loops or adversarial inputs.
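
One possible shape for such a stateful cap is sketched below. It assumes a single session, tracks only dollar cost and call count, and rejects the call that would push usage past either limit. `BudgetCap`, `BudgetExceeded`, and the `events` list standing in for a log are hypothetical names.

```python
class BudgetExceeded(Exception):
    """Raised when a session would exceed its budget (hypothetical name)."""


class BudgetCap:
    def __init__(self, max_cost_usd: float, max_tool_calls: int):
        self.max_cost_usd = max_cost_usd
        self.max_tool_calls = max_tool_calls
        self.spent_usd = 0.0
        self.tool_calls = 0
        self.terminated = False
        self.events = []  # stand-in for a real audit log

    def charge(self, cost_usd: float) -> None:
        """Account for one tool call; terminate the session if it would exceed the cap."""
        if self.terminated:
            raise BudgetExceeded("session already terminated")
        over_cost = self.spent_usd + cost_usd > self.max_cost_usd
        over_calls = self.tool_calls + 1 > self.max_tool_calls
        if over_cost or over_calls:
            self.terminated = True
            self.events.append("budget cap reached; session terminated")
            raise BudgetExceeded("budget cap reached")
        # State accumulates across every call in the session.
        self.spent_usd += cost_usd
        self.tool_calls += 1
```

Checking before the call runs, rather than after, is a design choice in this sketch: the over-budget call never executes.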

Question 6

What is Callback Handler?

Accepted Answer

A hook in an agent execution framework (e.g., LangChain's CallbackHandler) that fires at specific lifecycle events — tool call start, tool call end, LLM start, chain start, etc. SupraWall's interception layer is implemented as a callback handler, allowing it to intercept tool calls without modifying the agent's core logic. Callback handlers are the primary integration point for runtime governance in framework-based agents.

Question 7

What is Circuit Breaker (Agent)?

Accepted Answer

A stateful control that automatically halts an agent's execution when a predefined threshold is exceeded — typically a maximum number of sequential tool calls of the same type, or a maximum total call count per session. Modeled after the circuit breaker pattern in distributed systems, it prevents infinite loops and runaway execution by cutting the tool call pathway after a configurable threshold, returning control to the application layer.
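
The two thresholds named above (consecutive calls to the same tool, and total calls per session) could be tracked roughly as follows. This is a sketch under assumed names (`CircuitBreaker`, `CircuitOpen`, `before_call`) and assumed default limits, not a description of any specific implementation.

```python
class CircuitOpen(Exception):
    """Raised when the breaker has tripped (hypothetical name)."""


class CircuitBreaker:
    def __init__(self, max_consecutive_same_tool: int = 3, max_total_calls: int = 50):
        self.max_consecutive = max_consecutive_same_tool
        self.max_total = max_total_calls
        self.last_tool = None
        self.consecutive = 0
        self.total = 0
        self.open = False

    def before_call(self, tool_name: str) -> None:
        """Call once before each tool invocation; raises once a threshold is exceeded."""
        if self.open:
            raise CircuitOpen("circuit already open")
        self.total += 1
        if tool_name == self.last_tool:
            self.consecutive += 1
        else:
            # A different tool resets the consecutive-run counter.
            self.last_tool = tool_name
            self.consecutive = 1
        if self.consecutive > self.max_consecutive or self.total > self.max_total:
            self.open = True
            raise CircuitOpen(f"threshold exceeded at call {self.total} ({tool_name})")
```

The application layer would catch `CircuitOpen` and decide what to do next, which matches the idea of returning control upward rather than letting the agent retry.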

Question 8

What is Compliance Export?

Accepted Answer

A structured export of audit logs, policy configurations, and session records formatted for use in regulatory conformity assessments. SupraWall's compliance export generates NDJSON and CSV artifacts aligned with EU AI Act Article 12 logging requirements, including agent identifier, action type, timestamp, policy version, and human oversight events. Intended for submission to notified bodies and internal compliance teams.

Question 9

What is Credential Injection?

Accepted Answer

The secure practice of providing API keys and secrets to agents at runtime through a controlled vault or environment injection mechanism — never through the prompt, conversation history, or agent-accessible memory. Credential injection ensures secrets are scoped to the agent's session, rotated per-execution if needed, and never exposed in logs or LLM context windows where they could be extracted.

Question 10

What is Deny-by-Default?

Accepted Answer

A security posture in which all agent tool calls are blocked unless there is an explicit ALLOW policy that matches the call. The inverse of allow-by-default (where everything runs unless explicitly blocked), deny-by-default prevents unauthorized actions even when policy coverage is incomplete. It is the recommended default for any production agent deployment and is a prerequisite for zero trust compliance.

Question 11

What is EU AI Act?

Accepted Answer

Regulation (EU) 2024/1689, the world's first comprehensive legal framework for artificial intelligence, which entered into force in August 2024. It classifies AI systems by risk level (unacceptable, high, limited, minimal) and imposes obligations including risk management systems (Article 9), data governance, technical documentation, transparency, human oversight (Article 14), accuracy and robustness, and logging (Article 12). Autonomous AI agents operating in consequential domains are likely classified as high-risk systems.

Question 12

What is Execution Boundary?

Accepted Answer

The interface between an autonomous agent's decision-making layer (the LLM) and the environment it acts upon (databases, APIs, filesystems, external services). The execution boundary is the correct enforcement point for security controls — intercepting calls here prevents any unauthorized action from reaching the environment, regardless of what the LLM decided. An agent firewall sits at the execution boundary.

Question 13

What is Guardrail?

Accepted Answer

A technical control applied to an AI agent's inputs or outputs to constrain its behavior within defined boundaries. Guardrails encompass both content filters (scanning LLM output for policy violations) and structural controls (blocking tool calls at the execution boundary). The term is used broadly across the industry; in the context of agent runtime security, guardrails specifically refer to controls that prevent agents from taking unauthorized actions rather than just producing inappropriate text.

Question 14

What is High-Risk AI System?

Accepted Answer

A classification under the EU AI Act for AI systems used in domains where failures have significant consequences for health, safety, or fundamental rights. Categories include biometric systems, critical infrastructure management, education and employment decisions, and law enforcement. AI agents autonomously making decisions in these domains are likely high-risk and subject to the full set of obligations in Articles 9–17, including mandatory risk management, human oversight, and logging.

Question 15

What is Human-in-the-Loop (HITL)?

Accepted Answer

An architectural pattern in which a human reviewer must approve an agent's proposed action before it is executed. In agent runtime security, HITL is implemented via REQUIRE_APPROVAL policies that pause the agent's execution at a specific tool call, notify a designated approver, and resume execution only after explicit human authorization. EU AI Act Article 14 requires effective human oversight of high-risk AI systems; HITL approval is one way to provide it for consequential actions.

Question 16

What is Indirect Prompt Injection?

Accepted Answer

An attack in which malicious instructions are embedded in data the agent retrieves from the environment — web pages, documents, database records, API responses — rather than directly in the user prompt. The agent, treating the retrieved content as trusted context, follows the injected instructions. Indirect prompt injection is particularly dangerous for agents with broad tool access and is one of the primary threats that execution-boundary firewalls are designed to mitigate.

Question 17

What is Infinite Loop Detection?

Accepted Answer

A stateful control that identifies and terminates agent execution when the agent appears to be executing the same tool calls repeatedly without making progress toward task completion. Typically implemented by tracking call counts per tool per session and comparing argument hashes to identify repetition. When a loop is detected, the circuit breaker fires and execution is halted, preventing runaway resource consumption and cost accumulation.
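
A minimal sketch of the argument-hash approach: each call is reduced to a fingerprint of the tool name plus canonicalized arguments, and a per-session counter flags fingerprints that recur too often. `call_fingerprint`, `LoopDetector`, and the `max_repeats` default are illustrative assumptions.

```python
import hashlib
import json
from collections import Counter


def call_fingerprint(tool_name: str, arguments: dict) -> str:
    # Sorting keys makes the hash independent of argument order.
    canonical = json.dumps(arguments, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{tool_name}:{canonical}".encode("utf-8")).hexdigest()


class LoopDetector:
    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        self.seen = Counter()  # fingerprint -> times seen in this session

    def is_looping(self, tool_name: str, arguments: dict) -> bool:
        """Record the call and report whether it has repeated more than max_repeats times."""
        fingerprint = call_fingerprint(tool_name, arguments)
        self.seen[fingerprint] += 1
        return self.seen[fingerprint] > self.max_repeats
```

Exact hashing only catches identical calls; an agent that varies its arguments slightly on each attempt slips past this check.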

Question 18

What is Integrity Hash?

Accepted Answer

A cryptographic hash of a policy document, audit log entry, or agent configuration that allows the original content to be verified as unmodified. In agent runtime security, integrity hashes are used to ensure that audit trails have not been tampered with after the fact — a requirement for regulatory admissibility. SupraWall generates SHA-256 hashes for each audit log batch and stores them separately to enable independent verification.
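
To make the idea concrete, the sketch below computes one SHA-256 digest over a batch of log entries and later checks the batch against a digest stored elsewhere. The canonical JSON serialization and the function names `batch_digest` and `verify_batch` are assumptions for illustration; the source does not specify how batches are serialized.

```python
import hashlib
import hmac
import json


def batch_digest(entries: list) -> str:
    """SHA-256 over a batch of audit entries, one canonical JSON line per entry."""
    hasher = hashlib.sha256()
    for entry in entries:
        line = json.dumps(entry, sort_keys=True, separators=(",", ":"))
        hasher.update(line.encode("utf-8"))
        hasher.update(b"\n")  # delimiter so entry boundaries affect the hash
    return hasher.hexdigest()


def verify_batch(entries: list, expected_digest: str) -> bool:
    # Constant-time comparison against the separately stored digest.
    return hmac.compare_digest(batch_digest(entries), expected_digest)
```

Any change to an entry, or to the order of entries, produces a different digest, so a mismatch signals that the batch no longer matches what was originally recorded.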

Question 19

What is Least Privilege (Agent)?

Accepted Answer

The principle that each agent should be granted access only to the minimum set of tools, APIs, and data sources strictly necessary for its designated task. In practice, this means defining per-agent-role scopes with explicit allowlists rather than inheriting broad application-level permissions. Least privilege limits the blast radius of compromised agents — a fully manipulated agent can only take actions within its explicitly permitted scope.

Question 20

What are LLM Guardrails?

Accepted Answer

Controls applied specifically at the language model layer (input filtering, output classification, content policy enforcement) to prevent models from generating harmful or policy-violating text. LLM guardrails are distinct from agent runtime guardrails: they operate on text tokens and cannot intercept or block tool calls.

Question 21

What is LLM-as-Judge?

Accepted Answer

A security pattern in which a secondary language model is used to evaluate the intent or output of a primary agent model. While effective for content safety, this approach is probabilistic and vulnerable to bypass patterns like context window displacement and confidence hijacking. In agentic workflows, relying on a judge for security creates an execution gap that can only be closed by deterministic interception.

Question 22

What is MCP (Model Context Protocol)?

Accepted Answer

An open protocol developed by Anthropic that standardizes the interface between AI models and external tools, data sources, and capabilities. MCP defines a structured message format for tool calls, tool results, and resource access, enabling interoperability between different LLMs and tool implementations. MCP servers expose tool sets to agents; SupraWall can intercept MCP tool calls to enforce policies on any MCP-compliant agent.

Question 23

What is Multi-Agent Swarm?

Accepted Answer

An architecture in which multiple specialized autonomous agents collaborate to complete complex tasks, with agents calling other agents as tools. Swarms introduce compound security risks: a compromised orchestrator agent can manipulate sub-agents, inter-agent trust boundaries can be exploited, and individual agent scopes do not automatically restrict swarm-level behavior. Each agent-to-agent call in a swarm should be treated as a tool call subject to policy enforcement.

Question 24

What is PII Scrubbing?

Accepted Answer

The automated detection and redaction of personally identifiable information (name, email, phone number, national ID, financial account numbers) from agent inputs, outputs, and audit logs before they are stored or transmitted. PII scrubbing is a data minimization control required under GDPR and relevant to EU AI Act Article 10 data governance obligations. In agent systems, PII may appear in retrieved documents, API responses, or user prompts.
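
A deliberately narrow sketch of redaction using regular expressions follows. It covers only email addresses and phone-like digit runs; names, national IDs, and account numbers need other techniques. The patterns and the `scrub` function are illustrative assumptions and will both miss some real PII and occasionally match non-PII.

```python
import re

# Simplified patterns for illustration; production scrubbing needs broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def scrub(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


print(scrub("Contact jane.doe@example.com or +1 (555) 010-9999."))
# prints: Contact [EMAIL] or [PHONE].
```

Running the scrubber before anything is written to the audit log keeps the log itself from becoming a store of personal data.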

Question 25

What is Policy Engine?

Accepted Answer

The component of an agent firewall or runtime governance system responsible for evaluating tool calls against a defined rule set and returning a policy decision (ALLOW, DENY, REQUIRE_APPROVAL). A policy engine processes structured inputs — tool name, arguments, agent identity, session state — and applies deterministic rules in priority order. SupraWall's policy engine evaluates rules in under 5ms, making it suitable for insertion into the synchronous tool call path.
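
The evaluation loop described above might look like the following sketch: rules are checked in priority order, the first match wins, and a call that matches nothing is denied. The `Rule` structure, the convention that a lower number means higher priority, and the example rules are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    REQUIRE_APPROVAL = "REQUIRE_APPROVAL"


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int  # assumption: lower number is evaluated first
    matches: Callable[[str, dict], bool]
    decision: Decision


def evaluate(rules, tool_name: str, arguments: dict):
    """Return (decision, matched rule name); first match in priority order wins."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(tool_name, arguments):
            return rule.decision, rule.name
    # No rule matched: deny by default.
    return Decision.DENY, "default-deny"


# Hypothetical rules for illustration only.
RULES = [
    Rule("block-drop", 10,
         lambda t, a: t == "run_sql" and "drop" in str(a.get("query", "")).lower(),
         Decision.DENY),
    Rule("approve-large-refunds", 20,
         lambda t, a: t == "issue_refund" and a.get("amount", 0) > 100,
         Decision.REQUIRE_APPROVAL),
    Rule("allow-sql", 30, lambda t, a: t == "run_sql", Decision.ALLOW),
]
```

Returning the matched rule name alongside the decision gives the audit trail the "policy rule matched" field without extra bookkeeping.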

Question 26

What is Prompt Injection?

Accepted Answer

An attack in which malicious instructions embedded in user input or retrieved data override an agent's original system prompt, causing the agent to deviate from its intended behavior. Prompt injection exploits the fact that LLMs process all text in their context window as a single undifferentiated sequence — they cannot reliably distinguish between trusted instructions and untrusted data. Defense requires execution-boundary controls that enforce behavior independently of what the LLM was told.

Question 27

What is Rate Limiting (Agent)?

Accepted Answer

A control that caps the frequency with which an agent can execute a specific tool or class of tools within a time window. Unlike budget caps (which track cumulative cost), rate limiting tracks call frequency — for example, a maximum of 10 external API calls per minute. Rate limiting prevents agents from overwhelming downstream services, hitting API quotas, or executing high-frequency attacks against protected resources.
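
A sliding-window limiter is one common way to implement this kind of frequency cap. The sketch below keeps recent call timestamps per tool and takes the current time as a parameter so the behavior is easy to reason about; `SlidingWindowLimiter` and its method names are hypothetical.

```python
from collections import deque


class SlidingWindowLimiter:
    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.calls = {}  # tool name -> deque of recent call timestamps

    def allow(self, tool_name: str, now: float) -> bool:
        """Return True and record the call if the tool is under its limit at time `now`."""
        recent = self.calls.setdefault(tool_name, deque())
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True
```

In real use `now` would typically come from `time.monotonic()`, and the limit from the text (10 external API calls per minute) would be `SlidingWindowLimiter(10, 60)`.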

Question 28

What is REQUIRE_APPROVAL?

Accepted Answer

A policy action that pauses an agent's tool call execution and routes it to a designated human reviewer before proceeding. When a tool call matches a REQUIRE_APPROVAL policy, the agent's execution is suspended, the approver is notified via configured channels (Slack, email, dashboard), and execution resumes only if the human explicitly approves the action. Timeouts cause the call to be denied by default if no response is received within the configured window.
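
The timeout behavior can be sketched with a blocking queue: the paused call waits for a reviewer's answer and falls back to DENY if none arrives in time. Notification (Slack, email, dashboard) is left out, and `await_approval` and the queue-based hand-off are assumptions chosen to keep the example self-contained.

```python
import queue


def await_approval(responses: queue.Queue, timeout_seconds: float) -> str:
    """Block until a reviewer responds; deny by default on timeout."""
    try:
        approved = responses.get(timeout=timeout_seconds)
    except queue.Empty:
        # No response within the configured window: the call is denied.
        return "DENY"
    return "ALLOW" if approved else "DENY"
```

A reviewer-facing component would put `True` or `False` on the queue; the agent's tool call proceeds only on an explicit `True`.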

Question 29

What is Risk Score?

Accepted Answer

A numeric value (typically 0–100 or a categorical LOW/MEDIUM/HIGH/CRITICAL classification) assigned to a tool call or agent session that represents the assessed potential for harm. Risk scores are computed from factors including tool destructiveness, argument sensitivity, call frequency anomalies, and session context. High-risk calls can trigger automatic REQUIRE_APPROVAL escalation or DENY decisions even when no explicit policy matches.
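
As one way such a score could be assembled, the sketch below combines the four factors named above as a weighted sum on a 0 to 100 scale and maps the result to a category. The weights and the category thresholds are invented for illustration; the source does not specify how scores are computed.

```python
# Hypothetical weights; they sum to 100 so the score stays in the 0-100 range.
WEIGHTS = {
    "destructiveness": 40,
    "argument_sensitivity": 30,
    "frequency_anomaly": 20,
    "session_context": 10,
}


def risk_score(factors: dict) -> int:
    """Weighted sum of factor values, each clamped to the range 0.0 to 1.0."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(factors.get(name, 0.0), 0.0), 1.0)
        total += weight * value
    return round(total)


def risk_level(score: int) -> str:
    # Hypothetical thresholds for the categorical view of the same score.
    if score >= 80:
        return "CRITICAL"
    if score >= 60:
        return "HIGH"
    if score >= 30:
        return "MEDIUM"
    return "LOW"
```

A firewall could then escalate to REQUIRE_APPROVAL above one threshold and DENY above another, independent of whether an explicit rule matched.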

Question 30

What is Runtime Governance?

Accepted Answer

The application of policy, monitoring, and control mechanisms to an AI agent's behavior during live execution — as opposed to design-time safety measures like prompt engineering or model fine-tuning. Runtime governance encompasses policy enforcement, audit logging, anomaly detection, human escalation workflows, and compliance reporting. It is the operational layer of the EU AI Act's Article 9 requirement for ongoing risk management throughout an AI system's lifecycle.

Question 31

What is Semantic Loop Detection?

Accepted Answer

An advanced form of loop detection that identifies repetitive agent behavior even when the specific tool arguments vary across calls. Rather than matching exact argument values, semantic loop detection compares the intent or semantic content of tool calls — detecting, for example, that an agent is repeatedly querying different variations of the same database search without making progress. Typically implemented using argument embedding similarity or n-gram analysis of the call sequence.
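
The sketch below approximates the idea with token-set Jaccard similarity over argument text, used here as a cheap stand-in for embedding similarity. A call is flagged when enough earlier calls to the same tool look similar to it. The class name, the threshold, and the `max_similar` default are assumptions.

```python
import re


def tokens(arguments: dict) -> frozenset:
    """Lowercased alphanumeric tokens drawn from all argument values."""
    text = " ".join(str(value) for value in arguments.values()).lower()
    return frozenset(re.findall(r"[a-z0-9]+", text))


def jaccard(a: frozenset, b: frozenset) -> float:
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class SemanticLoopDetector:
    def __init__(self, threshold: float = 0.6, max_similar: int = 3):
        self.threshold = threshold
        self.max_similar = max_similar
        self.history = []  # (tool name, token set) for each earlier call

    def observe(self, tool_name: str, arguments: dict) -> bool:
        """Record the call; return True if it resembles max_similar or more earlier calls."""
        current = tokens(arguments)
        similar = sum(
            1
            for previous_tool, previous in self.history
            if previous_tool == tool_name and jaccard(current, previous) >= self.threshold
        )
        self.history.append((tool_name, current))
        return similar >= self.max_similar
```

Swapping `jaccard` for cosine similarity over real embeddings would catch paraphrases that share few literal tokens, at the cost of an embedding call on the tool call path.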

Question 32

What is Session Isolation?

Accepted Answer

The enforcement of strict boundaries between different agent sessions, ensuring that state, credentials, context, and audit records from one session cannot influence or be accessed by another. Session isolation prevents cross-session data leakage, ensures that budget caps and rate limits are accurately scoped, and guarantees that audit trails reflect the behavior of a single coherent execution context rather than a polluted shared state.

Question 33

What is Tool Call?

Accepted Answer

The fundamental unit of action in an autonomous AI agent system — a structured request from the LLM to invoke a specific function or external capability, containing the tool name and a JSON object of arguments. Tool calls are how agents interact with the world: querying databases, calling APIs, writing files, sending messages. Every tool call is a potential security event that should be evaluated by a policy engine before execution.

Question 34

What is Tool Interception?

Accepted Answer

The technical mechanism by which an agent firewall captures tool calls before they reach the underlying tool implementation. Interception is typically implemented by replacing tool functions with proxied versions that forward calls to the policy engine first. Tool interception is transparent to the agent framework — the agent's code does not need to be modified, and tool results are returned normally for allowed calls.
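
Proxying a tool function can be sketched as a wrapper that consults a policy callable first and only then forwards to the real implementation. `intercept`, `ToolCallDenied`, `read_file`, and `sandbox_policy` are hypothetical names, and the path-prefix check is a toy policy that ignores issues such as `..` traversal.

```python
import functools


class ToolCallDenied(Exception):
    """Raised when the policy does not allow a tool call (hypothetical name)."""


def intercept(tool_fn, policy):
    """Return a proxy for tool_fn that asks `policy` before executing."""
    @functools.wraps(tool_fn)
    def proxied(**arguments):
        decision = policy(tool_fn.__name__, arguments)
        if decision != "ALLOW":
            raise ToolCallDenied(f"{tool_fn.__name__}: {decision}")
        # Allowed calls pass through and return the tool's normal result.
        return tool_fn(**arguments)
    return proxied


def read_file(path: str) -> str:
    # Hypothetical tool; a real one would touch the filesystem.
    return f"contents of {path}"


def sandbox_policy(tool_name: str, arguments: dict) -> str:
    # Toy policy: only paths under /sandbox/ are allowed.
    path = str(arguments.get("path", ""))
    return "ALLOW" if path.startswith("/sandbox/") else "DENY"


# The agent keeps calling `read_file`; it now reaches the proxy instead.
read_file = intercept(read_file, sandbox_policy)
```

Because `functools.wraps` preserves the function's name and docstring, a framework that registers tools by inspecting those attributes sees the same tool as before.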

Question 35

What is Vault (Agent Secrets)?

Accepted Answer

A secure, access-controlled storage system for API keys, database credentials, and other secrets used by agents at runtime. Agent vaults provide time-limited, per-session secret injection — the agent receives a credential for the duration of its task and the credential is revoked afterward. This prevents secrets from being stored in prompts, environment variables, or conversation history where they could be exfiltrated by a compromised agent.

Question 36

What is Zero Trust (AI Agents)?

Accepted Answer

The application of the zero trust security model to autonomous AI agents: never implicitly trust an agent's intent or identity, and verify every tool call against an explicit policy before execution. Zero trust for agents treats each tool call as a potential security event regardless of the agent's stated purpose or previous behavior. It assumes that any agent may be compromised via prompt injection at any time and enforces controls at the execution boundary rather than at the trust perimeter.

AI Agent Security Glossary
