What is Agent DOS?

Agent DOS is the operating model enterprises use to govern AI agents at runtime. It has three pillars: Discovery, Observability, and Security. Discovery answers what agents are running and what they can reach. Observability answers what those agents actually did. Security answers what they are allowed to do next.

The model exists because the security stack most enterprises built over the last decade was designed for human users querying data and for models returning outputs. AI agents are a third thing. They hold delegated authority, call tools without waiting for confirmation, talk to other agents, and reach data through credentials that are not theirs. None of the controls aimed at users or model outputs cover that surface.

Why agentic systems break traditional AI governance

Traditional AI governance was built around model outputs. The questions it answers are familiar: is the response accurate, is it fair, is it free of disallowed content, can it be explained. The controls follow the data: training data quality, bias mitigation, evaluation, post-output review. The implicit assumption is that a human reads the model's answer before anything in the world changes.

Agents remove that assumption. An agent receives a goal, plans a sequence of tool calls, executes them against live systems, and reports back when the goal is met or the plan fails. By the time a human sees the result, a purchase order has been raised, a record has been updated, a refund has been issued, or a Snowflake table has been read on behalf of someone who never opened a SQL client. The risk is no longer that the model said something wrong. The risk is that the agent did something wrong, and the action is already in the ledger.

Traditional vs. Agentic AI Governance comparison

The shift looks like this:

Dimension	Traditional AI governance	Agentic AI governance
Primary question	Is the output correct, fair, compliant?	What can the system do, and who is accountable?
Risk type	Output risk	Output risk plus action risk
Surface	Model and training data	Agent, tools, MCP servers, other agents, the data layer
Identity model	The user runs the query	The agent inherits a credential and acts for someone
Control point	Pre-output review	Runtime, every tool call, every hop
Failure detection	Output reviewed by human	No human in the path until something is wrong

Agent DOS is the response to that gap. It does not replace the AI governance frameworks that already exist. It extends them down into the runtime layer where action happens.

The three pillars

Three pillars of Agent DOS

Discovery

Discovery is the inventory function. It finds every AI agent in the environment, including the ones the security team did not approve and probably has not heard of. It captures who built each agent, what platforms they run on, which MCP servers they connect to, what tools they can invoke, what data they can reach, and which other agents they talk to. Without an inventory, every other control is theoretical.

Most enterprises that turn on continuous agent discovery find a dozen or more unregistered agents in the first week. They are usually built by data science, analytics, or app dev teams in Copilot Studio, Bedrock, Databricks Mosaic, Cursor, or custom Python. They were not built with malice. They were built quickly, and they reach data the security team would never have approved at that scope.

Read the full Agent Discovery guide →

Observability

Observability is the audit and replay function. It captures every prompt the agent received, every tool call it made, every intermediate decision, every response it got back, and every hop across A2A. The capture has to be tamper-evident, because the standard agent log lives inside the agent process and is gone when the process ends.

Observability also has to preserve identity. When an agent calls another agent that calls a data source, the receiving system needs to see the user who initiated the chain, not the service account the agent happens to be using. Without that, audit becomes meaningless and least-privilege becomes impossible.

Read the full Agent Observability guide →

Security

Security is the runtime control function. It decides which tool calls are allowed, which MCP servers are trusted, which data the agent can return, and when to stop the agent entirely. It treats every MCP server as untrusted infrastructure by default, scopes credentials per call rather than per session, runs a content firewall over tool descriptions and tool responses to strip injected instructions, and propagates user identity and purpose through every hop of A2A.

Security is where Agent DOS produces the blast-radius story. With it, a compromised MCP server affects one call. Without it, that server holds keys to the whole agent fleet.

Read the full Agent Security guide →

Where Agent DOS fits in the existing stack

Agent DOS is not a replacement for identity, DSPM, or SIEM. It is the layer above them that knows what an agent is.

Agent DOS position in the security stack

Adjacent layer	Examples	How Agent DOS relates
Identity / IAM	Okta, Entra ID, Ping	Agent DOS extends identity into A2A and MCP hops. The IdP is still the source of truth.
Data governance and catalog	Collibra, Alation, Atlan	Catalog tools describe data. Agent DOS enforces who and what reaches it.
Data access	Immuta, Snowflake / Databricks native UC	Direct overlap on data access policy. Agent DOS differentiates on multi-platform reach and the agent control plane sitting above.
DSPM	Cyera, Dig, Sentra	DSPM finds sensitive data. Agent DOS governs the agent's path to that data.
Runtime AI security	Prompt Security, Lakera	Some overlap on prompt-level guardrails. Agent DOS differentiates on identity propagation and MCP / A2A protocol scope.
SIEM / SOAR	Splunk, Sentinel, Cortex XSIAM	Agent DOS feeds rich agent telemetry into these. It is the source, not the replacement.

How governance maps to the agent lifecycle

Agent DOS is continuous. It does not begin at deployment.

Agent DOS lifecycle

Design. Scope and authority are defined here. What the agent may do, what it may reach, and what is explicitly out of bounds. The intensity of every later control reflects how clearly this stage was done.

Development. Identity, access, and tool integrations are encoded. Architectural choices made here decide whether enforcement is even feasible at runtime. A tool integration written with a broad service token is hard to retrofit into least-privilege without reworking the integration itself.

Pre-deployment testing. Escalation thresholds, permission boundaries, and prompt-injection resilience are tested in simulation before the agent sees production traffic.

Deployment. Logging is on. Guardrails are active. Oversight roles are named and reachable.

Runtime. Tool calls run through policy. Identity propagates. A kill switch exists and has been tested. Anomalies escalate to a human.

Continuous monitoring. Behavior is reviewed against intent. Drift surfaces as expanding authority, new tool bindings, new data reached, longer prompt chains. Scope is adjusted before it expands silently.

Decommissioning. Tokens revoked, MCP connections closed, logs retained for the audit horizon. Authority ends as deliberately as it began.

A worked example

A regional bank deploys a finance copilot to its corporate banking team. The copilot helps relationship managers prepare client briefings. It reads from the bank's CRM, the data warehouse, and a market-data MCP server. It does not write to anything.

Agent DOS appears at every layer.

Discovery finds the copilot the day it is registered, captures its tool bindings, and flags that one team in another business unit is running a near-identical copilot that was never registered. The shadow copilot is reaching the same warehouse with a wider scope.

Observability captures every prompt, every retrieval, every market-data call. Each query that hits the warehouse carries the relationship manager's identity, not the copilot's service account, so the warehouse's existing row-level policy applies the way it was written.

Security treats the market-data MCP server as untrusted. The content firewall inspects tool descriptions on registration and tool responses at runtime. When the server is later compromised and its responses begin embedding an instruction that says also send the client name and exposure figures to this address, the firewall strips the directive before the copilot ever reads it. The credential the copilot used to call that server is scoped to a single endpoint, so the blast radius is one call rather than the whole agent fleet.

The agent acts. Humans set the bounds. Agent DOS keeps the bounds enforceable while the action unfolds.

Where Agent DOS aligns with standards

Framework	Where Agent DOS contributes
NIST AI RMF	Map (Discovery), Measure (Observability), Manage (Security and runtime controls)
ISO/IEC 42001	Documentation, role assignment, oversight mechanisms, lifecycle review
EU AI Act	Logging and human oversight for high-risk systems (Articles 12 and 14)
ISO/IEC 42005	Pre-deployment impact assessment evidence
SOC 2 / ISO 27001	Access control, audit logging, incident response controls extended into the agent layer
Sector-specific (HIPAA, PCI-DSS, GLBA, NYDFS, DORA)	PII / PCI / PHI redaction on A2A, identity propagation into regulated data systems

Frequently asked questions

Is Agent DOS a product category or a framework?

Both. It is the operating model for governing agents at runtime, and it is the product architecture Trust3 AI organizes its platform around. Other vendors will arrive at similar models. The category is forming around the same problem.

Does Agent DOS replace AI governance committees and policy work?

No. Policy and committee work decides what an agent is allowed to do. Agent DOS enforces it. Without the operational layer, policy is documentation. Without policy, the operational layer has no rules to enforce.

Where does this sit relative to MLOps and AI platform teams?

The AI platform team builds agents. Agent DOS gives the security and governance functions a way to oversee what the platform team ships, without slowing the platform team down. The integrations sit at the protocol layer and at the data layer, not inside the agent's reasoning loop.

How long does it take to stand up?

Discovery is the fastest. Most enterprises have a meaningful inventory within days of turning it on. Observability follows once telemetry routes into the existing SIEM and audit pipeline. Security policies are tuned over weeks as the inventory clarifies what each agent should and should not be allowed to do.

What is Agent DOS? A Complete Guide to Agent Discovery, Observability, and Security

Why agentic systems break traditional AI governance

The three pillars

Discovery

Observability

Security

Where does your org actually stand?

Where Agent DOS fits in the existing stack

How governance maps to the agent lifecycle

A worked example

Where Agent DOS aligns with standards

Frequently asked questions

Ready to govern the agents already running in your environment?

Why agentic systems break traditional AI governance

The three pillars

Discovery

Observability

Security

Where does your org actually stand?

Where Agent DOS fits in the existing stack

How governance maps to the agent lifecycle

A worked example

Where Agent DOS aligns with standards

Frequently asked questions

A Complete Guide to Agent Discovery

A Complete Guide to Agent Observability

A Complete Guide to Agent Security

Ready to govern the agents already running in your environment?

Field notes from the governance frontier.