You Can't Govern What You Can't See: Tracking AI Agents

DISCOVER – Solving the Enterprise AI Agent Inventory Problem

Governance without visibility is asking for trouble. You can write policies, assign owners, and stand up a governance committee – none of it matters if you don’t know what you’re governing.

Part 1 of this series laid out why existing security and governance models break for AI agents: they were built for deterministic software and human users, not for autonomous systems that reason, chain tools together, and move data as a side effect of doing their jobs. The conclusion was direct – before you can observe agent behavior or enforce policy on data flow, you have to know what agents exist.

That sounds obvious. It isn’t easy.

Most enterprises today cannot answer a basic question: how many AI agents are running in your environment right now? Not a ballpark. Not a list from one team. The actual number, across every platform, every team, every developer environment where someone with an API key built something and deployed it. The honest answer, in most enterprises, is that nobody knows.

This post is about solving that problem.

Ask a CISO or a CIO how many AI agents are running in their environment. The most common answer is not a number. It’s a pause.

When there is an answer, it is usually a partial list: a spreadsheet maintained by one team, covering only the agents someone remembered to register. If you are lucky, you will find a team with a policy requiring formal registration before any agent is deployed. The policy is real. The compliance is not.

The honest answer in most enterprises: we don’t know how many agents are running, what they can reach, or what data they’re moving.

That is the discovery problem. It must be solved before anything else in AI governance can work. You cannot write a policy for an agent you don’t know exists. You cannot monitor behavior you haven’t mapped. You cannot calculate data movement exposure for a system that isn’t in your inventory.

Shadow AI is not a people problem. It’s a structural one. Agents proliferate because every team with an API key and a cloud account can build one in an afternoon.

Why Manual Registration Fails

The instinct is to solve discovery through process: require developers to register AI agents before deployment, maintain a central catalog. Reasonable. Doesn’t work at scale.

The problem is structural. Agents are built by many teams on many platforms – Databricks Genie spaces, Bedrock Agents, Copilot Studio flows, CrewAI, custom Python scripts calling model APIs directly. Each platform has its own provisioning model. Each team has its own cadence. The central registry becomes a form developers fill out after the fact, if at all.

The only reliable approach is automated discovery: systems that find agents from the signals they generate, not from the reports developers choose to file.

Why Existing Security Tools Break

IAM can enumerate the service accounts and human identities that exist. It cannot tell you which ones belong to AI agents, what those agents are declared to do, what tools they are connected to, or what data movement surface that combination creates. A perfectly clean IAM posture can coexist with a data governance disaster at the agent layer.

Where Agents Actually Live

Cloud audit logs. AWS CloudTrail captures Bedrock invocations and agent execution events. Databricks MLflow and other logs capture query execution from Genie spaces, Agent Bricks, and registered agents. Microsoft’s unified audit log captures Copilot activity. These logs contain the governance signal: which identity called which API, against which resource, at what time. They catch agents that were never formally registered anywhere.
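
As a rough illustration of turning that signal into inventory candidates, the sketch below groups AI-related calls by the identity that made them. The records are hand-written samples that borrow CloudTrail's field names (eventSource, eventName, eventTime, userIdentity.arn); the account IDs, role names, and the choice of which event sources count as AI activity are assumptions for the example, not real log data.

```python
from collections import defaultdict

# Hand-written sample records shaped like CloudTrail events (illustrative, not real logs).
SAMPLE_EVENTS = [
    {
        "eventSource": "bedrock.amazonaws.com",
        "eventName": "InvokeModel",
        "eventTime": "2025-01-10T09:15:00Z",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:role/support-bot"},
    },
    {
        "eventSource": "s3.amazonaws.com",
        "eventName": "GetObject",
        "eventTime": "2025-01-10T09:16:00Z",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:role/etl-job"},
    },
    {
        "eventSource": "bedrock.amazonaws.com",
        "eventName": "InvokeAgent",
        "eventTime": "2025-01-10T09:20:00Z",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:role/support-bot"},
    },
    {
        "eventSource": "bedrock.amazonaws.com",
        "eventName": "InvokeModel",
        "eventTime": "2025-01-10T08:05:00Z",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/dev-laptop-key"},
    },
]

# Assumption: event sources treated as evidence of model or agent activity.
AI_EVENT_SOURCES = {"bedrock.amazonaws.com"}


def candidate_agents(events):
    """Group AI-related API calls by the identity that made them."""
    seen = defaultdict(lambda: {"calls": 0, "event_names": set(),
                                "first_seen": None, "last_seen": None})
    for event in events:
        if event.get("eventSource") not in AI_EVENT_SOURCES:
            continue
        arn = event.get("userIdentity", {}).get("arn", "unknown")
        entry = seen[arn]
        entry["calls"] += 1
        entry["event_names"].add(event["eventName"])
        # ISO 8601 UTC timestamps in the same format sort chronologically as strings.
        ts = event["eventTime"]
        if entry["first_seen"] is None or ts < entry["first_seen"]:
            entry["first_seen"] = ts
        if entry["last_seen"] is None or ts > entry["last_seen"]:
            entry["last_seen"] = ts
    return dict(seen)


candidates = candidate_agents(SAMPLE_EVENTS)
for arn, info in sorted(candidates.items()):
    print(arn, info["calls"], sorted(info["event_names"]))
```

Every identity that surfaces this way is only a candidate: it still has to be matched to an owner and a declared purpose before it counts as an inventoried agent.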

Platform connectors. Managed AI platforms expose APIs that allow external systems to enumerate what’s running. Databricks, Bedrock, and Copilot Studio all expose agent and configuration metadata. An inventory can be assembled from what the platform already knows about itself – without requiring any developer action.

SDK intercept. Custom Python, LangChain, LangGraph workflows, and direct API integrations don’t generate the same structured audit signals as managed platforms. Catching this class requires interception at the SDK layer: a lightweight library that wraps agent execution, registers the agent on first invocation, and captures governance metadata continuously. This catches the shadow AI cases that carry the highest governance risk.
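
One way such a library could work is a decorator that wraps the agent's entry point. The sketch below is a minimal illustration rather than any particular product's SDK: governed_agent, REGISTRY, and the sample agent are hypothetical names, and the registry is an in-memory dictionary standing in for a real inventory service.

```python
import functools
import time

# Hypothetical in-memory registry; a real one would be a service or database.
REGISTRY = {}


def governed_agent(name, owner, tools):
    """Decorator: register the agent on first invocation, then record each run."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # setdefault creates the record only the first time this agent runs.
            record = REGISTRY.setdefault(name, {
                "owner": owner,
                "tools": list(tools),
                "first_seen": time.time(),
                "invocations": 0,
            })
            record["invocations"] += 1
            record["last_seen"] = time.time()
            return fn(*args, **kwargs)
        return wrapper
    return decorator


@governed_agent(name="invoice-summarizer", owner="finance-eng",
                tools=["erp_api", "reports_bucket"])
def run_agent(prompt):
    # Stand-in for a real model or framework call.
    return f"handled: {prompt}"


run_agent("summarize Q3 invoices")
run_agent("summarize Q4 invoices")
print(REGISTRY["invoice-summarizer"]["invocations"])
```

The design choice that matters is that registration happens as a side effect of running the agent, so the inventory entry does not depend on anyone filing a form.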

Complete discovery requires all three. Audit logs catch managed platform agents. Platform connectors enumerate registered configurations. SDK intercept catches custom and shadow cases.

What a Complete Agent Registry Needs

Identity and permissions: What service account, service principal, or API key does this agent run under? What permissions does that identity carry on the underlying platforms and data systems?

Connected tools and MCP servers: What tools, APIs, databases, and MCP servers can this agent reach? This list defines two things simultaneously: what data the agent can access, and where it can deposit data. Both sides of that equation need to be captured and reviewed.

Declared scope and purpose: What is this agent supposed to do? Without a declared scope, behavioral drift cannot be detected and data flow cannot be evaluated against intent.

Ownership: Who is accountable when something goes wrong? An agent without a declared owner has no remediation path. This is also a regulatory requirement under frameworks like the EU AI Act.

Deployment status and environment: Development, staging, or production? Which platform? These attributes determine which policies apply and what enforcement is available.
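
The five attribute groups above can be sketched as a single record type. This is one possible shape, not a standard schema: the class names, field names, and sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Environment(Enum):
    DEVELOPMENT = "development"
    STAGING = "staging"
    PRODUCTION = "production"


@dataclass
class ToolConnection:
    """A tool, API, database, or MCP server the agent can reach."""
    name: str
    can_read: bool = False
    can_write: bool = False


@dataclass
class AgentRecord:
    agent_id: str
    platform: str
    environment: Environment
    identity: str  # service account, service principal, or API key reference
    permissions: List[str] = field(default_factory=list)
    tools: List[ToolConnection] = field(default_factory=list)
    declared_scope: str = ""
    owner: str = ""

    def read_sources(self) -> List[str]:
        """Where the agent can pull data from."""
        return [t.name for t in self.tools if t.can_read]

    def write_targets(self) -> List[str]:
        """Where the agent can deposit data."""
        return [t.name for t in self.tools if t.can_write]

    def gaps(self) -> List[str]:
        """Names of governance fields that are still empty."""
        missing = []
        if not self.owner:
            missing.append("owner")
        if not self.declared_scope:
            missing.append("declared_scope")
        if not self.tools:
            missing.append("tools")
        return missing


record = AgentRecord(
    agent_id="support-bot",
    platform="bedrock",
    environment=Environment.PRODUCTION,
    identity="arn:aws:iam::111122223333:role/support-bot",
    tools=[
        ToolConnection("crm_db", can_read=True),
        ToolConnection("ticket_api", can_read=True, can_write=True),
    ],
)
print(record.read_sources(), record.write_targets(), record.gaps())
```

Keeping read and write flags on each tool connection makes both sides of the data movement surface queryable, and gaps() gives a quick list of records that cannot yet be governed.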

What CISOs and CIOs Should Ask

Is our agent inventory built from automated discovery or manual registration, and what is the gap between those two numbers?
Which agents on our infrastructure were never reviewed by security or compliance before deployment?
For each agent in our inventory: do we have its full connected tool and MCP server list, including both what it can read and where it can write?
Do we know which agents are running in developer environments against production credentials?
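
The first question has a concrete shape once both lists exist. A minimal sketch, assuming each agent can be reduced to a comparable identifier (the identifiers and the inventory_gap helper here are made up for the example):

```python
def inventory_gap(discovered, registered):
    """Compare automatically discovered agents with manually registered ones."""
    discovered, registered = set(discovered), set(registered)
    return {
        # Running, but never registered.
        "unregistered": sorted(discovered - registered),
        # Registered, but not observed: retired, inactive, or missed by discovery.
        "not_observed": sorted(registered - discovered),
        # Share of discovered agents that were also registered.
        "coverage": (len(discovered & registered) / len(discovered)
                     if discovered else 1.0),
    }


gap = inventory_gap(
    discovered=["support-bot", "invoice-summarizer", "sales-genie", "dev-laptop-key"],
    registered=["support-bot", "invoice-summarizer", "hr-faq-bot"],
)
print(gap)
```

With these sample lists, half of the discovered agents were never registered, which is the gap the question is asking about.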

Scoping Shadow AI Correctly

Two types of shadow AI are governable.

First: agents running on approved enterprise infrastructure under organizational credentials, without formal registration or review. Discoverable through audit logs and platform connectors. Ungoverned, but visible.

Second: agents running under organizational API keys in developer environments. The key belongs to the organization, so the access, the data movement risk, and the liability are the organization’s, but the activity may not surface in centralized logs.

NOTE: A third category, agents running on personal devices under personal accounts and personal credentials, is genuinely out of scope for discovery. A program scoped to organizational infrastructure and credentials can still achieve meaningful coverage. Personal use of AI is better addressed through MDM and endpoint programs.

The Inventory as a Living System

The goal is not a completed project. It is a continuously maintained registry that reflects the actual agent estate in near-real time. Agents are created frequently. Permissions change. Owners move teams. New MCP servers get connected, and each addition changes both the access surface and the data movement surface for that agent.
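
Keeping the registry current mostly comes down to comparing what discovery sees now with what it saw last time. A minimal sketch, assuming each snapshot is a dictionary keyed by agent ID with owner and tools fields (a hypothetical shape chosen for the example):

```python
def diff_snapshots(previous, current):
    """Report what changed between two inventory snapshots keyed by agent id."""
    changes = []
    for agent_id in sorted(current.keys() - previous.keys()):
        changes.append((agent_id, "new_agent", None))
    for agent_id in sorted(previous.keys() - current.keys()):
        changes.append((agent_id, "disappeared", None))
    for agent_id in sorted(current.keys() & previous.keys()):
        before, after = previous[agent_id], current[agent_id]
        # Newly connected tools widen both the access and data movement surface.
        added_tools = sorted(set(after["tools"]) - set(before["tools"]))
        if added_tools:
            changes.append((agent_id, "tools_added", added_tools))
        if before["owner"] != after["owner"]:
            changes.append((agent_id, "owner_changed", after["owner"]))
    return changes


yesterday = {
    "support-bot": {"owner": "cx-team", "tools": ["crm_db"]},
    "old-poc": {"owner": "data-sci", "tools": []},
}
today = {
    "support-bot": {"owner": "platform-team", "tools": ["crm_db", "slack_mcp"]},
    "sales-genie": {"owner": "", "tools": ["warehouse"]},
}
for change in diff_snapshots(yesterday, today):
    print(change)
```

Each reported change is a prompt for review: a new agent needs an owner and a scope, and a newly connected tool means the agent's reach has to be reassessed.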

An organization that achieves continuous discovery has answered the first governance question: what agents do we have, what can they reach, and what is the full scope of what they can do to data they access? The answer is no longer a pause.

Achieving continuous discovery changes the governance posture fundamentally. The question shifts from “do we know what agents we have?”, which most organizations cannot answer today, to “what are those agents actually doing with the access they have?” That is a harder question, and a more important one.

An inventory tells you the potential: which identities exist, which tools each agent can reach, what data surfaces each combination creates. It does not tell you what is actually happening at runtime: whether an agent is operating within its declared scope, whether regulated data is moving to destinations that were never reviewed, or whether the behavior you see today matches what was authorized when the agent was deployed.

That gap (between what an agent can do and what it is doing) is the observability problem. And it is where most governance programs have no instrumentation at all.

NEXT UP: PART 3 — OBSERVE

What Are Your Agents Actually Doing? Building Behavioral Observability for AI Agent Governance

Part 3 of this series takes on that problem directly: what meaningful AI agent monitoring actually looks like, why traditional logging and SIEM approaches fall short, and what it takes to build behavioral observability that can detect scope drift, data movement anomalies, and policy violations in near-real time, before a regulated data exposure becomes a compliance event.

The inventory is the map. Part 3 is about watching what moves across it.

PART 2 OF 4 · You Can’t Govern What You Can’t See: Building a Continuous AI Agent Inventory
