The Trust Stack: Identity, Reputation, and Accountability for AI Agents

When agents start doing real work in the economy - not just answering questions but negotiating contracts, making purchases, and entering commitments on behalf of principals - we'll discover that our infrastructure assumes humanity. Credit checks, business references, contract law, reputation systems: all of it presumes a person or registered entity on the other side. We need a trust stack for agents.
The problem: agents without provenance
Today's agent deployments treat trust as inherited. Your agent speaks with your authority, draws on your credit, and binds you legally. That works when agents are extensions of human intent, like a really good email client. It breaks when:
- Agents negotiate with other agents (whose principal do you trust?)
- Agents operate semi-autonomously with budget authority
- Agents interact with third parties who need to verify capabilities
- Agents make representations that might create liability
The question isn't whether agents should have trust properties - they already do, implicitly. The question is whether we'll make those properties explicit, verifiable, and enforceable before the gaps cause serious harm.
What a trust stack needs
I think of trust infrastructure in three layers: identity (who is this?), reputation (should I trust them?), and accountability (what happens if they breach?). Each layer requires different primitives.
```mermaid
flowchart TB
    subgraph "Identity Layer"
        A[Agent Identity]
        B[Principal Binding]
        C[Capability Attestations]
    end
    subgraph "Reputation Layer"
        D[Performance History]
        E[Peer Ratings]
        F[Audit Trail]
    end
    subgraph "Accountability Layer"
        G[Stake/Bond]
        H[Insurance]
        I[Legal Entity]
    end
    A --> D
    B --> G
    C --> D
    D --> G
    E --> H
    F --> I
```
Layer 1: Identity
An agent needs a stable, verifiable identity that answers: Who operates this agent? What can it do? What constraints is it under?
Principal binding is the foundation. Every agent acts on behalf of someone - a person, company, or (eventually) another agent. That relationship should be cryptographically verifiable. Not "this agent claims to represent Acme Corp" but "here is a signed attestation from Acme Corp delegating these specific authorities to this agent."
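To make this concrete, here's a minimal sketch of a principal binding using Ed25519 signatures from Python's `cryptography` library. The field names, authority labels, and budget figure are my own illustrative assumptions, not a standard:

```python
# Minimal sketch: a principal signs a delegation attestation binding an agent
# to specific authorities. All field names are illustrative, not a standard.
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The principal's long-lived signing key (in practice, held in an HSM/KMS).
principal_key = Ed25519PrivateKey.generate()

attestation = {
    "principal": "acme-corp",
    "agent": "acme-procurement-agent-v3",
    "authorities": ["negotiate_contracts", "commit_spend"],
    "budget_limit_gbp": 50_000,
    "expires": "2026-12-31T23:59:59Z",
}

# Canonical serialisation before signing, so verifiers hash the same bytes.
payload = json.dumps(attestation, sort_keys=True).encode()
signature = principal_key.sign(payload)

# A counterparty verifies against the principal's published public key.
# Raises InvalidSignature if the attestation has been tampered with.
principal_key.public_key().verify(signature, payload)
```

The important property: anyone holding the principal's public key can verify the delegation without contacting the principal.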
Capability attestations describe what an agent can do, verified by third parties. Think of them like professional certifications: "This agent has passed evals for contract negotiation in the SaaS procurement domain with a 94% accuracy score, attested by [Evaluator]." Capabilities could include (a schema sketch follows the list):
- Domain competence (legal, financial, technical)
- Safety certifications (guardrails verified by auditor)
- Budget authority levels
- Permitted action types
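One possible shape for such an attestation, sketched as a plain record; every field name here is an assumption rather than a settled schema:

```python
# Sketch: one possible capability attestation record. Field names are
# illustrative assumptions, not a settled schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class CapabilityAttestation:
    agent_id: str     # stable identity of the attested agent instance
    domain: str       # e.g. "saas-procurement"
    capability: str   # e.g. "contract_negotiation"
    eval_suite: str   # which eval/benchmark was run
    score: float      # e.g. 0.94 accuracy on that suite
    attested_by: str  # the evaluator vouching for this result
    valid_until: str  # attestations should expire and be re-earned
    signature: bytes  # evaluator's signature over the fields above

negotiation_cert = CapabilityAttestation(
    agent_id="acme-procurement-agent-v3",
    domain="saas-procurement",
    capability="contract_negotiation",
    eval_suite="procurement-evals-2026q1",
    score=0.94,
    attested_by="evaluator-xyz",
    valid_until="2026-08-01T00:00:00Z",
    signature=b"...",
)
```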
Instance identity distinguishes this specific agent from others built on the same base model. Two GPT-4 agents might have radically different fine-tuning, system prompts, and safety properties. The identity layer needs to capture this.
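A simple way to capture instance identity is to hash the components that make a deployment distinct. A minimal sketch; which inputs belong in the hash is itself an assumption:

```python
# Sketch: derive a stable instance identity by hashing the components that
# distinguish this deployment from others on the same base model.
import hashlib
import json

def instance_id(base_model: str, system_prompt: str, config: dict) -> str:
    material = json.dumps(
        {
            "base_model": base_model,
            "system_prompt_sha256": hashlib.sha256(system_prompt.encode()).hexdigest(),
            "config": config,
        },
        sort_keys=True,
    ).encode()
    return "agent:" + hashlib.sha256(material).hexdigest()[:16]

# Two deployments of the same base model get different identities if their
# prompts or configs differ.
a = instance_id("gpt-4", "You are a cautious procurement agent.", {"tools": ["email"]})
b = instance_id("gpt-4", "You are an aggressive negotiator.", {"tools": ["email"]})
assert a != b
```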
Layer 2: Reputation
Identity says who you are; reputation says whether to trust you. Agent reputation will likely evolve from three sources:
Performance history is the richest signal. Did the agent complete its tasks? Were outcomes as promised? Were costs within bounds? If we log agent actions with enough structure (as in the evidence ledger I've written about before), we can compute reputation from real outcomes.
```yaml
# Sketch: reputation attestation from a completed engagement
agent: "acme-procurement-agent-v3"
engagement_id: "eng_2026_001"
principal: "acme-corp"
counterparty: "vendor-xyz"
outcome:
  completed: true
  terms_met: true
  disputes: 0
  cost_variance: "-3.2%"  # under budget
  counterparty_rating: "4.7/5"
attested_by: "deal-platform-abc"
timestamp: "2026-02-01T14:32:00Z"
signature: "..."
```
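Given a stream of records like that, reputation can be computed rather than asserted. A minimal sketch, assuming recency weighting so stale evidence decays; the half-life is an arbitrary choice:

```python
# Sketch: compute a reputation score from completed engagements.
# The recency weighting and field names are assumptions for illustration.
from datetime import datetime, timezone

def reputation_score(engagements: list[dict], half_life_days: float = 90.0) -> float:
    """Recency-weighted fraction of engagements completed without disputes."""
    now = datetime.now(timezone.utc)
    num = den = 0.0
    for e in engagements:
        age_days = (now - datetime.fromisoformat(e["timestamp"])).days
        weight = 0.5 ** (age_days / half_life_days)  # older evidence counts less
        ok = e["completed"] and e["terms_met"] and e["disputes"] == 0
        num += weight * ok
        den += weight
    return num / den if den else 0.0

score = reputation_score([
    {"timestamp": "2026-02-01T14:32:00+00:00", "completed": True,
     "terms_met": True, "disputes": 0},
    {"timestamp": "2025-11-10T09:00:00+00:00", "completed": True,
     "terms_met": False, "disputes": 1},
])
```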
Peer ratings come from other agents and humans who've interacted with this agent. Was it responsive? Did it honour commitments? Did it escalate appropriately? These subjective signals complement objective outcomes.
Audit trails provide the evidence behind reputation claims. A high reputation score without an inspectable trail is just a number. The audit trail lets counterparties verify: show me the last 10 engagements and their outcomes.
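One way to make that trail tamper-evident is a hash chain: altering any past entry invalidates every hash after it. A minimal sketch:

```python
# Sketch: a tamper-evident audit trail as a hash chain. Any edit to a past
# entry changes every subsequent hash, so counterparties can verify history.
import hashlib
import json

def append_entry(trail: list[dict], event: dict) -> None:
    prev_hash = trail[-1]["hash"] if trail else "0" * 64
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    trail.append({"prev": prev_hash, "event": event,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify_trail(trail: list[dict]) -> bool:
    prev = "0" * 64
    for entry in trail:
        body = json.dumps({"prev": prev, "event": entry["event"]}, sort_keys=True)
        if entry["prev"] != prev or entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

trail: list[dict] = []
append_entry(trail, {"engagement": "eng_2026_001", "outcome": "completed"})
append_entry(trail, {"engagement": "eng_2026_002", "outcome": "disputed"})
assert verify_trail(trail)
```

Anchoring the latest hash somewhere public, or sharing it with the counterparty, is what makes retroactive edits detectable.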
Layer 3: Accountability
Reputation helps with the question "will they perform?" Accountability addresses "what if they don't?"
Stake or bond creates skin in the game. An agent (or its principal) posts collateral that can be slashed for breach. This is familiar from security deposits and performance bonds; the novelty is making it programmatic and proportional to the commitment.
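A minimal sketch of what programmatic, proportional bonding could look like; the 10% ratio and the severity model are assumptions:

```python
# Sketch: a bond proportional to the commitment, slashable on breach.
# The proportionality rule and severity model are illustrative assumptions.
class PerformanceBond:
    def __init__(self, commitment_value: float, bond_ratio: float = 0.10):
        self.amount = commitment_value * bond_ratio  # e.g. 10% of deal value
        self.slashed = 0.0

    def slash(self, severity: float) -> float:
        """Slash a fraction of the bond, capped at what remains; severity in [0, 1]."""
        penalty = min(self.amount - self.slashed, self.amount * severity)
        self.slashed += penalty
        return penalty

    def release(self) -> float:
        """On completion, the unslashed remainder returns to the agent."""
        return self.amount - self.slashed

bond = PerformanceBond(commitment_value=50_000)  # £5,000 at stake
bond.slash(severity=0.2)                         # a minor breach costs £1,000
assert bond.release() == 4_000
```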
Insurance transfers risk to specialised underwriters. Imagine agent liability insurance: "This agent is covered up to £100k for errors in financial recommendations, underwritten by [Insurer]." The insurer's willingness to cover becomes a trust signal in itself.
Legal entity provides the ultimate backstop. If an agent causes harm, there must be a person or registered entity accountable. For now, that's always the principal. Eventually, we might see new entity types designed for agent operations - but the principle remains: someone is liable.
How this changes agent-to-agent interactions
When two agents negotiate, each should be able to:
- Verify identity: confirm the counterparty's principal and delegated authorities
- Check reputation: query performance history and attestations
- Assess accountability: understand what recourse exists if things go wrong
This transforms negotiation. Today, agents mostly operate in walled gardens with implicit trust. Tomorrow, agents from different organisations will need to establish trust dynamically, perhaps even building it over multiple interactions.
```mermaid
sequenceDiagram
    participant A as Agent A (Buyer)
    participant B as Agent B (Seller)
    participant R as Reputation Registry
    participant E as Escrow/Bond Service
    A->>B: Initiate negotiation + identity attestation
    B->>R: Query Agent A reputation
    R-->>B: Performance history + attestations
    B->>A: Counter-offer + identity attestation
    A->>R: Query Agent B reputation
    R-->>A: Performance history + attestations
    A->>E: Post performance bond
    B->>E: Post delivery bond
    A->>B: Accept terms (signed)
    Note over A,B: Execute transaction
    A->>E: Confirm delivery
    E-->>B: Release delivery bond + payment
    E-->>A: Release performance bond
```
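Stripped to essentials, each side of that handshake runs the same three checks before signing. A sketch, assuming a hypothetical registry interface and reusing the 10% coverage rule from the bond sketch above:

```python
# Sketch: the three checks each side runs before accepting terms. The
# registry interface and coverage rule are illustrative assumptions.
def trust_check(counterparty: dict, registry, min_score: float,
                deal_value: float) -> bool:
    # 1. Identity: is the principal-binding attestation validly signed?
    if not registry.verify_attestation(counterparty["identity_attestation"]):
        return False
    # 2. Reputation: does the track record clear our threshold for this stake?
    if registry.reputation_score(counterparty["agent_id"]) < min_score:
        return False
    # 3. Accountability: does posted bond plus insurance cover a breach?
    cover = counterparty["bond_amount"] + counterparty["insured_up_to"]
    return cover >= 0.10 * deal_value  # assumed 10% coverage rule
```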
The bootstrapping problem
New agents have no reputation. How do they enter the economy?
Principal inheritance: A new agent starts with its principal's reputation as a floor. Acme Corp's new procurement agent is implicitly trusted because Acme Corp is trusted.
Sandboxed trials: New agents operate in limited contexts - small transactions, reversible actions, high oversight - until they build a track record.
Third-party certification: Evaluators attest to agent capabilities before deployment. This creates initial reputation from testing rather than production.
Stake as substitute: An agent with no reputation can post a larger bond, compensating for uncertainty with skin in the game.
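That last trade-off can be made explicit: the required bond scales up as the evidence behind a reputation scales down. A sketch, with the base ratio and multiplier as assumptions:

```python
# Sketch: size the required bond inversely to reputation evidence.
# base_ratio and the uncertainty multiplier are illustrative assumptions.
def required_bond(deal_value: float, score: float, n_engagements: int,
                  base_ratio: float = 0.10, max_multiplier: float = 5.0) -> float:
    # Confidence grows with both the score and the volume of evidence behind it.
    confidence = score * min(1.0, n_engagements / 20)
    multiplier = 1.0 + (max_multiplier - 1.0) * (1.0 - confidence)
    return deal_value * base_ratio * multiplier

required_bond(10_000, score=0.95, n_engagements=50)  # proven agent: £1,200
required_bond(10_000, score=0.0, n_engagements=0)    # unknown agent: £5,000
```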
Implementation notes
This isn't entirely speculative. The building blocks exist:
- Decentralised identity standards (DIDs, Verifiable Credentials) can represent agent identity and attestations
- Cryptographic signatures can bind agents to principals and verify attestations
- Smart contracts can hold bonds and execute slashing conditions
- Existing APIs (trade references, credit checks) can inform initial reputation
The missing piece is coordination: agreeing on schemas, trust anchors, and dispute resolution. This is a classic standards problem. Someone needs to propose a minimal viable trust stack and get adoption.
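For concreteness, here's the principal binding from earlier expressed in the shape of a W3C Verifiable Credential; the DIDs, credential type, and field values are hypothetical:

```python
# Sketch: the principal-binding attestation in the shape of a W3C Verifiable
# Credential (v2 data model). DIDs and credential type are hypothetical.
delegation_vc = {
    "@context": ["https://www.w3.org/ns/credentials/v2"],
    "type": ["VerifiableCredential", "AgentDelegationCredential"],
    "issuer": "did:web:acme-corp.example",  # the principal
    "validUntil": "2026-12-31T23:59:59Z",
    "credentialSubject": {
        "id": "did:key:z6Mk...",            # the agent's identifier
        "delegatedAuthorities": ["negotiate_contracts", "commit_spend"],
        "budgetLimitGBP": 50000,
    },
    "proof": {"type": "DataIntegrityProof", "proofValue": "..."},
}
```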
What this enables
With trust infrastructure, agents can:
- Transact across organisational boundaries without pre-existing relationships
- Build reputation over time that follows them across platforms
- Operate with proportional autonomy - more trust, more authority
- Enter binding commitments with enforceable consequences
The economy gets more efficient because agent-to-agent friction drops. The risks get more manageable because accountability is built in, not bolted on.
Open questions
Who operates the reputation registries? Centralised registries create power concentrations and single points of failure. Decentralised approaches face adoption and coordination challenges. Probably both will coexist: platform-specific registries for closed ecosystems, federated or decentralised registries for open agent-to-agent commerce.
How do we handle adversarial behaviour? Reputation systems are gameable. Sybil attacks, fake transactions, reputation laundering - all the pathologies of human reputation systems will apply to agents, perhaps faster. The accountability layer (bonds, insurance, legal entities) provides backstops, but the reputation layer needs robust anti-gaming mechanisms.
What's the liability model? When an agent causes harm, who pays? The principal, certainly - but what about the model provider, the platform, the reputation attestor who vouched for a bad agent? These questions will be answered in courts and contracts over the next decade.
How much reputation is enough? Trust is contextual. The reputation needed to book a restaurant differs from the reputation needed to negotiate a million-pound contract. The trust stack needs to support proportional trust - lightweight checks for low-stakes interactions, deep verification for high-stakes ones.
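One way to encode that proportionality is a tier table keyed to stake, where higher-value interactions demand more of the stack. A sketch, with thresholds and required checks as pure assumptions:

```python
# Sketch: verification effort tiered by stake. Thresholds and required
# checks are illustrative assumptions.
TRUST_TIERS = [
    # (max stake in GBP, checks required before transacting)
    (100,       ["identity"]),
    (10_000,    ["identity", "reputation"]),
    (1_000_000, ["identity", "reputation", "bond", "insurance"]),
]

def required_checks(stake_gbp: float) -> list[str]:
    for ceiling, checks in TRUST_TIERS:
        if stake_gbp <= ceiling:
            return checks
    return TRUST_TIERS[-1][1] + ["manual_review"]  # beyond tiers: human in the loop

required_checks(50)       # ['identity']
required_checks(250_000)  # ['identity', 'reputation', 'bond', 'insurance']
```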
A thesis
The bottleneck for agent autonomy isn't capability - it's trust. We can build agents that negotiate, transact, and commit. We can't yet verify that an unknown agent is safe to do business with. The trust stack is the infrastructure that unlocks agent-to-agent commerce at scale.
The companies that build this infrastructure - the agent identity providers, reputation aggregators, and accountability underwriters - will be as important to the agent economy as credit bureaus and payment networks are to the current one.
Bottom line: Agents need identity to be known, reputation to be trusted, and accountability to be safe. Build the trust stack, and you unlock economic relationships between agents that we can't yet imagine. Leave it missing, and agents stay in walled gardens, forever borrowing their principals' trust instead of earning their own.