Security
Trust Agent verifies provenance, declared permissions, real runtime behavior, and enterprise fit before a role, skill, or agent is promoted to buyers. Checks are grouped into four kinds (security, safety, compliance, and behaviour), and each kind carries its own weight in the final trust score.
Security
43 checks. Protects against malicious actors, data theft, and exploitation. Heaviest weight in the trust score.
Safety
11 checks. Protects users from harmful content and unsafe agent behaviour.
Compliance
9 checks. Regulatory awareness: GDPR, age gates, sensitive-domain disclaimers, retention.
Behaviour
7 checks. Quality, consistency, and claim-vs-spec coherence at the prompt level.
Trust output
Score + risk + badges
MCP extension coverage
+12 MCP checks
Execution posture
Runs on customer infrastructure
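The weighting of the four check kinds can be illustrated with a small sketch. The page states only that security carries the heaviest weight; the weight values, the `KindResult` type, and the pass-rate formula below are assumptions made for illustration, not Trust Agent's published scoring method.

```python
from dataclasses import dataclass

# Hypothetical weights: the page only says security is weighted most heavily.
WEIGHTS = {"security": 0.5, "safety": 0.2, "compliance": 0.2, "behaviour": 0.1}


@dataclass
class KindResult:
    passed: int  # checks that passed in this kind
    total: int   # checks that ran in this kind


def trust_score(results: dict[str, KindResult]) -> float:
    """Weighted average of per-kind pass rates, scaled to 0-100."""
    score = 0.0
    for kind, weight in WEIGHTS.items():
        result = results[kind]
        score += weight * (result.passed / result.total)
    return round(score * 100, 1)


# Check counts per kind are taken from the page: 43, 11, 9, and 7.
all_pass = {
    "security": KindResult(43, 43),
    "safety": KindResult(11, 11),
    "compliance": KindResult(9, 9),
    "behaviour": KindResult(7, 7),
}
security_fail = dict(all_pass, security=KindResult(0, 43))

print(trust_score(all_pass))       # 100.0
print(trust_score(security_fail))  # 50.0
```

With these assumed weights, failing every security check halves the score even when all other kinds pass, which is one way a "heaviest weight" rule could behave.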
Every listing passes through a structured, multi-stage audit pipeline before it reaches buyers.
Intake and source verification
Source provenance, commit history, attribution, manifest schema validation.
Static analysis and dependency audit
Code scanning for unsafe patterns, CVE detection, typosquatting, pinning hygiene.
Prompt safety and permission fidelity
Prompt abuse detection, exfiltration checks, declared vs observed permission alignment.
Runtime sandbox execution
Containerized execution capturing network, file, and process behavior evidence.
Human analyst review and scoring
Expert review, trust score assignment, badge tier, and risk narrative publication.
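The five stages above can be sketched as an ordered pipeline that collects findings per stage. The stage names come from the page; the `run_pipeline` function, the dictionary-based listing, and the `check_manifest` check are hypothetical placeholders, and the real pipeline (including the human review step) is certainly more involved.

```python
from typing import Callable

# Stage names as listed on the page, in order.
STAGES = [
    "Intake and source verification",
    "Static analysis and dependency audit",
    "Prompt safety and permission fidelity",
    "Runtime sandbox execution",
    "Human analyst review and scoring",
]

# A stage check takes a listing (a plain dict here) and returns finding strings.
StageCheck = Callable[[dict], list[str]]


def run_pipeline(listing: dict, checks: dict[str, StageCheck]) -> dict[str, list[str]]:
    """Run every stage in order and collect its findings."""
    report: dict[str, list[str]] = {}
    for stage in STAGES:
        check = checks.get(stage)
        # Stages without a registered check are flagged rather than skipped silently.
        report[stage] = check(listing) if check else ["stage not implemented"]
    return report


def check_manifest(listing: dict) -> list[str]:
    """Hypothetical intake check: the listing must ship a manifest."""
    return [] if "manifest" in listing else ["missing manifest"]


report = run_pipeline({"name": "demo-role"}, {STAGES[0]: check_manifest})
for stage, findings in report.items():
    print(f"{stage}: {findings}")
```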
Malware Analysis
6 checks. Signature, heuristic, archive, exploit-loader, encoded-payload, and dropper-behaviour scans on packaged agent binaries.
Static Analysis
18 checks. Secrets, API keys, key material, injection, unsafe filesystem and process commands, remote code fetch, privilege escalation, obfuscation, and prompt-level code-audit patterns.
Dependency Analysis
8 checks. Inventory build, known-vulnerable packages, typosquats, unpinned versions, license conflicts, install-script abuse, scope minimisation, and reputation review.
Network Analysis
7 checks. Outbound HTTP, webhooks, allowlist presence, raw sockets, exfiltration keywords, DNS-tunnel indicators, and remote-host scope review.
Behaviour Analysis
6 checks. Sandbox-escape surface, resource abuse, claim-vs-behaviour truthfulness, refusal coverage, escalation coverage, and persona-drift review.
Privacy Compliance
3 checks. PII collection declaration, data-retention policy, and cross-user data-leakage risk.
Supply Chain Analysis
1 check. Third-party dependency audit and upstream provenance review.
Integrity Verification
2 checks. Artefact hash verification (SHA-256) and the aggregate critical-failure flag that rolls up severity across the run.
Semantic Prompt Analysis
8 checks. Hidden instructions, behavioural manipulation, unsafe-automation triggers, cross-prompt consistency, and prompt-injection sub-classes, all scored by an LLM acting as judge.
Content Safety
5 checks. Self-harm encouragement, hate speech, child-safety red flags, weapon-making, and controlled-substance instructions.
Regulatory Compliance
3 checks. GDPR data-subject-rights awareness, age-gate awareness for child-facing roles, and sensitive-domain disclaimer presence.
Behaviour Boundaries
3 checks. Persistent-memory leakage risk, cross-session identity drift, and user-data persistence beyond session.
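The artefact hash verification listed under Integrity Verification can be sketched with Python's standard library. This is a minimal illustration, assuming the expected SHA-256 digest was recorded at audit time; the function names and the temporary file are hypothetical and not part of Trust Agent's tooling.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large artefacts never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artefact(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the recorded one in constant time."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


with tempfile.TemporaryDirectory() as tmp:
    artefact = Path(tmp) / "agent.bin"  # hypothetical packaged artefact
    artefact.write_bytes(b"example artefact bytes")
    recorded = sha256_of(artefact)  # digest recorded at audit time
    ok = verify_artefact(artefact, recorded)

    artefact.write_bytes(b"tampered bytes")  # simulate a post-audit change
    tampered_ok = verify_artefact(artefact, recorded)

print(ok, tampered_ok)  # True False
```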
Trust Agent is designed for UK GDPR compliance. Primary data is stored in the United Kingdom on infrastructure we operate directly — no public cloud provider. We honor all data subject rights including right to access, rectification, and erasure.
Our architecture and operational controls align with SOC 2 Type II trust service criteria. Trust Agent does not currently hold SOC 2 certification but operates to these standards.
Encryption at rest
All data at rest is encrypted using AES-256. Database volumes, backups, and object storage are encrypted by default with managed keys.
Encryption in transit
All traffic is encrypted with TLS 1.3. HSTS is enforced across all endpoints. API traffic, webhooks, and web requests all use HTTPS exclusively.
No message storage
Trust Agent does not store user messages, prompts, or agent conversation content. Audit data captures behavior evidence only, never raw user input.
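A client that talks to an HTTPS API can enforce the TLS 1.3 floor on its own side as well. The sketch below uses Python's standard `ssl` module; it shows one possible client configuration and says nothing about how Trust Agent's servers are actually configured. It assumes the local OpenSSL build supports TLS 1.3.

```python
import ssl


def tls13_only_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything below TLS 1.3."""
    ctx = ssl.create_default_context()  # certificate and hostname checks enabled
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = tls13_only_context()
print(ctx.minimum_version.name, ctx.verify_mode.name, ctx.check_hostname)
```

The resulting context can be passed to standard clients, for example `urllib.request.urlopen(url, context=ctx)`, so that a connection to a server offering only older protocol versions fails during the handshake.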
Creator system prompts and orchestration logic are treated as protected intellectual property. They are never exposed in any API response, audit report, trust badge, or buyer-facing evidence output.
API response isolation
All API endpoints are filtered to exclude system prompt content. Audit reports reference behavior evidence, not raw prompt source.
Seller IP protection
Trust Agent publishes buyer-safe evidence and analyst narratives without exposing raw creator prompts, manifests, or secure role orchestration logic.
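One way to picture API response isolation is a filter that strips protected fields before a record leaves the service. The field names in `PROTECTED_KEYS` and the shape of the example record are assumptions for illustration; the page does not describe Trust Agent's actual schema or filtering mechanism.

```python
from typing import Any

# Hypothetical field names for protected creator IP.
PROTECTED_KEYS = {"system_prompt", "orchestration", "manifest"}


def redact(payload: Any) -> Any:
    """Return a copy of the payload with protected keys removed at every depth."""
    if isinstance(payload, dict):
        return {k: redact(v) for k, v in payload.items() if k not in PROTECTED_KEYS}
    if isinstance(payload, list):
        return [redact(item) for item in payload]
    return payload


record = {
    "role": "demo-role",
    "trust_score": 87,
    "system_prompt": "secret creator prompt",
    "evidence": [
        {"kind": "network", "host": "api.example.com", "manifest": {"raw": "..."}},
    ],
}

safe = redact(record)
print(safe)
```

Filtering recursively matters here because protected content may be nested inside evidence entries, not only at the top level of a response.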
Nine-layer audit stack
Source integrity, manifests, static code analysis, dependency hygiene, prompt safety, permission fidelity, runtime sandboxing, behavior verification, and drift handling.
SOC 2-friendly architecture
Role-based access controls, audit exports, billing trails, protected prompt IP, and company-aware gateway execution. Trust Agent does not claim SOC 2 certification.
Customer-owned execution
Agent logic is packaged, audited, and delivered with protected invocation workflows. Roles and skills execute in the customer environment, not on Trust Agent servers.
Every role is scored 0–100 by the audit pipeline. The score maps to a badge tier shown on marketplace cards and role detail pages so buyers can filter for the trust level their use case demands.
Zero critical findings. Actively maintained. Recommended for regulated-sector deployment (NHS, finance, legal).
No high-severity findings. Minor issues documented and disclosed. Suitable for most professional use cases.
Minor findings acknowledged with a remediation plan. Use with awareness of documented limitations.
Meets the minimum audit bar. All findings disclosed. Low-stakes applications only; not recommended for regulated or safety-critical work.
Below the audit bar. Use only with explicit understanding of the disclosed risks; not for production.
Data residency
United Kingdom. All primary infrastructure, compute, and storage run on servers we operate in the UK — not on a public cloud provider.
Application hosting
Self-hosted. Trust Agent runs under our own process supervisor behind a hardened reverse proxy. Deployments are scripted, auditable, and zero-downtime.
Database
Self-hosted PostgreSQL with automated backups, point-in-time recovery, and encrypted connections. We do not use Neon, AWS RDS, or any third-party DBaaS.
Sandbox evidence
Docker-based jobs capture commands, network requests, and file activity so buyers can see what was actually observed.
Source drift
If indexed source changes after the audit, the verification posture degrades until a new scan and analyst pass complete.
12 additional MCP checks
Endpoint exposure, transport constraints, undeclared tool bridges, external process escalation, and safety envelope validation.
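The comparison between declared permissions and behaviour observed in the sandbox can be sketched as a set difference. The permission string format and the `PermissionDiff` type are hypothetical; the page does not specify how Trust Agent encodes permissions or sandbox evidence.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PermissionDiff:
    undeclared: frozenset[str]  # observed in the sandbox but never declared
    unused: frozenset[str]      # declared but never observed


def diff_permissions(declared: Iterable[str], observed: Iterable[str]) -> PermissionDiff:
    """Compare what a listing declares with what the sandbox recorded."""
    declared_set, observed_set = frozenset(declared), frozenset(observed)
    return PermissionDiff(
        undeclared=observed_set - declared_set,
        unused=declared_set - observed_set,
    )


# Hypothetical permission strings in a "kind:detail" format.
declared = ["net:api.example.com", "fs:read:/data"]
observed = ["net:api.example.com", "net:203.0.113.7", "proc:spawn:curl"]

diff = diff_permissions(declared, observed)
print(sorted(diff.undeclared))  # ['net:203.0.113.7', 'proc:spawn:curl']
print(sorted(diff.unused))      # ['fs:read:/data']
```

Undeclared entries are the interesting ones for an audit, since they represent behaviour the listing did not disclose; unused declarations may only indicate over-broad permission requests.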
GDPR
Compliant. EU data protection regulation.
SOC 2 Type II
Aligned. Trust service criteria alignment.
OWASP Top 10
Aligned. Web application security standard.
ISO 27001
In progress. Information security management.
We take security vulnerabilities seriously and appreciate responsible disclosure from the security community. If you discover a vulnerability, please report it to us so we can address it promptly.
How to report
Email info@trust-agent.ai with a description of the vulnerability, steps to reproduce, and any supporting evidence. We aim to acknowledge reports within 48 hours.
Our commitment
Security contact
info@trust-agent.ai