# Agent Safe - Full Documentation for AI Systems

> A Remote MCP Server with 11 security tools (8 paid + 3 free) that protect AI agents from phishing, social engineering, prompt injection, BEC, deepfakes, AI-generated images, and message-based manipulation. Works with emails, images, videos, and any other message format. Pay-per-use via Skyfire Network at $0.01 per unit.

## Overview

Agent Safe is a message and media safety verification service designed specifically for AI agents. Before an agent processes, replies to, or acts on any message — whether it's an email, chat message, DM, image, video, or any other format — it can call Agent Safe to get a safety analysis. The service uses advanced AI to detect multiple categories of threats across all platforms and returns structured, actionable results. Email-specific tools provide extra analysis like sender reputation verification with live DNS lookups. The media authenticity tool uses 4-layer forensic analysis to detect AI-generated images and deepfakes.

Agents are first-class customers, and Agent Safe requires no human signup. Your agent only needs a Skyfire Buyer API Key: include it in the `skyfire-api-key` header, and Agent Safe automatically generates PAY tokens and charges per unit ($0.01/unit). Alternatively, your agent can generate its own PAY tokens and send them via the `skyfire-pay-id` header.

## MCP Protocol Connection

Agent Safe implements the Model Context Protocol (MCP) using Streamable HTTP transport.

- **Endpoint:** `POST https://agentsafe.locationledger.com/mcp`
- **Transport:** Streamable HTTP (JSON-RPC 2.0)
- **Protocol Version:** 2025-03-26
- **SDK:** `@modelcontextprotocol/sdk`

### MCP Client Configuration

To connect from any MCP-compatible client (Cursor, Windsurf, etc.):

```json
{
  "mcpServers": {
    "agentsafe": {
      "command": "npx",
      "args": [
        "-y", "mcp-remote",
        "https://agentsafe.locationledger.com/mcp",
        "--header",
        "skyfire-api-key: <YOUR_SKYFIRE_BUYER_API_KEY>"
      ]
    }
  }
}
```

Get your Skyfire Buyer API Key at https://skyfire.xyz — your agent uses this key, and Agent Safe handles PAY token generation automatically.

## The 11 Tools (8 Paid + 3 Free)

### Tool 0: assess_message (FREE — No Authentication Required)

Free triage tool that analyzes whatever context you have about a message and recommends which security tools to run. It uses pure logic (no AI call), responds instantly, and costs nothing. Always call this first. It also covers media: if image or video URLs are present, it recommends check_media_authenticity.

#### Input Parameters

All parameters are optional — include whatever you have:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| from | string | No | Sender email address |
| subject | string | No | Message subject line |
| body | string | No | Message body content |
| links | string[] | No | URLs found in the message |
| urls | string[] | No | URLs to check (alternative to links) |
| attachments | object[] | No | Attachment metadata (name, size, mimeType) |
| sender | string | No | Sender identifier for non-email platforms |
| senderDisplayName | string | No | Sender display name |
| platform | string | No | Message platform (sms, whatsapp, slack, discord, telegram, etc.) |
| messages | object[] | No | Thread messages for thread analysis |
| draftTo | string | No | Draft reply recipient |
| draftBody | string | No | Draft reply body |
| media | object[] | No | Media attachments (url, type) — triggers check_media_authenticity recommendation |

#### Output Fields

| Field | Type | Description |
|-------|------|-------------|
| recommendedTools | array | Prioritized list of tools to run, each with name, reason, priority, and estimatedCost |
| skippedTools | array | Tools not recommended, each with name and reason |
| totalEstimatedCost | number | Total cost if all recommended tools are called |
| summary | string | Brief explanation of the triage decision |
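
One way an agent might consume these output fields is to run recommended tools until a spending cap is reached. The sketch below rests on assumptions that the table above does not confirm: that `recommendedTools` arrives already ordered by priority, that `estimatedCost` is a number in USD, and that `priority` is a string such as "high". The function name and the sample data are hypothetical.

```python
def select_tools_within_budget(triage, budget_usd):
    """Pick recommended tools in listed order until the budget is used up.

    Assumes `recommendedTools` is already ordered by priority and that each
    entry's `estimatedCost` is a number in USD; neither is confirmed above.
    """
    chosen = []
    spent_cents = 0
    budget_cents = round(budget_usd * 100)
    for tool in triage.get("recommendedTools", []):
        cost_cents = round(tool["estimatedCost"] * 100)
        if spent_cents + cost_cents <= budget_cents:
            chosen.append(tool["name"])
            spent_cents += cost_cents
    return chosen, spent_cents / 100


# Hypothetical triage result shaped like the output fields above.
triage = {
    "recommendedTools": [
        {"name": "check_email_safety", "reason": "urgent payment request",
         "priority": "high", "estimatedCost": 0.01},
        {"name": "check_url_safety", "reason": "links present",
         "priority": "high", "estimatedCost": 0.01},
        {"name": "check_attachment_safety", "reason": "attachment present",
         "priority": "medium", "estimatedCost": 0.01},
    ],
    "skippedTools": [],
    "totalEstimatedCost": 0.03,
    "summary": "Illustrative data only",
}

tools, spent = select_tools_within_budget(triage, budget_usd=0.02)
```

Costs are tracked in whole cents to avoid floating-point drift when summing many $0.01 charges.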

#### Example MCP Tool Call

```json
{
  "method": "tools/call",
  "params": {
    "name": "assess_message",
    "arguments": {
      "from": "ceo@company-update.com",
      "subject": "Urgent Wire Transfer",
      "body": "Please process this wire transfer immediately...",
      "links": ["https://suspicious-site.com/login"],
      "attachments": [{"name": "invoice.pdf", "size": 50000, "mimeType": "application/pdf"}]
    }
  }
}
```

#### Example REST API Call (No Authentication Required)

```bash
curl -X POST https://agentsafe.locationledger.com/mcp/tools/assess_message \
  -H "Content-Type: application/json" \
  -d '{
    "from": "ceo@company-update.com",
    "subject": "Urgent Wire Transfer",
    "body": "Please process this wire transfer immediately...",
    "links": ["https://suspicious-site.com/login"],
    "attachments": [{"name": "invoice.pdf", "size": 50000, "mimeType": "application/pdf"}]
  }'
```

### Tool 1: check_email_safety (1 unit — $0.01)

Analyzes incoming emails for phishing, social engineering, prompt injection, CEO fraud, financial fraud, data exfiltration, malware indicators, and impersonation. The core email security tool — 8 threat categories. For non-email messages, use check_message_safety.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| from | string | Yes | Sender email address |
| subject | string | Yes | Email subject line |
| body | string | Yes | Email body content |
| links | string[] | No | URLs found in the email |
| attachments | object[] | No | Attachment metadata (name, size, type) |
| knownSender | boolean | No | Whether the sender is known/trusted |
| previousCorrespondence | boolean | No | Whether there has been previous email exchange |

#### Output Fields

| Field | Type | Description |
|-------|------|-------------|
| verdict | string | "safe", "suspicious", or "dangerous" |
| riskScore | number | 0.0 (completely safe) to 1.0 (confirmed malicious) |
| threats | array | List of detected threats, each with type, description, severity, and indicators |
| recommendation | string | "proceed", "proceed_with_caution", or "do_not_act" |
| safeActions | array | Actions the agent can safely take |
| unsafeActions | array | Actions the agent should avoid |
| summary | string | Human-readable summary of the analysis |

#### Example MCP Tool Call

```json
{
  "method": "tools/call",
  "params": {
    "name": "check_email_safety",
    "arguments": {
      "from": "urgent-security@g00gle-support.com",
      "subject": "URGENT: Your account will be suspended",
      "body": "Dear user, your Google account has been compromised. Click here immediately to verify your identity: http://g00gle-secure-login.tk/verify"
    }
  }
}
```

#### Example Response

```json
{
  "verdict": "dangerous",
  "riskScore": 0.95,
  "threats": [
    {
      "type": "phishing",
      "severity": "critical",
      "description": "Spoofed Google domain using character substitution (g00gle)",
      "indicators": ["g00gle-support.com", "g00gle-secure-login.tk"]
    },
    {
      "type": "social_engineering",
      "severity": "high",
      "description": "Creates false urgency with account suspension threat",
      "indicators": ["URGENT", "will be suspended", "immediately"]
    }
  ],
  "recommendation": "do_not_act",
  "safeActions": ["Report as phishing", "Delete the message", "Notify user"],
  "unsafeActions": ["Click any links", "Reply to sender", "Forward message"],
  "summary": "This is a phishing attempt using a spoofed Google domain. The message creates false urgency to trick the recipient into clicking a malicious link."
}
```
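
A response like the one above can drive a simple fail-closed gate in the calling agent. This is a sketch of one possible policy, not behavior defined by Agent Safe: the function name `may_act` and the 0.3 threshold applied to `proceed_with_caution` are illustrative choices.

```python
def may_act(result, caution_threshold=0.3):
    """Decide whether the agent may act on a message.

    The threshold for "proceed_with_caution" is an illustrative choice,
    not a value from Agent Safe.
    """
    recommendation = result.get("recommendation")
    if recommendation == "proceed":
        return True
    if recommendation == "proceed_with_caution":
        # A missing riskScore is treated as maximum risk.
        return result.get("riskScore", 1.0) < caution_threshold
    # "do_not_act" and any unrecognised value fail closed.
    return False


# Trimmed copy of the example response above.
phishing_result = {
    "verdict": "dangerous",
    "riskScore": 0.95,
    "recommendation": "do_not_act",
}

allowed = may_act(phishing_result)
```

Treating unknown or missing recommendations as a refusal keeps the agent safe if the response shape changes or a call fails partway.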

### Tool 2: check_message_safety (1 unit — $0.01)

Platform-aware security analysis for non-email messages — SMS, WhatsApp, Slack, Discord, Telegram, Instagram DMs, Facebook Messenger, LinkedIn, iMessage, Signal, Microsoft Teams, and more. 8 threat categories with platform-specific context.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| platform | string | Yes | Platform name (e.g., "whatsapp", "slack", "sms", "discord") |
| sender | string | Yes | Sender identifier |
| messages | object[] | Yes | Array of messages (min 1, max 50) with body, direction, timestamp |
| media | object[] | No | Media attachments (url, type) |
| senderVerified | boolean | No | Whether platform has verified the sender |
| contactKnown | boolean | No | Whether sender is a known contact |
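
The parameter table above can be enforced client-side before spending a unit on a call. The sketch below is hypothetical: the helper name is invented, the `direction` value "inbound" and the ISO 8601 timestamp are assumed formats, and requiring `body` on every message is an assumption rather than a documented rule.

```python
def build_message_safety_args(platform, sender, messages, **optional):
    """Build the arguments object for check_message_safety.

    Enforces the documented 1 to 50 message limit. Requiring a `body`
    on each message is an assumption, not a documented rule.
    """
    if not 1 <= len(messages) <= 50:
        raise ValueError("messages must contain between 1 and 50 entries")
    for message in messages:
        if "body" not in message:
            raise ValueError("each message needs a body")
    allowed = {"media", "senderVerified", "contactKnown"}
    unknown = set(optional) - allowed
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    args = {"platform": platform, "sender": sender, "messages": list(messages)}
    args.update(optional)
    return args


args = build_message_safety_args(
    "whatsapp",
    "+15550100",
    # The direction value and timestamp format are assumed for illustration.
    [{"body": "Send me the gift card codes now", "direction": "inbound",
      "timestamp": "2025-01-01T00:00:00Z"}],
    contactKnown=False,
)
```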

### Tool 3: check_url_safety (1 unit — $0.01)

Analyzes up to 20 URLs per call for phishing, malware, typosquatting, redirect abuse, command injection, suspicious tracking, and domain spoofing. 7 threat categories.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| urls | string[] | Yes | Array of URLs to analyze (max 20) |
| context | string | No | Where the URLs were found |
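
Because each call accepts at most 20 URLs and costs 1 unit, longer lists have to be split into batches. The following sketch shows one way to do that and to estimate the resulting cost. The helper name `batch_urls` is hypothetical, and removing duplicates first is a design choice rather than a service requirement.

```python
def batch_urls(urls, max_per_call=20):
    """Split URLs into batches that fit the 20-URL limit per call."""
    unique = list(dict.fromkeys(urls))  # drop duplicates, keep first-seen order
    return [unique[i:i + max_per_call] for i in range(0, len(unique), max_per_call)]


urls = [f"https://example.com/page/{n}" for n in range(45)]
batches = batch_urls(urls)

# 1 unit ($0.01) per check_url_safety call, per the pricing in this document.
cost_usd = len(batches) * 1 / 100
```

Each element of `batches` would become the `urls` argument of one check_url_safety call.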

### Tool 4: check_response_safety (1 unit — $0.01)

Scans draft replies BEFORE sending for data leakage, PII exposure, credential disclosure, compliance violations, and compliance with social engineering requests. 5 threat categories.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| draftTo | string | Yes | Draft reply recipient |
| draftSubject | string | Yes | Draft reply subject |
| draftBody | string | Yes | Draft reply body content |
| originalFrom | string | No | Original message sender |
| originalSubject | string | No | Original message subject |
| originalBody | string | No | Original message body |

### Tool 5: analyze_email_thread (1 unit — $0.01)

Analyzes full message threads for escalating social engineering, scope creep, trust exploitation, authority escalation, and deadline manufacturing. 5 threat categories.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| messages | object[] | Yes | Array of messages (min 2, max 50) with from, subject, body, date |
| currentAction | string | No | What the agent is about to do |

### Tool 6: check_attachment_safety (1 unit — $0.01)

Assesses attachment risk before opening or downloading. Detects threats such as executable masquerades, double extensions, macro risks, and MIME mismatches. 6 threat categories.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| attachments | object[] | Yes | Array with name, mimeType, size (max 20) |
| context | string | No | Context about where attachments came from |

### Tool 7: check_sender_reputation (1 unit — $0.01)

Verifies sender identity with live DNS DMARC lookups, RDAP domain age checks, and AI analysis. 6 threat categories + live DNS enrichment at no extra cost.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| email | string | Yes | Sender email address to verify |
| displayName | string | Yes | Display name shown in client |
| replyTo | string | No | Reply-to address if different |
| emailSubject | string | No | Subject for context |
| emailSnippet | string | No | First ~500 chars of email body for context |

### Tool 8: check_media_authenticity (4 units/image — $0.04, 10 units/video — $0.10)

Detects AI-generated images, deepfakes, and manipulated media using a 4-layer forensic analysis system:

- **Layer 1: Metadata Forensics** (weight: 0.10) — EXIF data extraction, software/AI tool detection, GPS/camera info
- **Layer 2: Error Level Analysis** (weight: 0.15) — JPEG recompression artifact detection, manipulation region identification
- **Layer 3: AI Detection Model** (weight: 0.55) — ML-based AI-generated content detection, deepfake analysis
- **Layer 4: Noise Pattern Analysis** (weight: 0.20) — Frequency domain analysis, GAN artifact detection, texture consistency

Video analysis uses the AI Detection Model layer only (single layer, 100% weight).
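
To make the layer weights concrete, the sketch below combines per-layer scores as a weighted sum. The weights come from the list above, but the combination rule itself is an assumption: the document does not state how Agent Safe merges layers. The layer keys, the score orientation (higher meaning more likely AI-generated), and the sample scores are all hypothetical.

```python
# Weights from the four layers listed above; the keys are illustrative names.
LAYER_WEIGHTS = {
    "metadata_forensics": 0.10,
    "error_level_analysis": 0.15,
    "ai_detection_model": 0.55,
    "noise_pattern_analysis": 0.20,
}


def combine_image_scores(scores, weights=LAYER_WEIGHTS):
    """Weighted sum of per-layer scores (assumed combination rule)."""
    return sum(weights[name] * scores[name] for name in weights)


def combine_video_score(scores):
    """Video uses the AI Detection Model layer alone, at full weight."""
    return scores["ai_detection_model"]


# Hypothetical per-layer scores in [0, 1]; higher means more likely AI-generated.
example_scores = {
    "metadata_forensics": 0.2,
    "error_level_analysis": 0.4,
    "ai_detection_model": 0.9,
    "noise_pattern_analysis": 0.6,
}

image_score = combine_image_scores(example_scores)  # 0.02 + 0.06 + 0.495 + 0.12
video_score = combine_video_score(example_scores)
```

With a 0.55 weight, the AI Detection Model contributes more than the other three layers combined, so in this sketch it dominates the image result.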

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| mediaUrl | string | Yes | URL of the image or video to analyze |
| mediaType | string | No | "image" or "video" — auto-detected from URL if omitted |
| context | string | No | Where the media was found (e.g., "email attachment", "social media post") |

#### Output Fields

| Field | Type | Description |
|-------|------|-------------|
| verdict | string | "authentic", "likely_authentic", "inconclusive", "likely_ai_generated", or "ai_generated" |
| confidenceScore | number | 0.0 to 1.0 confidence in the verdict |
| overallAssessment | string | Human-readable summary of the analysis |
| analysisLayers | array | Results from each forensic layer with name, score, weight, and findings |
| threatIndicators | array | Specific indicators of manipulation or AI generation |
| recommendation | string | Recommended action based on the analysis |

### Tool 9: check_prompt_injection_db (FREE — No Authentication Required)

Free tool to query crowdsourced prompt injection sightings collected from live monitoring of AI agent social networks. Filter by injection type, timeframe, and text search. Data is continuously sourced from automated scanning of agent platforms like Moltbook. No charge, no authentication required.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| injectionType | string | No | Filter by type: ignore_instructions, system_override, encoded_payload, social_engineering, data_exfiltration |
| timeframe | string | No | Time window to search (e.g. 24h, 7d, 30d). Default: 30d |
| search | string | No | Text search across payloads, authors, and types |

#### Output Fields

| Field | Type | Description |
|-------|------|-------------|
| results | array | List of matching sightings with injectionType, payloadExcerpt, sourceAuthor, sourcePlatform, confidence, spottedAt |
| summary | string | Human-readable summary of the query results |
| totalInjections | number | Total number of matching sightings |
| totalScanned | number | Total posts scanned to find these sightings |
| typeDescriptions | object | Descriptions of each injection type |
| timeframe | string | The time window that was searched |
| charged | number | Always 0 — completely free |
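
The query parameters can be validated locally before calling the tool. In this sketch the injection types are taken from the input table above, while the timeframe pattern (digits followed by `h` or `d`) is inferred from the `24h`, `7d`, and `30d` examples and may be narrower or wider than what the service accepts. The helper name is hypothetical.

```python
import re

# Injection types listed in the input parameter table.
INJECTION_TYPES = {
    "ignore_instructions",
    "system_override",
    "encoded_payload",
    "social_engineering",
    "data_exfiltration",
}

# Assumed format, inferred from the 24h / 7d / 30d examples.
TIMEFRAME_PATTERN = re.compile(r"^\d+[hd]$")


def build_injection_query(injection_type=None, timeframe=None, search=None):
    """Build arguments for check_prompt_injection_db, omitting unset filters."""
    args = {}
    if injection_type is not None:
        if injection_type not in INJECTION_TYPES:
            raise ValueError(f"unknown injection type: {injection_type}")
        args["injectionType"] = injection_type
    if timeframe is not None:
        if not TIMEFRAME_PATTERN.match(timeframe):
            raise ValueError(f"unexpected timeframe format: {timeframe}")
        args["timeframe"] = timeframe
    if search:
        args["search"] = search
    return args


query = build_injection_query("system_override", "7d", search="ignore previous")
```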

### Tool 10: submit_feedback (FREE — No Authentication Required)

Free tool for agents to rate any analysis and help improve detection quality. After using any paid tool, call submit_feedback to report whether the analysis was helpful, inaccurate, missed a threat, or was a false positive. No charge, no authentication required.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| rating | enum | Yes | Your rating: helpful, not_helpful, inaccurate, missed_threat, or false_positive |
| comment | string | No | Optional details about your experience |
| checkId | string | No | The checkId returned by the tool you're rating |
| toolName | string | No | Which tool you're giving feedback on (e.g. check_email_safety) |
| agentPlatform | string | No | Your agent platform (e.g. claude, cursor, openai) |

#### Output Fields

| Field | Type | Description |
|-------|------|-------------|
| received | boolean | Whether feedback was recorded |
| message | string | Confirmation message |
| charged | number | Always 0 — completely free |

## Pricing

Agent Safe uses a unit-based pricing model at $0.01 per unit:

| Tool | Units | Cost |
|------|-------|------|
| assess_message | 0 | FREE |
| check_prompt_injection_db | 0 | FREE |
| submit_feedback | 0 | FREE |
| check_email_safety | 1 | $0.01 |
| check_message_safety | 1 | $0.01 |
| check_url_safety | 1 | $0.01 |
| check_response_safety | 1 | $0.01 |
| analyze_email_thread | 1 | $0.01 |
| check_attachment_safety | 1 | $0.01 |
| check_sender_reputation | 1 | $0.01 |
| check_media_authenticity (image) | 4 | $0.04 |
| check_media_authenticity (video) | 10 | $0.10 |
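
The pricing table translates directly into a cost estimator. The sketch below copies the unit counts from the table; the dictionary keys for media mirror the table's row labels, and the function name is hypothetical.

```python
# Units per call, copied from the pricing table above.
UNITS_PER_CALL = {
    "assess_message": 0,
    "check_prompt_injection_db": 0,
    "submit_feedback": 0,
    "check_email_safety": 1,
    "check_message_safety": 1,
    "check_url_safety": 1,
    "check_response_safety": 1,
    "analyze_email_thread": 1,
    "check_attachment_safety": 1,
    "check_sender_reputation": 1,
    "check_media_authenticity (image)": 4,
    "check_media_authenticity (video)": 10,
}
UNIT_PRICE_CENTS = 1  # $0.01 per unit


def estimate_cost(calls):
    """Return (units, cost in USD) for a list of planned calls."""
    units = sum(UNITS_PER_CALL[name] for name in calls)
    return units, units * UNIT_PRICE_CENTS / 100


planned = [
    "assess_message",
    "check_email_safety",
    "check_url_safety",
    "check_media_authenticity (image)",
]
units, cost_usd = estimate_cost(planned)
```

Here the plan totals 6 units: the free triage call, two text checks at 1 unit each, and one image analysis at 4 units.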

## REST API (Alternative to MCP)

For agents that don't support MCP, a REST API is also available:

### Service Discovery
```
GET https://agentsafe.locationledger.com/mcp/discover
```

Returns all 11 tools, capabilities, pricing, and connection instructions.

### REST Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /mcp/tools/assess_message | Free triage — recommends which tools to call |
| POST | /mcp/tools/check_email_safety | Email safety check (1 unit) |
| POST | /mcp/tools/check_message_safety | Message safety — any platform (1 unit) |
| POST | /mcp/tools/check_url_safety | URL safety check (1 unit) |
| POST | /mcp/tools/check_response_safety | Response/reply safety check (1 unit) |
| POST | /mcp/tools/analyze_email_thread | Thread analysis (1 unit) |
| POST | /mcp/tools/check_attachment_safety | Attachment safety check (1 unit) |
| POST | /mcp/tools/check_sender_reputation | Sender reputation check (1 unit) |
| POST | /mcp/tools/check_media_authenticity | Media authenticity/deepfake detection (4 units/image, 10 units/video) |
| POST | /mcp/tools/check_prompt_injection_db | Free — query known prompt injection patterns |
| POST | /mcp/tools/submit_feedback | Free — submit feedback on any analysis |

All REST endpoints accept JSON. Paid tools require a `skyfire-api-key` header (Skyfire Buyer API Key). Free tools (assess_message, check_prompt_injection_db, submit_feedback) require no authentication. Legacy `skyfire-pay-id` header with PAY tokens is also supported for paid tools.
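
The authentication rule above (key required for paid tools, nothing for free ones) can be sketched with the Python standard library. The helper name `build_rest_request` is hypothetical, the code builds the request without sending it, and the response format is not assumed here.

```python
import json
import urllib.request

BASE_URL = "https://agentsafe.locationledger.com"
FREE_TOOLS = {"assess_message", "check_prompt_injection_db", "submit_feedback"}


def build_rest_request(tool, payload, api_key=None):
    """Build a POST request for a REST tool endpoint without sending it."""
    if tool not in FREE_TOOLS and api_key is None:
        raise ValueError(f"{tool} is a paid tool and needs a Skyfire Buyer API Key")
    headers = {"Content-Type": "application/json"}
    if tool not in FREE_TOOLS:
        # Free tools need no authentication, so the key is only sent for paid ones.
        headers["skyfire-api-key"] = api_key
    return urllib.request.Request(
        f"{BASE_URL}/mcp/tools/{tool}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_rest_request(
    "check_url_safety",
    {"urls": ["https://example.com"]},
    api_key="<YOUR_SKYFIRE_BUYER_API_KEY>",
)
# Sending is left to the caller, for example urllib.request.urlopen(request).
```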

## Payment

Agent Safe uses the Skyfire Network for pay-per-use billing:

- **Pricing model:** Unit-based at $0.01 USD per unit
- **Text tools:** 1 unit ($0.01) per call
- **Image analysis:** 4 units ($0.04) per image
- **Video analysis:** 10 units ($0.10) per video
- **Free tools:** assess_message, check_prompt_injection_db, and submit_feedback — no charge, no auth needed
- **Payment method:** Skyfire Buyer API Key sent via `skyfire-api-key` HTTP header (recommended). Agent Safe auto-generates PAY tokens server-side.
- **No signup required:** Just get a Skyfire Buyer API Key and start using the service
- **Get started:** Visit https://skyfire.xyz to get your Buyer API Key

## Discovery Endpoints

| URL | Purpose |
|-----|---------|
| https://agentsafe.locationledger.com/mcp | MCP Streamable HTTP endpoint |
| https://agentsafe.locationledger.com/mcp/discover | REST service discovery |
| https://agentsafe.locationledger.com/.well-known/mcp.json | MCP auto-discovery manifest |
| https://agentsafe.locationledger.com/.well-known/ai-plugin.json | AI plugin manifest |
| https://agentsafe.locationledger.com/llms.txt | AI-readable summary |
| https://agentsafe.locationledger.com/llms-full.txt | Full AI-readable documentation (this file) |

## Registry Listings

- **Official MCP Registry:** https://registry.modelcontextprotocol.io/v0.1/servers/io.github.wowcool%2Fagentsafe/versions/1.0.0
- **Smithery:** https://smithery.ai/server/agent-safe-email/agentsafeemail
- **Skyfire Marketplace:** Listed as approved seller
- **GitHub:** https://github.com/wowcool/Agent-Safe-MCP

## Legal

- **Terms of Service:** https://agentsafe.locationledger.com/terms
- **License:** Proprietary - All Rights Reserved
- **Company:** Alibi Ledger, LLC
- **Contact:** support@locationledger.com
- **Website:** https://locationledger.com

## Liability

Agent Safe provides threat analysis to assist AI agent decision-making. Results are advisory and should not be the sole basis for security decisions. The service works with any message format and media — emails get extra tools like sender reputation verification, images and videos get forensic authenticity analysis, and any message can be analyzed for threats. See Terms of Service for full liability limitations.