MCP Security Methodology
AgentScore screens AI agents against observable, publicly available signals. It does not authenticate agent identity. It does not guarantee safety. It provides a structured risk assessment from whatever data is available at the time of the check.
Scope
- Scans published npm packages used in MCP ecosystems.
- Performs static checks on metadata and published source code. Downloads npm tarballs and scans up to 4 MB of source in memory.
- Does not claim runtime exploit detection.
- Does not authenticate agent identity. Callers can claim any deployer, model, or deployment context.
- Abuse database is community-reported and moderated. Not ground truth.
- Scores are screening heuristics, not security guarantees.
Six Screening Checks
Each check answers a specific question about the agent using publicly observable data. All checks explicitly report what they could and could not determine.
Check 1: Deployer
Who built and runs this agent? How established is their public development profile?
GitHub API: account age, public repositories, contribution history, stars, organisation membership.
GitHub accounts can be fabricated, and stars can be purchased. The check covers the public profile only, not private repos, and does not verify legal identity. The self-reported GitHub username is not authenticated.
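As a rough illustration of the deployer signals, the sketch below summarises a GitHub user profile that has already been fetched. The field names `created_at`, `public_repos`, and `followers` come from the public GitHub REST users endpoint. The function name and the shape of the returned summary are assumptions for illustration, not AgentScore's implementation.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def summarise_deployer(profile: Dict[str, Any], now: datetime) -> Dict[str, Any]:
    """Summarise public profile signals (hypothetical helper, not the real check)."""
    # GitHub timestamps end in "Z"; rewrite it so fromisoformat accepts it
    # on Python versions older than 3.11.
    created = datetime.fromisoformat(profile["created_at"].replace("Z", "+00:00"))
    return {
        "account_age_days": (now - created).days,
        "public_repos": profile.get("public_repos", 0),
        "followers": profile.get("followers", 0),
        # The username is self-reported, so nothing here proves who runs the agent.
        "identity_verified": False,
    }


example = summarise_deployer(
    {"created_at": "2020-01-01T00:00:00Z", "public_repos": 12, "followers": 3},
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Passing `now` explicitly keeps the calculation reproducible. An old account with many repositories is still only a weak signal, for the reasons listed above.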
Check 2: Model
What LLM powers this agent? Is it from a known provider?
Self-reported model name and provider. Checked against a list of known providers (Anthropic, OpenAI, Google, Meta, Mistral, etc).
Entirely self-reported. An agent can claim to be running any model. No behavioural fingerprinting is performed in the current version. This check identifies known providers; it does not verify them.
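A minimal sketch of what a known-provider lookup might look like. The allowlist contains only the providers named above (the real list is longer), and the function name and result fields are hypothetical.

```python
from typing import Any, Dict, Optional

# Assumed allowlist: only the providers named in the text above.
KNOWN_PROVIDERS = {"anthropic", "openai", "google", "meta", "mistral"}


def check_model(provider: Optional[str], model: Optional[str]) -> Dict[str, Any]:
    """Classify a self-reported provider name. Nothing here verifies the claim."""
    if not provider:
        return {"known_provider": None, "verified": False, "model": model}
    normalised = provider.strip().lower()
    return {
        "known_provider": normalised in KNOWN_PROVIDERS,
        "verified": False,  # always False: the value is self-reported
        "model": model,
    }
```

For example, `check_model("OpenAI", "some-model")` reports a known provider, but `verified` stays `False` because the caller could have supplied any name.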
Check 3: Code
Is the agent's source code publicly available and auditable?
GitHub API: repository existence, licence, open source status, commit history, maintenance activity. npm registry: dependency count, install scripts, package metadata.
Checks public repository metadata plus best-effort static analysis of the published npm tarball. Does not execute code, inspect runtime behaviour, or guarantee that build outputs match the reviewed source.
Check 4: Abuse
Has this agent or package been reported for malicious behaviour?
AgentScore community abuse database. Reports submitted by users and operators.
Database only contains reports that have been submitted. Absence of reports does not mean absence of abuse. Reports are not independently verified. May contain false positives.
Check 5: Permissions
What tools does this agent request access to? Does the scope match its stated purpose?
MCP tool call request analysis. Tools classified by risk level (LOW, MEDIUM, HIGH, CRITICAL). Anomaly detection compares requested tools against expected patterns for the agent type.
Risk classification is rule-based (tool name prefix matching). Because only the prefix is matched, a tool named 'safe_delete' would be classified differently from 'delete_safe'. Anomaly detection uses simple heuristics, not learned patterns.
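The prefix-matching limitation is easy to see in a small sketch. The prefix table and default level below are invented for illustration, since the actual rule set is not published here.

```python
# Hypothetical prefix table; the first matching prefix wins.
PREFIX_RISK = [
    ("delete", "CRITICAL"),
    ("exec", "CRITICAL"),
    ("write", "HIGH"),
    ("send", "MEDIUM"),
    ("read", "LOW"),
    ("safe", "LOW"),
]
DEFAULT_RISK = "MEDIUM"  # assumed fallback for names that match no prefix


def classify_tool(name: str) -> str:
    """Return a risk level based only on how the tool name starts."""
    lowered = name.lower()
    for prefix, risk in PREFIX_RISK:
        if lowered.startswith(prefix):
            return risk
    return DEFAULT_RISK
```

Under these assumed rules, `classify_tool("delete_safe")` returns `CRITICAL` while `classify_tool("safe_delete")` returns `LOW`, even though both names suggest a delete operation. A rule-based classifier of this kind reads names, not behaviour.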
Check 6: Deployment
How is this agent deployed? Local or remote? Human oversight or autonomous?
Self-reported transport type, origin, human-in-loop flag, orchestration flag, persistence.
Entirely self-reported and cannot be independently verified. A remote autonomous agent can claim to be local with human oversight. This check provides context for risk assessment, not proof of deployment configuration.
Decision Logic
The unified screening endpoint runs all applicable checks in parallel and produces an overall recommendation. The recommendation uses fail-closed logic:
- Any abuse reports: BLOCK
- Permissions risk CRITICAL: BLOCK
- Code risk HIGH or CRITICAL: CAUTION
- Deployment risk HIGH or CRITICAL: CAUTION
- Fewer than 3 checks completed: INSUFFICIENT DATA
- Otherwise: score-based ALLOW or CAUTION
The overall score is an average across completed checks. The recommendation overrides the score when any individual dimension triggers a hard gate. A score of 70 with a CRITICAL permission check still results in BLOCK.
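The gate ordering can be sketched as a single function. This assumes the rules are evaluated top to bottom in the order listed, and it uses an ALLOW cutoff of 70 that is an assumption (no threshold is stated here). All names are hypothetical.

```python
from typing import Dict, Optional

ALLOW_THRESHOLD = 70  # assumed cutoff, not a documented value
MIN_COMPLETED_CHECKS = 3
ELEVATED = ("HIGH", "CRITICAL")


def recommend(
    scores: Dict[str, Optional[float]],  # check name -> score, None if not completed
    risks: Dict[str, str],               # check name -> risk level
    abuse_reports: int,
) -> str:
    """Apply the hard gates first, then fall back to the averaged score."""
    if abuse_reports > 0:
        return "BLOCK"
    if risks.get("permissions") == "CRITICAL":
        return "BLOCK"
    if risks.get("code") in ELEVATED:
        return "CAUTION"
    if risks.get("deployment") in ELEVATED:
        return "CAUTION"
    completed = [s for s in scores.values() if s is not None]
    if len(completed) < MIN_COMPLETED_CHECKS:
        return "INSUFFICIENT DATA"
    overall = sum(completed) / len(completed)
    return "ALLOW" if overall >= ALLOW_THRESHOLD else "CAUTION"
```

With four completed checks that each score 70, this sketch returns `ALLOW` when no gate fires and `BLOCK` once the permissions risk is `CRITICAL`, which mirrors the override described above.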
Package Scanner
The MCP package scanner analyses npm package metadata and published source code for security issues. It checks:
- Install scripts (preinstall, postinstall) with network calls or registry modification
- Suspicious URLs (non-localhost raw IPs, known exfiltration domains)
- Prompt injection patterns in package description and keywords
- Missing metadata (no repository, no licence)
- Excessive runtime dependencies
- Source code patterns including command injection, unsafe eval, hardcoded secrets, and sensitive file access
The scanner performs static analysis only. It downloads published npm tarballs and scans them in memory, but it does not execute code, observe runtime behaviour, or inspect network traffic. Install scripts are flagged based on content patterns, not execution. A clean scan does not mean the package is safe. It means no issues were detected in the metadata or scanned source content.
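To make "flagged based on content patterns, not execution" concrete, here is a small sketch that inspects the install hooks of a parsed `package.json` and looks for raw-IP URLs. The regular expressions and finding labels are illustrative assumptions and are far simpler than a production rule set.

```python
import ipaddress
import re
from typing import Any, Dict, List
from urllib.parse import urlparse

INSTALL_HOOKS = ("preinstall", "postinstall")
# Assumed patterns, for illustration only.
NETWORK_PATTERN = re.compile(r"\b(curl|wget)\b|https?://", re.IGNORECASE)
REGISTRY_PATTERN = re.compile(r"npm\s+config\s+set\s+registry|\.npmrc", re.IGNORECASE)
URL_PATTERN = re.compile(r"https?://[^\s\"'<>|]+")


def is_raw_non_local_ip(url: str) -> bool:
    """True when the URL host is a literal IP address other than loopback."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name, not a raw IP
    return not address.is_loopback


def scan_install_scripts(package_json: Dict[str, Any]) -> List[str]:
    """Flag install hooks by their text; the commands are never executed."""
    findings: List[str] = []
    scripts = package_json.get("scripts") or {}
    for hook in INSTALL_HOOKS:
        command = scripts.get(hook)
        if not isinstance(command, str):
            continue
        if NETWORK_PATTERN.search(command):
            findings.append(f"{hook}: network call")
        if REGISTRY_PATTERN.search(command):
            findings.append(f"{hook}: registry modification")
        if any(is_raw_non_local_ip(u) for u in URL_PATTERN.findall(command)):
            findings.append(f"{hook}: raw IP URL")
    return findings
```

An empty result only means none of these patterns matched. An obfuscated command, such as one that assembles a URL at runtime, would pass this kind of check unnoticed.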
OWASP MCP Top 10 Coverage
The OWASP MCP Top 10 is a public framework defining the most critical security risks for MCP-enabled systems. AgentScore is not affiliated with the OWASP project. The table below is our own coverage map against that framework, published so users can see which risks the scanner addresses and which remain out of scope.
| OWASP Risk | Coverage | How |
|---|---|---|
| MCP01: Token Mismanagement | Covered | Hardcoded secret detection in source, provenance/publisher posture checks |
| MCP02: Privilege Escalation | Partial | Permission analysis via screening checks. Tool-level scope analysis planned. |
| MCP03: Tool Poisoning | Partial | Prompt injection in metadata (15 patterns) + tool extraction with manifest hashing. Direct tool definition analysis being refined. |
| MCP04: Supply Chain Attacks | Covered | Install script detection, dependency monitoring, exposure API, provenance posture |
| MCP05: Command Injection | Covered | Source code analysis for exec/spawn with dynamic input, unsafe eval |
| MCP06: Intent Flow Subversion | Partial | Prompt injection in metadata detected. Runtime intent hijacking out of scope. |
| MCP07: Insufficient Auth | Partial | Publisher posture and trusted publishing checks. Runtime auth assessment planned. |
| MCP08: Lack of Audit | Covered | Adjudication log, scan history, monitoring with change attribution |
| MCP09: Shadow MCP Servers | Partial | Exposure API identifies unmonitored dependencies. Full shadow detection planned. |
| MCP10: Context Injection | Partial | Prompt injection patterns in metadata. Runtime context analysis out of scope. |
Reference: OWASP MCP Top 10. Coverage is based on current scanner version 2.1 and continuous monitoring capabilities.
Benchmark
The scanner is benchmarked against a frozen test corpus of 40 packages: 25 known-clean packages from official sources, 10 suspicious packages, and 5 synthetic malicious patterns.
- False positive rate on high/critical findings (clean packages): 0% (0/25)
- Detection rate on synthetic malicious patterns: 100% (5/5)
- Tested through the real scanner function path, not re-implemented logic
These results are for the current test corpus only. They are not a general precision/recall claim. The corpus is small and the synthetic cases are designed patterns. Real-world detection rates may differ.
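For reference, the two headline figures reduce to simple ratios over the corpus. The sketch below recomputes them from per-package outcomes. The outcome lists are reconstructed from the counts stated above, and the helper name is hypothetical.

```python
from typing import Iterable, Tuple


def rate(flags: Iterable[bool]) -> Tuple[int, int, float]:
    """Return (hits, total, hits / total) for a sequence of boolean outcomes."""
    outcomes = list(flags)
    hits = sum(outcomes)
    total = len(outcomes)
    return hits, total, (hits / total if total else 0.0)


# One entry per package: True means a high/critical finding was raised.
clean_flagged = [False] * 25      # 25 known-clean packages, none flagged
synthetic_detected = [True] * 5   # 5 synthetic malicious patterns, all detected

false_positive = rate(clean_flagged)
detection = rate(synthetic_detected)
```

Both denominators are small, so a single additional miss or false alarm would move these percentages substantially.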
Standards Alignment
AgentScore screening aligns with principles from:
- NIST AI Risk Management Framework: risk identification and measurement
- DSIT AI Assurance Roadmap: third-party assurance for AI systems
- Software Security Code of Practice: supply chain risk identification
AgentScore does not claim compliance with these standards. It aligns with their principles in the specific domain of AI agent screening.