MCP Security Methodology

AgentScore screens AI agents against observable, publicly available signals. It does not authenticate agent identity. It does not guarantee safety. It provides a structured risk assessment from whatever data is available at the time of the check.

Scope

  • Scans published npm packages used in MCP ecosystems.
  • Performs static metadata and published source code checks. Downloads npm tarballs and scans up to 4MB of source in memory.
  • Does not claim runtime exploit detection.
  • Does not authenticate agent identity. Callers can claim any deployer, model, or deployment context.
  • Abuse database is community-reported and moderated. Not ground truth.
  • Scores are screening heuristics, not security guarantees.

Six Screening Checks

Each check answers a specific question about the agent using publicly observable data. All checks explicitly report what they could and could not determine.

Check 1: Deployer

Who built and runs this agent? How established is their public development profile?

Sources

GitHub API: account age, public repositories, contribution history, stars, organisation membership.

Limitations

A GitHub account can be fabricated. Stars can be purchased. The check examines the public profile only, not private repositories. It does not verify legal identity, and the self-reported GitHub username is not authenticated.
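As a concrete illustration, the profile signals above could feed a simple posture heuristic like the sketch below. The field names mirror the public GitHub Users API (`created_at`, `public_repos`); the star aggregation, weights, and caps are illustrative assumptions, not AgentScore's actual scoring.

```typescript
// Hypothetical deployer-posture heuristic. Weights and caps are illustrative.
interface DeployerProfile {
  createdAt: string;    // account creation date (GitHub Users API `created_at`)
  publicRepos: number;  // `public_repos`
  totalStars: number;   // stars summed across public repos (assumed pre-aggregated)
}

function deployerPostureScore(p: DeployerProfile, now: Date = new Date()): number {
  const ageYears =
    (now.getTime() - new Date(p.createdAt).getTime()) / (365.25 * 24 * 3600 * 1000);
  let score = 0;
  score += Math.min(ageYears * 10, 40);     // account age, capped at 40
  score += Math.min(p.publicRepos, 30);     // public activity, capped at 30
  score += Math.min(p.totalStars / 10, 30); // community signal, capped at 30
  return Math.round(score);                 // 0-100 screening heuristic
}
```

A heuristic like this is deliberately cheap to game, which is exactly why the limitations above matter: it measures posture, not identity.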

Check 2: Model

What LLM powers this agent? Is it from a known provider?

Sources

Self-reported model name and provider. Checked against a list of known providers (Anthropic, OpenAI, Google, Meta, Mistral, etc.).

Limitations

Entirely self-reported. An agent can claim to be running any model. No behavioural fingerprinting is performed in the current version. This check identifies known providers; it does not verify them.
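In practice the provider check amounts to a case-insensitive allowlist match on the self-reported string, as in this minimal sketch. The list below mirrors the providers named in this document; the normalisation is an assumption for illustration.

```typescript
// Illustrative allowlist of known providers (lowercased for matching).
const KNOWN_PROVIDERS = new Set(["anthropic", "openai", "google", "meta", "mistral"]);

// Returns true if the self-reported provider string matches the allowlist.
// This identifies a known name; it cannot verify the claim is true.
function isKnownProvider(selfReported: string): boolean {
  return KNOWN_PROVIDERS.has(selfReported.trim().toLowerCase());
}
```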

Check 3: Code

Is the agent's source code publicly available and auditable?

Sources

GitHub API: repository existence, licence, open source status, commit history, maintenance activity. npm registry: dependency count, install scripts, package metadata.

Limitations

Checks public repository metadata plus best-effort static analysis of the published npm tarball. Does not execute code, inspect runtime behaviour, or guarantee that build outputs match the reviewed source.

Check 4: Abuse

Has this agent or package been reported for malicious behaviour?

Sources

AgentScore community abuse database. Reports submitted by users and operators.

Limitations

The database contains only reports that have been submitted. Absence of reports does not mean absence of abuse. Reports are not independently verified and may contain false positives.

Check 5: Permissions

What tools does this agent request access to? Does the scope match its stated purpose?

Sources

MCP tool call request analysis. Tools classified by risk level (LOW, MEDIUM, HIGH, CRITICAL). Anomaly detection compares requested tools against expected patterns for the agent type.

Limitations

Risk classification is rule-based (tool name prefix matching). A tool named 'safe_delete' would be classified differently from 'delete_safe'. Anomaly detection uses simple heuristics, not learned patterns.
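The prefix-matching pitfall described above can be made concrete with a minimal sketch. The prefix table and risk levels here are hypothetical, not AgentScore's actual rule set.

```typescript
type Risk = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

// Hypothetical prefix rules, checked in order. First match wins.
const PREFIX_RULES: [string, Risk][] = [
  ["delete_", "CRITICAL"],
  ["exec_",   "CRITICAL"],
  ["write_",  "HIGH"],
  ["read_",   "MEDIUM"],
];

function classifyTool(name: string): Risk {
  for (const [prefix, risk] of PREFIX_RULES) {
    if (name.startsWith(prefix)) return risk;
  }
  return "LOW"; // unmatched names default to LOW -- the core weakness
}

// classifyTool("delete_safe") -> "CRITICAL" (prefix matches)
// classifyTool("safe_delete") -> "LOW"      (prefix does not match)
```

The trailing comments show the exact asymmetry the limitation describes: a renamed destructive tool falls through to the default.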

Check 6: Deployment

How is this agent deployed? Local or remote? Human oversight or autonomous?

Sources

Self-reported transport type, origin, human-in-loop flag, orchestration flag, persistence.

Limitations

Entirely self-reported and cannot be independently verified. A remote autonomous agent can claim to be local with human oversight. This check provides context for risk assessment, not proof of deployment configuration.

Decision Logic

The unified screening endpoint runs all applicable checks in parallel and produces an overall recommendation. The recommendation uses fail-closed logic:

  • Any abuse reports: BLOCK
  • Permissions risk CRITICAL: BLOCK
  • Code risk HIGH or CRITICAL: CAUTION
  • Deployment risk HIGH or CRITICAL: CAUTION
  • Fewer than 3 checks completed: INSUFFICIENT DATA
  • Otherwise: score-based ALLOW or CAUTION

The overall score is an average across completed checks. Hard gates override the score: when any individual dimension triggers one, the recommendation is downgraded regardless of the average. A score of 70 with a CRITICAL permissions check still results in BLOCK.
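The fail-closed logic above can be sketched as a single decision function. The result shape and the score cutoff for the final ALLOW/CAUTION fallback are assumptions; the gate ordering follows the list above.

```typescript
type Risk = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";
type Recommendation = "ALLOW" | "CAUTION" | "BLOCK" | "INSUFFICIENT_DATA";

interface ScreeningResult {
  abuseReports: number;      // count of community abuse reports
  permissionsRisk?: Risk;    // undefined if the check did not complete
  codeRisk?: Risk;
  deploymentRisk?: Risk;
  completedChecks: number;   // how many of the six checks completed
  score: number;             // average across completed checks, 0-100
}

function recommend(r: ScreeningResult): Recommendation {
  const high = (x?: Risk) => x === "HIGH" || x === "CRITICAL";
  if (r.abuseReports > 0) return "BLOCK";               // any abuse report
  if (r.permissionsRisk === "CRITICAL") return "BLOCK"; // permissions hard gate
  if (high(r.codeRisk) || high(r.deploymentRisk)) return "CAUTION";
  if (r.completedChecks < 3) return "INSUFFICIENT_DATA";
  return r.score >= 60 ? "ALLOW" : "CAUTION";           // illustrative cutoff
}
```

Because gates are evaluated before the score, a high average can never mask a single critical dimension.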

Package Scanner

The MCP package scanner analyses npm package metadata and published source code for security issues. It checks:

  • Install scripts (preinstall, postinstall) with network calls or registry modification
  • Suspicious URLs (non-localhost raw IPs, known exfiltration domains)
  • Prompt injection patterns in package description and keywords
  • Missing metadata (no repository, no licence)
  • Excessive runtime dependencies
  • Source code patterns including command injection, unsafe eval, hardcoded secrets, and sensitive file access

The scanner performs static analysis only. It downloads published npm tarballs and scans them in memory, but it does not execute code, observe runtime behaviour, or inspect network traffic. Install scripts are flagged based on content patterns, not execution. A clean scan does not mean the package is safe. It means no issues were detected in the metadata or scanned source content.
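Two of the metadata checks above, install scripts with network calls and raw non-localhost IPs in URLs, can be sketched as static pattern matches. The regexes are simplified illustrations, not the scanner's actual rules.

```typescript
interface Finding { check: string; detail: string; }

// Flags preinstall/postinstall scripts whose content suggests a network call.
// Content-pattern matching only; nothing is executed.
function checkInstallScripts(scripts: Record<string, string>): Finding[] {
  const findings: Finding[] = [];
  const network = /\b(curl|wget|fetch)\b|https?:\/\//;
  for (const name of ["preinstall", "postinstall"]) {
    const body = scripts[name];
    if (body && network.test(body)) {
      findings.push({ check: "install-script", detail: `${name}: ${body}` });
    }
  }
  return findings;
}

// Flags URLs whose host is a raw IPv4 address other than localhost.
function checkRawIpUrls(text: string): Finding[] {
  const rawIp = /https?:\/\/(\d{1,3}(?:\.\d{1,3}){3})/g;
  const findings: Finding[] = [];
  for (const m of text.matchAll(rawIp)) {
    if (m[1] !== "127.0.0.1") {
      findings.push({ check: "raw-ip-url", detail: m[0] });
    }
  }
  return findings;
}
```

Both functions inspect strings only, which is the point of the paragraph above: a pattern that is rewritten, encoded, or assembled at runtime will not be caught.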

OWASP MCP Top 10 Coverage

The OWASP MCP Top 10 is a public framework defining the most critical security risks for MCP-enabled systems. AgentScore is not affiliated with the OWASP project. The table below is our own coverage map against that framework, published so users can see which risks the scanner addresses and which remain out of scope.

OWASP Risk | Coverage | How
MCP01: Token Mismanagement | Covered | Hardcoded secret detection in source, provenance/publisher posture checks
MCP02: Privilege Escalation | Partial | Permission analysis via screening checks. Tool-level scope analysis planned.
MCP03: Tool Poisoning | Partial | Prompt injection in metadata (15 patterns) + tool extraction with manifest hashing. Direct tool definition analysis being refined.
MCP04: Supply Chain Attacks | Covered | Install script detection, dependency monitoring, exposure API, provenance posture
MCP05: Command Injection | Covered | Source code analysis for exec/spawn with dynamic input, unsafe eval
MCP06: Intent Flow Subversion | Partial | Prompt injection in metadata detected. Runtime intent hijacking out of scope.
MCP07: Insufficient Auth | Partial | Publisher posture and trusted publishing checks. Runtime auth assessment planned.
MCP08: Lack of Audit | Covered | Adjudication log, scan history, monitoring with change attribution
MCP09: Shadow MCP Servers | Partial | Exposure API identifies unmonitored dependencies. Full shadow detection planned.
MCP10: Context Injection | Partial | Prompt injection patterns in metadata. Runtime context analysis out of scope.

Reference: OWASP MCP Top 10. Coverage is based on current scanner version 2.1 and continuous monitoring capabilities.

Benchmark

The scanner is benchmarked against a frozen test corpus of 40 packages: 25 known-clean packages from official sources, 10 suspicious packages, and 5 synthetic malicious patterns.

  • False positive rate on high/critical findings (clean packages): 0%
  • Detection rate on synthetic malicious patterns: 100% (5/5)
  • Tested through the real scanner function path, not re-implemented logic

These results are for the current test corpus only. They are not a general precision/recall claim. The corpus is small and the synthetic cases are designed patterns. Real-world detection rates may differ.

Standards Alignment

AgentScore screening aligns with principles from:

  • NIST AI Risk Management Framework: risk identification and measurement
  • DSIT AI Assurance Roadmap: third-party assurance for AI systems
  • Software Security Code of Practice: supply chain risk identification

AgentScore does not claim compliance with these standards. It aligns with their principles in the specific domain of AI agent screening.