OWASP AI Vulnerability Scoring System (AIVSS) Framework v0.8 vs AI SAFE²/AISM: Framework Assessment 2026

OWASP AI Vulnerability Scoring System (AIVSS) Framework v0.8 vs AI SAFE²/AISM: Framework Assessment 2026

Context: On March 19, 2026, OWASP released the AI Vulnerability Scoring System (AIVSS) v0.8. Within hours, a practitioner posed four foundational questions about its rigor. This report applies those same four evaluative standards of AI systems such as skill definition, execution model, threat model, and mapping logic to both AIVSS v0.8 and the CSI AI SAFE²/AISM framework. The goal is not to dismiss either effort. It is to hold both to the same standards the community deserves.

Executive Summary

OWASP AI Vulnerability Scoring System (AIVSS) Framework v0.8 is a co-publication with AIUC-1, the OWASP AI Exchange, and the OWASP Citizen Development Top 10. It incorporated over 1,900 public comments and introduced agentic “force multipliers” autonomy, tool use, dynamic identity, and persistent memory as scoring amplifiers on a CVSS v4.0 baseline.

A LinkedIn practitioner raised four questions that any defensible security framework must answer before it earns trust: What skill definition is used? What execution model is assumed? What threat model is applied? What mapping logic produced the final categories?

These are not advanced questions. They are first principles.

This report applies those exact criteria to both AIVSS framework v0.8 and the CSI AI SAFE²/AISM framework, scores each, and delivers a graded comparative analysis. The findings: AIVSS v0.8 is structurally important but leaves critical documentation gaps. AI SAFE²/AISM provides explicit, code-level answers to all four questions and newly published resources including a formal Agent Threat and Control Matrix and Compliance Crosswalk close gaps identified in earlier assessments.

If governance is not enforced at runtime, it is not governance. It is forensics.

The Four Evaluative Criteria

A framework that cannot explain these four elements cannot be reproducibly applied, independently audited, or safely challenged.

Criterion What It Requires Why It Is Non-Negotiable
Skill Definition An explicit definition of what “skill” or capability means in the agentic context Without it, the scope of protection is undefined
Execution Model A defined model of how agents execute, decide, and act Risk categories must reflect how threats manifest in running systems
Threat Model The adversary model, attack surfaces, and trust boundaries Without a declared threat model, coverage gaps are invisible
Mapping Logic The methodology that maps real-world risks to framework categories Without it, users cannot challenge, extend, or update the framework

You cannot audit a millisecond with a weekly meeting. The standard for any framework governing autonomous agents must be that it answers these questions on demand not because it was designed to game a checklist, but because it was designed from first principles.

OWASP AIVSS v0.8 – What It Says, What It Doesn’t

What AIVSS v0.8 Gets Right

AIVSS v0.8 is an ambitious undertaking with real structural value. Released March 19, 2026 ahead of RSAC 2026, it extends CVSS v4.0 with agentic risk amplification factors: autonomy level, tool use scope, multi-agent interactions, non-determinism, and capacity for self-modification. These amplifiers produce a contextual score from 0 to 10 that reflects the degree to which an agent’s capabilities amplify a base vulnerability.

The release includes a refined quantitative model, real-world attack scenarios, crosswalks to the OWASP Agentic AI Top 10 for 2026CSA MAESTRO, AIUC-1, and NIST AI RMF, plus empirical Appendix D documenting expert contributor survey findings. Its Distinguished Review Board includes former NSA Cybersecurity Director Rob Joyce, Anthropic Deputy CISO Jason Clinton, NIST’s Apostol Vassilev, and Harvard CISO Michael Tran Duff. Institutional credibility matters.

The AIVSS v0.8 Top 10 Core Risks

  1. Agentic AI Tool Misuse — Insecure invocation, compromised tool usage, output misinterpretation
  2. Agent Access Control Violation — Permission escalation, role inheritance exploitation, temporal permission drift
  3. Agent Data Exfiltration — Memory-based data leakage, cross-system data access via trusted channels
  4. Agent Memory & Context Manipulation — Poisoning of persistent memory and RAG stores
  5. Multi-Agent Orchestration Attacks — Inter-agent communication exploitation, control-flow hijacking
  6. Agent Goal Manipulation — Objective redirection via prompt injection or poisoned external content
  7. Context Amnesia Exploitation — Manipulation of an agent’s context maintenance across sessions
  8. Agent Supply Chain & Dependency Attacks — Compromised tools, plugins, MCP servers, or model providers
  9. Agent Untraceability — Lack of forensic attribution across agent actions
  10. Agent Goal and Instruction Manipulation — Compound redirection combining goal hijack with instruction-layer attacks

Grading AIVSS v0.8 Against the Four Criteria

Skill Definition: Partial (6/10) — The document defines Agentic AI at the system level but not as discrete skills. It implies skill = tool invocation capability without formally defining the skill abstraction layer. No formal skill taxonomy exists. No distinction between atomic skills, composite skills, and delegated skills.

Execution Model: Partial (5/10) — AIVSS references a perception → decision-making → action execution → memory cycle. However, there is no formal execution model diagram, state machine, or declared model of agent planning loops that a threat analyst could use to systematically enumerate attack surfaces. Two practitioners analyzing the same agentic system may produce structurally different assessments. This undermines reproducibility.

Threat Model: Moderate (6/10) — The framework maps indirectly against CSA MAESTRO and the OWASP Agentic Threats document. But it does not explicitly declare: Who is the adversary? What are the trust boundaries? What is outside scope?

Mapping Logic: Moderate (6/10) — Category selection emerges from expert collaboration, 1,900+ public comments, and empirical surveys. The logic is community-driven and empirically informed but not formally derived from an underlying model. A practitioner who disagrees with a category boundary has no formal process to challenge it.

AIVSS v0.8 Summary Grade

Criterion Score Status
Skill Definition 6/10 Partial — system-level definition, no formal skill taxonomy
Execution Model 5/10 Partial — capability attributes described, no formal architecture
Threat Model 6/10 Moderate — imports MAESTRO/NIST indirectly, no explicit adversary profile
Mapping Logic 6/10 Moderate — empirically informed, not formally derived
Overall Rigor 5.75/10 Pre-v1.0; structural gaps are addressable but currently unresolved

Contextual Note: AIVSS v0.8 is explicitly pre-release, with v1.0 review opening April 16, 2026. These gaps are not indictments of bad faith, they are the documented work remaining.

AI SAFE² and AISM — Graded Against the Same Four Criteria

Framework Overview

AI SAFE² (Secure AI Framework for Enterprise Ecosystems) is CSI’s practitioner-oriented framework for agentic AI security, operating across five pillars as a continuous defense loop:

  1. Sanitize & Isolate — Input validation, PII filtering, boundary enforcement at every agent boundary
  2. Audit & Inventory — Live mapping of machine identities, vector stores, agent chains, and credentials
  3. Fail-Safe & Recovery — Circuit breakers, automated credential rotation, kill-switch bots, just-in-time access
  4. Engage & Monitor — Runtime behavioral analytics, anomaly detection, vector DB drift monitoring
  5. Evolve & Educate — Red-team adversarial exercises (A2A, RAG leakage), continuous developer training

AISM (AI Sovereignty Maturity Model) sits above AI SAFE², defining five levels (Chaos → Visibility → Governance → Control → Sovereignty), five pillars (Shield, Ledger, Circuit Breaker, Command Center, Learning Engine), and 128 controls in a Sovereignty Matrix.

Gap Closures Since Initial Assessment

Evaluation of additional published resources reveals that several gaps identified in the initial assessment have been materially addressed:

1. Agent Threat and Control Matrix (Gap Closure: Standalone Threat Model) — The AISM Agent Threat and Control Matrix provides exactly what was previously missing: a formal threat model document with 10 explicit threat categories, specific attack vectors per category, MITRE ATLAS technique cross-references, OWASP LLM vulnerability mappings, and control-to-threat traceability across all five AISM pillars. This is not a theoretical document. It maps adversary techniques to engineering controls with the precision a red team can operationalize.

2. Compliance Crosswalk (Gap Closure: Formal Compliance Mapping) — The AISM Compliance Crosswalk maps AISM controls to NIST AI RMF, ISO 42001, OWASP LLM, and MITRE ATLAS. This directly addresses the enterprise procurement gap that required formal crosswalk tables.

3. Self-Assessment Tool — An AISM Self-Assessment Tool now exists for verifying control implementation against threat categories closing the gap identified when comparing against Microsoft RAI MM’s assessment rubrics.

4. Operational Loop and Control Stack — Formalized documents (operational-loop.mdcontrol-stack.md) further externalize the execution model that was previously embedded in the architecture.

AI SAFE²/AISM Against the Four Criteria

Skill Definition: Strong (8/10) — AISM defines “skill” operationally through the Shield pillar, mandating that every agent boundary receive input validation and that agent capabilities be enumerated in a live inventory. The Developer Implementation Guide includes an InputValidator class enforcing schema-level skill boundary constraints. Remaining gap: A standalone definitions document rather than one embedded in the control architecture.

Execution Model: Strong (8.5/10, upgraded from 8/10) — The core design principle “Probabilistic intelligence requires deterministic control” is itself an execution model thesis. The Operational Defense Loop and the now-published operational-loop.md and control-stack.md documents describe formal execution architecture for governance during agent runtime, with the Control Stack mapping from Policy Layer through the five AISM pillars to Agent Platform and Infrastructure layers. Remaining gap: Internal agent cognition architecture (planner-executor-verifier chains) still requires supplementation.

Threat Model: Strong (9/10, upgraded from 8/10) — The Agent Threat and Control Matrix is the critical upgrade. It provides 10 explicit threat categories (Prompt Injection, Multi-Agent Exploitation, Memory Poisoning, Supply Chain Compromise, Non-Human Identity Abuse, Runaway Autonomy, Data Exfiltration, Model Extraction, Adversarial Inputs, Insider Threats) with specific attack vectors, MITRE ATLAS technique cross-references, and OWASP LLM mappings per category. This is a standalone threat model document that precisely identifies what is missing. Remaining gap: Formal STRIDE/PASTA mapping and trust boundary diagrams would achieve full marks.

Mapping Logic: Strong (8.5/10, upgraded from 8/10) — The Compliance Crosswalk maps AISM pillars and controls to external regulatory framework requirements. Combined with the Sovereignty Matrix’s 128 controls organized by pillar and maturity level, practitioners now have formal mapping structures. Remaining gap: EU AI Act article-level crosswalk still in development.

AI SAFE²/AISM Updated Summary Grade

CriterionOriginal ScoreUpdated ScoreStatus
Skill Definition8/108/10Strong — operational, architecture-embedded; standalone taxonomy doc still needed
Execution Model8/108.5/10Strong — Operational Defense Loop + published control-stack and operational-loop docs
Threat Model8/109/10Very Strong — Agent Threat Control Matrix provides explicit adversary profiles + MITRE/OWASP mappings
Mapping Logic8/108.5/10Strong — Compliance Crosswalk now published; EU AI Act mapping in progress
Overall Rigor8.0/108.5/10Materially strengthened by published threat model and compliance crosswalk

Head-to-Head Comparison 

AI SAFE² Coverage of AIVSS v0.8 Top 10 Risks

The heatmap below maps each AIVSS v0.8 core risk to the AI SAFE² pillars, rating coverage as Full, Partial, or None.

Key Finding: AI SAFE² provides full coverage on 9 of 10 AIVSS v0.8 core risks and partial coverage on 1 (Agent Untraceability, addressed through Audit & Inventory and Engage & Monitor but lacking cryptographic non-repudiation at the current documentation level).

Pillar Coverage Depth Per Risk

AI SAFE2 Pillar Coverage Per AIVSS Risks (5 Pillars)

The Four-Criteria Comparative Scorecard (Updated)

CriterionAIVSS v0.8AI SAFE²/AISM (Updated)Difference
Skill Definition6/108/10AISM +2
Execution Model5/108.5/10AISM +3.5
Threat Model6/109/10AISM +3
Mapping Logic6/108.5/10AISM +2.5
Overall5.75/108.5/10AISM +2.75
AIVSS v0.8 vs AI SAFE2/AISM

Structural Differentiators

DimensionOWASP AIVSS v0.8AI SAFE²/AISM
Primary Artifact TypeVulnerability scoring systemRuntime enforcement framework + maturity model
Threat ApproachEnumeration + severity scoringPrevention + detection + response loop
Developer DeliverableScore calculator, assessment templatesProduction Python patterns, SecureAgent class
Skill DefinitionImplicit (tool invocation = skill)Operational — schema-enforced at boundaries
Execution ModelCapability attributesControl Stack + Operational Defense Loop
Threat ModelImports MAESTRO/NIST indirectlyAgent Threat Control Matrix with MITRE ATLAS/OWASP mappings
Kill Switch MechanismNot specifiedCircuit Breaker pillar — explicit architecture
Sovereignty ModelNot presentFull Sovereignty Matrix — human-AI authority continuum

Bar chart comparing GRC Strength, Actual Security, and Agentic AI Readiness scores across six major AI governance frameworks

AI Governance Compostite Scores (0-10)

Horizontal bar chart of AI Governance Composite Scores showing CSI AISM at 9.1 leading all frameworks.

DimensionAI SAFE²/AISM ScoreEvidence
GRC Strength8.5/10Sovereignty Matrix + Ledger pillar; compliance crosswalk now published
Liberty & Freedom9.5/10MIT + CC-BY-SA; sovereignty-as-architecture; no standards body dependency
Actual Security (Runtime)9.0/10Shield, Circuit Breaker, Command Center — deterministic controls
Engineer Actionability9.0/10Production Python patterns: InputValidator, CircuitBreaker, SecureAgent class
Safety Controls9.0/10Kill switches, rate limiting, recursion limits, HITL approval chains
Agentic AI Readiness9.5/10Memory governance, recursion limits, semantic isolation, multi-agent sovereignty
Composite Score9.1/10Highest among six major frameworks evaluated

Combined Framework Utility — AIVSS + AI SAFE²/AISM

These frameworks are not competitors. They operate at different layers of the AI security stack and complement each other in production.

Policy is just intent. Engineering is reality.

LayerAIVSS v0.8 RoleAI SAFE²/AISM Role
Risk IdentificationEnumerates the 10 core agentic risk categoriesProvides runtime controls that prevent or detect those risks
Risk QuantificationScores severity 0–10 using agentic amplification factorsMaturity level assessment across 128 controls
Risk CommunicationCommon vocabulary for practitioners and auditorsOperational Defense Loop for engineers and architects
Risk MitigationIdentifies what needs to be addressedSpecifies how to address it in production code
Risk GovernanceFramework-level compliance mappingSovereignty Matrix for human-AI authority governance

The optimal practitioner stack for 2026 agentic AI deployment:

  1. Use AIVSS v0.8 to score and prioritize agentic risks in your specific deployment
  2. Use AI SAFE² pillar controls to implement mitigations for each scored risk
  3. Use AISM maturity levels to measure and benchmark governance progress
  4. Use NIST AI RMF for federal/regulatory alignment and GRC documentation
  5. Use CSA AICM for formal compliance mapping across 243 control objectives

The Agentic AI Risk Practitioner Challenge — Was It Fair?

The four questions were fair, foundational, and applicable to every security framework including AI SAFE²/AISM.

AIVSS v0.8 partially answers all four in its documentation but does not surface those answers accessibly. The broader point stands: if the most basic practitioner questions about a framework’s foundations cannot be answered directly and accessibly, that is a communication and documentation failure regardless of the underlying rigor.

AIVSS v0.8 is pre-release by design. The current gaps are the work that community review is designed to address. The April 16, 2026 review period is where the community should engage.

AI SAFE²/AISM earns stronger scores not because its documentation is complete it also has work to do, but because its architectural decisions are the answers. The Control Stack is the execution model. The Operational Defense Loop is the threat response architecture. The Shield pillar enforces the skill boundary. The Sovereignty Matrix is the mapping logic.

Never build an engine you cannot kill. The frameworks the community needs are ones that pass the four-question test on demand. The standard should be nothing less.

Frequently Asked Questions (FAQ): AI Security, OWASP AI Vulnerability Scoring System (AIVSS) v0.8, AI SAFE², AISM,  & AI Risk

1. What is OWASP AI Vulnerability Scoring System (AIVSS) v0.8?

AIVSS v0.8 is the AI Vulnerability Scoring System released by OWASP on March 19, 2026. It extends CVSS v4.0 with agentic risk amplification factors autonomy, tool use, dynamic identity, persistent memory, and self-modification producing a contextual severity score from 0 to 10 for agentic AI vulnerabilities. It also identifies 10 core agentic AI security risks.

2. What is AI SAFE² and how does it differ from AIVSS?

AI SAFE² is CSI’s runtime enforcement framework for agentic AI security, organized across five pillars: Sanitize & Isolate, Audit & Inventory, Fail-Safe & Recovery, Engage & Monitor, and Evolve & Educate. Where AIVSS scores and prioritizes risks, AI SAFE² specifies how to prevent, detect, and respond to those risks in production code. AIVSS tells you what is dangerous. AI SAFE² tells you how to stop it.

3. What is the AI Sovereignty Maturity Model (AISM)?

AISM is the maturity layer that sits above AI SAFE², defining five governance levels (Chaos → Visibility → Governance → Control → Sovereignty), five pillars (Shield, Ledger, Circuit Breaker, Command Center, Learning Engine), and 128 controls in a Sovereignty Matrix. It is the only framework that formally models the continuum from human-controlled to fully autonomous AI operations as a measurable governance construct.

4. What were the four practitioner questions that prompted this analysis of security risks?

A LinkedIn practitioner asked: (1) What skill definition does the framework use? (2) What execution model is assumed? (3) What threat model is applied? (4) What mapping logic produced the final categories? These are foundational questions that any defensible security framework must answer before practitioners can trust or reproduce its outputs.

5. How did OWASP AI Vulnerability Scoring System (AIVSS) Framework v0.8 score on the four criteria?

AIVSS v0.8 scored 5.75/10 overall: Skill Definition 6/10, Execution Model 5/10, Threat Model 6/10, Mapping Logic 6/10. The framework provides partial answers to all four questions but leaves critical documentation gaps. This is consistent with its pre-release status, with v1.0 review opening April 16, 2026.

6. How did AI SAFE²/AISM score on the same criteria?

After incorporating newly published resources (Agent Threat Control Matrix, Compliance Crosswalk, Self-Assessment Tool), AI SAFE²/AISM scored 8.5/10 overall: Skill Definition 8/10, Execution Model 8.5/10, Threat Model 9/10, Mapping Logic 8.5/10. The primary gaps remaining are documentation-level (standalone skill taxonomy, EU AI Act crosswalk) rather than architectural.

7. What new resources changed the AISM scores?

Three published documents materially strengthened the AISM assessment: (1) The Agent Threat and Control Matrix, providing 10 threat categories with explicit attack vectors and MITRE ATLAS/OWASP mappings this raised the Threat Model score from 8 to 9. (2) The Compliance Crosswalk, mapping AISM controls to NIST, ISO 42001, OWASP, and MITRE. (3) The Self-Assessment Tool for verifying control implementation.

8. Does AI SAFE² cover all 10 AIVSS core risks?

AI SAFE² provides full coverage on 9 of 10 AIVSS v0.8 core risks across its five pillars. The one partial coverage is Agent Untraceability, which is addressed through the Audit & Inventory and Engage & Monitor pillars but lacks cryptographic non-repudiation at the current documentation level.

9. What is a kill switch in the context of agentic AI, and why does it matter?

A kill switch is a deterministic mechanism that can halt agent operations immediately when they exceed safe boundaries. AISM’s Circuit Breaker pillar provides this as an architectural requirement including rate limiting, recursion limits, and safe-mode activation. No other major framework (NIST AI RMF, ISO 42001, CSA AICM, Microsoft RAI MM, Google SAIF) defines a kill switch mechanism for agentic AI as an architectural construct. Never build an engine you cannot kill.

10. How does the AI Governance Maturity Model compare six frameworks?

The CSI AI Governance Maturity Model evaluates CSI AISM, Google SAIF, CSA AICM, NIST AI RMF, Microsoft RAI MM, and ISO 42001 across six dimensions: GRC Strength, Liberty & Freedom, Actual Security, Engineer Actionability, Safety Controls, and Agentic AI Readiness. AISM scored 9.1/10 composite the highest among all evaluated frameworks.

11. Can AIVSS and AI SAFE² be used together?

They should be. These frameworks operate at different layers of the AI security stack and complement each other. Use AIVSS to score and prioritize risks. Use AI SAFE² pillar controls to implement mitigations. Use AISM maturity levels to benchmark governance progress. Use NIST AI RMF for regulatory alignment. The optimal stack uses all of them in coordination.

12. What does the Sovereignty Matrix measure?

The Sovereignty Matrix models the continuum from human-controlled to fully autonomous AI operations as a formal governance construct. It treats the question of who has authority the human or the agent as measurable and governable rather than aspirational. It defines five maturity levels with specific control requirements at each level, culminating in cryptographic identity verification, immutable ledgers, and continuous adversarial testing at the Sovereignty level.

13. Is AI SAFE² open source?

Yes. AI SAFE² is dual-licensed MIT + CC-BY-SA. The code (MCP Server scripts, JSON schemas, Python patterns) is MIT-licensed for unrestricted commercial use. The framework methodology and documentation are CC-BY-SA, meaning organizations can fork, modify, and implement it without permission, payment, or membership. The full framework is available on GitHub.

14. What compliance standards does AISM map to?

AISM maps to NIST AI RMF (100%), ISO 42001 (100%), OWASP LLM Top 10 (100%), MITRE ATLAS (98%), MIT AI Risk Repository (100%), Google SAIF (95%), SOC 2 Type II, ISO 27001:2022, NIST CSF, HIPAA, and GDPR. The published Compliance Crosswalk provides formal mapping documentation.

15. What improvements does AIVSS v0.8 need before v1.0?

Four specific deliverables: (1) A formal skill taxonomy distinguishing atomic, composite, and delegated skills. (2) A formal execution model diagram or state machine for agent planning and tool-chaining. (3) An explicit adversary profile and trust boundary specification. (4) A formal derivation methodology for category selection. These should be surfaced at the top of the documentation before any category list.

16. What improvements should AI SAFE²/AISM prioritize?

Three enhancements would close remaining gaps: (1) A standalone formal skill taxonomy document published separately from the control architecture. (2) EU AI Act article-level compliance crosswalk for European enterprise procurement. (3) Formal STRIDE/PASTA threat modeling diagrams and trust boundary specifications to supplement the Agent Threat Control Matrix.

17. Where can I get started with implementing these frameworks?

For AIVSS: Engage with the OWASP community review opening April 16, 2026, and use the scoring calculator with the v0.8 documentation. For AI SAFE²/AISM: Start with the GitHub repository, review the Self-Assessment Tool, and implement the five pillar controls using the production Python patterns in the Developer Implementation Guide. The AI SAFE² Implementation Toolkit provides audit scorecards, governance policy templates, and engineering SOPs.


Cyber Strategy Institute | ai-safe2-framework | March 2026

The question is not whether your AI governance framework exists on paper. The question is whether it executes at runtime. If the agent moves faster than the oversight, the system is ungoverned.

KERNEL-LEVEL DEFENSE 2025 A Buyers Guide