Understanding Detections

Complete reference for everything Rivaro detects: 7 risk domains, 15 risk categories, 60+ detection types, severity levels, lifecycle stages, and data classifications.

Detection Taxonomy

Every detection is classified along four dimensions:

Dimension	What it answers	Values
Risk Domain	What area of risk?	7 domains
Risk Category	What specific risk pattern?	15 categories
Detection Type	What exactly was found?	60+ types
Severity	How serious?	LOW, MEDIUM, HIGH, CRITICAL

Each detection also carries two tags:

Tag	What it answers	Values
Lifecycle	Where in the data flow?	INGRESS, EGRESS, DEPLOYMENT, TRAINING
Data Classification	What kind of data?	PII, PHI, FINANCIAL, CREDENTIALS, INTELLECTUAL_PROPERTY, NONE
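To make the taxonomy concrete, here is a minimal sketch of how a single detection could be modelled as a record with the four dimensions and two tags listed above. The class and field names are hypothetical and chosen for illustration only; they are not taken from a Rivaro SDK or API response, and the example values are an invented pairing.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"
    CRITICAL = "CRITICAL"


class Lifecycle(Enum):
    INGRESS = "INGRESS"
    EGRESS = "EGRESS"
    DEPLOYMENT = "DEPLOYMENT"
    TRAINING = "TRAINING"


@dataclass(frozen=True)
class Detection:
    """Hypothetical record shape; field names are assumptions."""
    risk_domain: str          # one of the 7 domains
    risk_category: str        # one of the 15 categories
    detection_type: str       # one of the 60+ types
    severity: Severity
    lifecycle: Lifecycle
    data_classification: str  # e.g. "PII", "NONE"


# Illustrative values only: an SSN found in a model response.
example = Detection(
    risk_domain="DATA_PROTECTION",
    risk_category="EXTERNAL_DATA_EXFILTRATION",
    detection_type="PII_SSN",
    severity=Severity.HIGH,
    lifecycle=Lifecycle.EGRESS,
    data_classification="PII",
)
```

Using enums for the small closed sets (severity, lifecycle) and plain strings for the larger ones keeps the sketch short; a real integration might model every dimension as an enum.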

Risk Domains

Risk domains are the highest-level grouping — seven areas of AI risk.

DATA_PROTECTION

Data Protection Risk — Unauthorized access, exposure, or movement of sensitive data. This is the most common risk domain. Covers PII, PHI, financial data, credentials, and any sensitive information flowing through AI systems.

SYSTEM_INTEGRITY

System Integrity Risk — Unauthorized modification of infrastructure, configurations, or production systems. Covers agents making unauthorized changes: shell commands, database writes, file modifications, infrastructure misconfigurations.

AUTONOMOUS_ACTION

Autonomous Action Risk — High-impact state transitions executed without appropriate control. Covers AI-initiated financial actions, irreversible decisions (account closures, contract execution), and actions that should require human approval.

IDENTITY_TRUST

Identity & Trust Boundary Risk — Agents operating outside declared roles, scopes, or trust boundaries. Covers role escalation, agents acting outside their intended scope, and unapproved external communications.

ADVERSARIAL

Adversarial Manipulation Risk — External inputs altering agent behavior or bypassing controls. Covers prompt injection, jailbreaks, policy evasion attempts, and any technique designed to manipulate the AI system.

CONTENT_SAFETY

Content Safety Risk — Violations of safety guidelines, content policies, or ethical boundaries. Covers toxic content, hate speech, harmful instructions, bias, and self-harm content — whether generated or ingested.

GOVERNANCE

Governance & Shadow Risk — AI usage occurring outside approved or monitored infrastructure. Covers shadow AI — unregistered models, unapproved tools, and AI activity happening outside the organization's managed infrastructure.

Risk Categories

Each risk domain contains specific risk categories (15 total).

Data Protection

Category	Description	Lifecycle
`EXTERNAL_DATA_EXFILTRATION`	Sensitive data leaving approved boundaries via external tools, APIs, or channels	EGRESS, TRAINING
`SENSITIVE_DATA_BOUNDARY_VIOLATION`	Data accessed outside intended domain, role, or dataset scope	INGRESS, EGRESS
`CROSS_AGENT_DATA_LEAKAGE`	Improper data passing between agents or models without guardrails	INGRESS, EGRESS

System Integrity

Category	Description	Lifecycle
`UNAUTHORIZED_SYSTEM_MODIFICATION`	Changes to production systems, CI/CD, configs, or repos without approval	EGRESS
`PRIVILEGED_TOOL_MISUSE`	High-impact tools (shell, DB write, admin APIs) invoked outside allowed scope	EGRESS
`INFRASTRUCTURE_MISCONFIGURATION`	Cloud or system security misconfigurations detected during discovery	DEPLOYMENT
`OPERATIONAL_ANOMALY`	Rate limits exceeded, resource abuse, behavioral drift, availability threats	INGRESS, EGRESS

Autonomous Action

Category	Description	Lifecycle
`AUTONOMOUS_FINANCIAL_ACTION`	AI-initiated financial movement (payment rails, token transfer, treasury API)	EGRESS
`HIGH_RISK_AUTONOMOUS_DECISION`	Irreversible state transition without human approval (contract, account closure)	EGRESS

Identity & Trust

Category	Description	Lifecycle
`IDENTITY_ROLE_ESCALATION`	Agent acting outside assigned persona, scope, or declared capabilities	INGRESS, EGRESS
`UNAPPROVED_EXTERNAL_COMMUNICATION`	Data sent externally without policy alignment (Slack, email, webhook)	EGRESS

Adversarial

Category	Description	Lifecycle
`PROMPT_INJECTION_EXPLOIT`	Agent behavior altered by malicious or untrusted input (system prompt override, RAG manipulation)	INGRESS
`POLICY_EVASION_ATTEMPT`	Deliberate attempt to bypass controls (encoding, fragmentation, retry loops)	INGRESS, EGRESS

Content Safety

Category	Description	Lifecycle
`AI_SAFETY_VIOLATION`	Toxic content, hate speech, bias, self-harm encouragement, or harmful content	INGRESS, EGRESS

Governance

Category	Description	Lifecycle
`UNREGISTERED_SHADOW_AI`	AI activity outside approved adapters, agents, or infrastructure	DEPLOYMENT
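The category tables above can be transcribed into a lookup that answers two common questions: which domain a category belongs to, and which categories can fire at a given lifecycle stage. The dictionary below copies the tables as written; the variable and function names are hypothetical helpers, not part of any Rivaro library.

```python
# category -> (risk domain, lifecycle stages), transcribed from the tables above
CATEGORIES = {
    "EXTERNAL_DATA_EXFILTRATION": ("DATA_PROTECTION", {"EGRESS", "TRAINING"}),
    "SENSITIVE_DATA_BOUNDARY_VIOLATION": ("DATA_PROTECTION", {"INGRESS", "EGRESS"}),
    "CROSS_AGENT_DATA_LEAKAGE": ("DATA_PROTECTION", {"INGRESS", "EGRESS"}),
    "UNAUTHORIZED_SYSTEM_MODIFICATION": ("SYSTEM_INTEGRITY", {"EGRESS"}),
    "PRIVILEGED_TOOL_MISUSE": ("SYSTEM_INTEGRITY", {"EGRESS"}),
    "INFRASTRUCTURE_MISCONFIGURATION": ("SYSTEM_INTEGRITY", {"DEPLOYMENT"}),
    "OPERATIONAL_ANOMALY": ("SYSTEM_INTEGRITY", {"INGRESS", "EGRESS"}),
    "AUTONOMOUS_FINANCIAL_ACTION": ("AUTONOMOUS_ACTION", {"EGRESS"}),
    "HIGH_RISK_AUTONOMOUS_DECISION": ("AUTONOMOUS_ACTION", {"EGRESS"}),
    "IDENTITY_ROLE_ESCALATION": ("IDENTITY_TRUST", {"INGRESS", "EGRESS"}),
    "UNAPPROVED_EXTERNAL_COMMUNICATION": ("IDENTITY_TRUST", {"EGRESS"}),
    "PROMPT_INJECTION_EXPLOIT": ("ADVERSARIAL", {"INGRESS"}),
    "POLICY_EVASION_ATTEMPT": ("ADVERSARIAL", {"INGRESS", "EGRESS"}),
    "AI_SAFETY_VIOLATION": ("CONTENT_SAFETY", {"INGRESS", "EGRESS"}),
    "UNREGISTERED_SHADOW_AI": ("GOVERNANCE", {"DEPLOYMENT"}),
}


def domain_of(category: str) -> str:
    """Return the risk domain that contains the given category."""
    return CATEGORIES[category][0]


def categories_for_stage(stage: str) -> list[str]:
    """Return, in sorted order, the categories that list the given stage."""
    return sorted(name for name, (_, stages) in CATEGORIES.items() if stage in stages)
```

For example, `categories_for_stage("DEPLOYMENT")` returns only the two discovery-oriented categories, `INFRASTRUCTURE_MISCONFIGURATION` and `UNREGISTERED_SHADOW_AI`.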

Detection Types

Detection types are the most granular level — the specific thing that was found.

PII (Personally Identifiable Information)

Type	What it catches
`PII_EMAIL`	Email addresses
`PII_SSN`	Social Security numbers
`PII_PHONE`	Phone numbers
`PII_ADDRESS`	Physical addresses
`PII_DATE_OF_BIRTH`	Dates of birth
`PII_FULL_NAME`	Full names
`PII_DRIVERS_LICENSE`	Driver's license numbers
`PII_PASSPORT`	Passport numbers
`PII_CREDIT_CARD`	Credit card numbers

PHI (Protected Health Information)

Type	What it catches
`PHI_MEDICAL_RECORD`	Medical record numbers
`PHI_HEALTH_INSURANCE`	Health insurance IDs
`PHI_PRESCRIPTION`	Prescription information
`PHI_DIAGNOSIS`	Medical diagnoses
`PHI_TREATMENT`	Treatment details

Financial

Type	What it catches
`FINANCIAL_BANK_ACCOUNT`	Bank account numbers
`FINANCIAL_ROUTING`	Routing numbers
`FINANCIAL_INVESTMENT`	Investment account details
`FINANCIAL_TAX_ID`	Tax identification numbers

Credentials & Secrets

Type	What it catches
`CREDENTIALS_API_KEY`	API keys
`CREDENTIALS_PASSWORD`	Passwords
`CREDENTIALS_TOKEN`	Authentication tokens
`CREDENTIALS_SSH_KEY`	SSH keys
`CREDENTIALS_AWS_KEY`	AWS access keys

Intellectual Property

Type	What it catches
`IP_TRADEMARK`	Trademark content
`IP_COPYRIGHT`	Copyrighted material
`IP_PATENT`	Patent information
`IP_TRADE_SECRET`	Trade secrets

Security

Type	What it catches
`SECURITY_PROMPT_INJECTION`	Prompt injection attacks
`SECURITY_JAILBREAK`	Jailbreak attempts
`SECURITY_DATA_EXFILTRATION`	Data exfiltration patterns
`SECURITY_DLP_BYPASS`	DLP bypass attempts
`SECURITY_AUTHENTICATION_BYPASS`	Authentication bypass
`SECURITY_RESOURCE_ABUSE`	Resource abuse patterns
`SECURITY_TOXIC_CONTENT`	Toxic or harmful content
`SECURITY_HARMFUL_INSTRUCTIONS`	Instructions for harmful activities
`SECURITY_SYSTEM_INPUT_LEAKAGE`	System prompt leakage

Agent Tool Use

Type	What it catches
`AGENT_TOOL_CALL`	Generic tool invocation
`AGENT_TOOL_SHELL_EXEC`	Shell command execution
`AGENT_TOOL_FILE_WRITE`	File write operations
`AGENT_TOOL_FILE_DELETE`	File deletion
`AGENT_TOOL_FILE_READ`	File read operations
`AGENT_TOOL_FILE_LIST`	Directory listing
`AGENT_TOOL_DATABASE_WRITE`	Database write operations
`AGENT_TOOL_DATABASE_QUERY`	Database queries
`AGENT_TOOL_CODE_EDIT`	Code modifications
`AGENT_TOOL_CODE_SEARCH`	Code search operations
`AGENT_TOOL_WEB_FETCH`	HTTP fetch operations
`AGENT_TOOL_WEB_SEARCH`	Web search
`AGENT_TOOL_WEB_BROWSE`	Web browsing
`AGENT_TOOL_MCP`	MCP tool invocations

Behavioral

Type	What it catches
`BEHAVIORAL_LATENCY_DRIFT`	Unusual latency patterns
`BEHAVIORAL_TOKEN_DRIFT`	Unusual token usage patterns
`BEHAVIORAL_RESPONSE_LENGTH_DRIFT`	Response length anomalies
`BEHAVIORAL_RATE_LIMIT_EXCEEDED`	Rate limit violations
`BEHAVIORAL_VOLUME_SPIKE`	Unusual request volume
`BEHAVIORAL_TOOL_ABUSE`	Repeated or escalating tool use
`BEHAVIORAL_ATTACK_CHAIN`	Coordinated attack patterns
`BEHAVIORAL_NOVEL_ATTACK_PATTERN`	Previously unseen attack patterns

System

Type	What it catches
`SYSTEM_RATE_LIMIT_EXCEEDED`	System rate limit hit
`SYSTEM_CONCURRENT_LIMIT_EXCEEDED`	Concurrent request limit
`SYSTEM_PAYLOAD_SIZE_EXCEEDED`	Payload too large
`SYSTEM_ACTOR_QUARANTINED_ACCESS`	Quarantined actor attempting access
`SYSTEM_ACTOR_TERMINATED_ACCESS`	Terminated actor attempting access
`SYSTEM_INVALID_KEY`	Invalid detection key used
`SYSTEM_ORG_SUSPENDED`	Suspended organization attempting access

Severity Levels

Level	Meaning	Examples
LOW	Minor finding, informational	Email address detected, file read operation
MEDIUM	Notable finding, may require attention	Phone number detected, database query
HIGH	Significant risk, likely needs action	SSN detected, shell execution, prompt injection
CRITICAL	Severe risk, immediate action needed	Credential exfiltration, jailbreak, attack chain
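Because the four levels are ordered, a consumer of detections will often want to filter by a minimum severity. The sketch below assumes findings arrive as dictionaries with a `severity` string; that shape, the numeric ranks, and the function name are assumptions made for illustration, not a documented Rivaro format.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Numeric ranks are an assumption; only the ordering comes from the table.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def at_or_above(findings: list[dict], threshold: Severity) -> list[dict]:
    """Keep findings whose severity is at least the threshold."""
    return [f for f in findings if Severity[f["severity"]] >= threshold]


# Severities here follow the examples column of the table above.
findings = [
    {"type": "PII_EMAIL", "severity": "LOW"},
    {"type": "PII_SSN", "severity": "HIGH"},
    {"type": "BEHAVIORAL_ATTACK_CHAIN", "severity": "CRITICAL"},
]
urgent = at_or_above(findings, Severity.HIGH)
```

`IntEnum` is used so that levels compare with `>=` directly instead of through a separate rank table.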

Lifecycle Stages

Stage	When	What's scanned
INGRESS	Before the request is sent to the AI provider	User/agent prompts, input content
EGRESS	After the AI provider responds	AI responses, tool call results
DEPLOYMENT	During infrastructure scanning	Cloud configs, model registrations, shadow AI
TRAINING	During training data pipelines	Training data, fine-tuning inputs

Most enforcement happens at INGRESS (block bad inputs) and EGRESS (catch sensitive data in outputs). DEPLOYMENT and TRAINING are primarily for discovery and compliance scanning.
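The INGRESS and EGRESS stages can be pictured as two checkpoints around a provider call. The following is a rough sketch under stated assumptions: `scan` and `provider` are hypothetical callables supplied by the caller, and blocking on any finding is a simplification, since the actual outcome depends on the configured policy.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signatures, not a Rivaro API:
Scan = Callable[[str, str], List[str]]   # (content, stage) -> detection type names
Provider = Callable[[str], str]          # prompt -> model response


def guarded_call(
    prompt: str, provider: Provider, scan: Scan
) -> Tuple[Optional[str], List[Tuple[str, List[str]]]]:
    """Scan the prompt at INGRESS, call the provider, then scan at EGRESS."""
    events = []

    ingress = scan(prompt, "INGRESS")
    events.append(("INGRESS", ingress))
    if ingress:
        # Assumption: any finding blocks; the provider is never called.
        return None, events

    response = provider(prompt)
    egress = scan(response, "EGRESS")
    events.append(("EGRESS", egress))
    return (None if egress else response), events


# Toy stand-ins so the sketch runs end to end.
def toy_scan(content: str, stage: str) -> List[str]:
    return ["PII_SSN"] if "123-45-6789" in content else []


def echo_provider(prompt: str) -> str:
    return "echo: " + prompt
```

With these stand-ins, a clean prompt passes both checkpoints, while a prompt containing the sample SSN is stopped at INGRESS and produces no EGRESS event.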

Data Classifications

Classification	Description	Sensitive
PII	Personally Identifiable Information — SSN, email, phone, name, DOB	Yes
PHI	Protected Health Information — medical records, insurance, prescriptions	Yes
FINANCIAL	Financial Data — bank accounts, routing numbers, tax IDs	Yes
CREDENTIALS	Credentials & Secrets — API keys, passwords, SSH keys, tokens	Yes
INTELLECTUAL_PROPERTY	Intellectual Property — trademarks, copyrights, patents, trade secrets	Yes
NONE	Not data-classified — prompt injection, tool calls, infrastructure findings	No
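The detection type names listed earlier share prefixes that line up with these classifications, so a classification can be approximated from the type name alone. This is a sketch based on that naming pattern; whether Rivaro derives the classification this way is an assumption, and edge cases such as `PII_CREDIT_CARD` may be classified differently in practice.

```python
# Prefix -> classification, inferred from the naming pattern (an assumption).
PREFIX_TO_CLASSIFICATION = {
    "PII_": "PII",
    "PHI_": "PHI",
    "FINANCIAL_": "FINANCIAL",
    "CREDENTIALS_": "CREDENTIALS",
    "IP_": "INTELLECTUAL_PROPERTY",
}


def classify(detection_type: str) -> str:
    """Guess the data classification from the detection type's prefix."""
    for prefix, classification in PREFIX_TO_CLASSIFICATION.items():
        if detection_type.startswith(prefix):
            return classification
    # Security, agent tool, behavioral, and system types carry no data class.
    return "NONE"


def is_sensitive(detection_type: str) -> bool:
    """Every classification except NONE is marked sensitive in the table."""
    return classify(detection_type) != "NONE"
```

For instance, `classify("IP_PATENT")` yields `INTELLECTUAL_PROPERTY`, while `classify("SECURITY_JAILBREAK")` falls through to `NONE`.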

Next steps

Enforcement & Policies — Configure what happens when detections occur
Error Handling — How enforcement appears to developers
Configuration Guide — Set up detection and policy rules
