Understanding Detections

Complete reference for everything Rivaro detects: 7 risk domains, 15 risk categories, 60+ detection types, severity levels, lifecycle stages, and data classifications.

Detection Taxonomy

Every detection is classified along four dimensions:

| Dimension | What it answers | Values |
| --- | --- | --- |
| Risk Domain | What area of risk? | 7 domains |
| Risk Category | What specific risk pattern? | 15 categories |
| Detection Type | What exactly was found? | 60+ types |
| Severity | How serious? | LOW, MEDIUM, HIGH, CRITICAL |

Additionally, detections are tagged with:

| Dimension | What it answers | Values |
| --- | --- | --- |
| Lifecycle | Where in the data flow? | INGRESS, EGRESS, DEPLOYMENT, TRAINING |
| Data Classification | What kind of data? | PII, PHI, FINANCIAL, CREDENTIALS, INTELLECTUAL_PROPERTY, NONE |
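To make the six dimensions concrete, here is a minimal Python sketch of how a single detection could be modeled. The `Detection` class, its field names, and the example values are assumptions made for illustration; they are not Rivaro's actual schema or API.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"
    CRITICAL = "CRITICAL"


class Lifecycle(Enum):
    INGRESS = "INGRESS"
    EGRESS = "EGRESS"
    DEPLOYMENT = "DEPLOYMENT"
    TRAINING = "TRAINING"


@dataclass(frozen=True)
class Detection:
    """Hypothetical record: four classification dimensions plus two tags."""
    risk_domain: str          # one of the 7 domains
    risk_category: str        # one of the 15 categories
    detection_type: str       # one of the 60+ types
    severity: Severity
    lifecycle: Lifecycle      # tag: where in the data flow
    data_classification: str  # tag: what kind of data


# Illustrative combination only; the real pairing of values is product-defined.
example = Detection(
    risk_domain="DATA_PROTECTION",
    risk_category="EXTERNAL_DATA_EXFILTRATION",
    detection_type="PII_SSN",
    severity=Severity.HIGH,
    lifecycle=Lifecycle.EGRESS,
    data_classification="PII",
)
```

Using enums for the closed value sets (severity, lifecycle) rejects misspelled values at construction time, while the larger vocabularies stay as plain strings to keep the sketch short.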

Risk Domains

Risk domains are the highest-level grouping — seven areas of AI risk.

DATA_PROTECTION

Data Protection Risk — Unauthorized access, exposure, or movement of sensitive data. This is the most common risk domain. Covers PII, PHI, financial data, credentials, and any sensitive information flowing through AI systems.

SYSTEM_INTEGRITY

System Integrity Risk — Unauthorized modification of infrastructure, configurations, or production systems. Covers agents making unauthorized changes: shell commands, database writes, file modifications, infrastructure misconfigurations.

AUTONOMOUS_ACTION

Autonomous Action Risk — High-impact state transitions executed without appropriate control. Covers AI-initiated financial actions, irreversible decisions (account closures, contract execution), and actions that should require human approval.

IDENTITY_TRUST

Identity & Trust Boundary Risk — Agents operating outside declared roles, scopes, or trust boundaries. Covers role escalation, agents acting outside their intended scope, and unapproved external communications.

ADVERSARIAL

Adversarial Manipulation Risk — External inputs altering agent behavior or bypassing controls. Covers prompt injection, jailbreaks, policy evasion attempts, and any technique designed to manipulate the AI system.

CONTENT_SAFETY

Content Safety Risk — Violations of safety guidelines, content policies, or ethical boundaries. Covers toxic content, hate speech, harmful instructions, bias, and self-harm content — whether generated or ingested.

GOVERNANCE

Governance & Shadow Risk — AI usage occurring outside approved or monitored infrastructure. Covers shadow AI — unregistered models, unapproved tools, and AI activity happening outside the organization's managed infrastructure.

Risk Categories

Each risk domain contains specific risk categories (15 total).

Data Protection

| Category | Description | Lifecycle |
| --- | --- | --- |
| EXTERNAL_DATA_EXFILTRATION | Sensitive data leaving approved boundaries via external tools, APIs, or channels | EGRESS, TRAINING |
| SENSITIVE_DATA_BOUNDARY_VIOLATION | Data accessed outside intended domain, role, or dataset scope | INGRESS, EGRESS |
| CROSS_AGENT_DATA_LEAKAGE | Improper data passing between agents or models without guardrails | INGRESS, EGRESS |

System Integrity

| Category | Description | Lifecycle |
| --- | --- | --- |
| UNAUTHORIZED_SYSTEM_MODIFICATION | Changes to production systems, CI/CD, configs, or repos without approval | EGRESS |
| PRIVILEGED_TOOL_MISUSE | High-impact tools (shell, DB write, admin APIs) invoked outside allowed scope | EGRESS |
| INFRASTRUCTURE_MISCONFIGURATION | Cloud or system security misconfigurations detected during discovery | DEPLOYMENT |
| OPERATIONAL_ANOMALY | Rate limits exceeded, resource abuse, behavioral drift, availability threats | INGRESS, EGRESS |

Autonomous Action

| Category | Description | Lifecycle |
| --- | --- | --- |
| AUTONOMOUS_FINANCIAL_ACTION | AI-initiated financial movement (payment rails, token transfer, treasury API) | EGRESS |
| HIGH_RISK_AUTONOMOUS_DECISION | Irreversible state transition without human approval (contract, account closure) | EGRESS |

Identity & Trust

| Category | Description | Lifecycle |
| --- | --- | --- |
| IDENTITY_ROLE_ESCALATION | Agent acting outside assigned persona, scope, or declared capabilities | INGRESS, EGRESS |
| UNAPPROVED_EXTERNAL_COMMUNICATION | Data sent externally without policy alignment (Slack, email, webhook) | EGRESS |

Adversarial

| Category | Description | Lifecycle |
| --- | --- | --- |
| PROMPT_INJECTION_EXPLOIT | Agent behavior altered by malicious or untrusted input (system prompt override, RAG manipulation) | INGRESS |
| POLICY_EVASION_ATTEMPT | Deliberate attempt to bypass controls (encoding, fragmentation, retry loops) | INGRESS, EGRESS |

Content Safety

| Category | Description | Lifecycle |
| --- | --- | --- |
| AI_SAFETY_VIOLATION | Toxic content, hate speech, bias, self-harm encouragement, or harmful content | INGRESS, EGRESS |

Governance

| Category | Description | Lifecycle |
| --- | --- | --- |
| UNREGISTERED_SHADOW_AI | AI activity outside approved adapters, agents, or infrastructure | DEPLOYMENT |
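The category tables above can be captured as a single lookup structure. The Python sketch below transcribes the category, domain, and lifecycle values from this page; the `CATEGORIES` dictionary and the two helper functions are hypothetical names introduced for illustration, not part of any Rivaro SDK.

```python
# category -> (risk domain, lifecycle stages), transcribed from the tables above.
CATEGORIES = {
    "EXTERNAL_DATA_EXFILTRATION": ("DATA_PROTECTION", ("EGRESS", "TRAINING")),
    "SENSITIVE_DATA_BOUNDARY_VIOLATION": ("DATA_PROTECTION", ("INGRESS", "EGRESS")),
    "CROSS_AGENT_DATA_LEAKAGE": ("DATA_PROTECTION", ("INGRESS", "EGRESS")),
    "UNAUTHORIZED_SYSTEM_MODIFICATION": ("SYSTEM_INTEGRITY", ("EGRESS",)),
    "PRIVILEGED_TOOL_MISUSE": ("SYSTEM_INTEGRITY", ("EGRESS",)),
    "INFRASTRUCTURE_MISCONFIGURATION": ("SYSTEM_INTEGRITY", ("DEPLOYMENT",)),
    "OPERATIONAL_ANOMALY": ("SYSTEM_INTEGRITY", ("INGRESS", "EGRESS")),
    "AUTONOMOUS_FINANCIAL_ACTION": ("AUTONOMOUS_ACTION", ("EGRESS",)),
    "HIGH_RISK_AUTONOMOUS_DECISION": ("AUTONOMOUS_ACTION", ("EGRESS",)),
    "IDENTITY_ROLE_ESCALATION": ("IDENTITY_TRUST", ("INGRESS", "EGRESS")),
    "UNAPPROVED_EXTERNAL_COMMUNICATION": ("IDENTITY_TRUST", ("EGRESS",)),
    "PROMPT_INJECTION_EXPLOIT": ("ADVERSARIAL", ("INGRESS",)),
    "POLICY_EVASION_ATTEMPT": ("ADVERSARIAL", ("INGRESS", "EGRESS")),
    "AI_SAFETY_VIOLATION": ("CONTENT_SAFETY", ("INGRESS", "EGRESS")),
    "UNREGISTERED_SHADOW_AI": ("GOVERNANCE", ("DEPLOYMENT",)),
}


def domain_of(category: str) -> str:
    """Return the risk domain that contains the given category."""
    return CATEGORIES[category][0]


def categories_for_stage(stage: str) -> list[str]:
    """Return the categories that can fire at a lifecycle stage, sorted by name."""
    return sorted(name for name, (_, stages) in CATEGORIES.items() if stage in stages)
```

For example, `categories_for_stage("DEPLOYMENT")` returns the two categories that apply during infrastructure scanning, and `domain_of("PRIVILEGED_TOOL_MISUSE")` returns `"SYSTEM_INTEGRITY"`.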

Detection Types

Detection types are the most granular level — the specific thing that was found.

PII (Personally Identifiable Information)

| Type | What it catches |
| --- | --- |
| PII_EMAIL | Email addresses |
| PII_SSN | Social Security numbers |
| PII_PHONE | Phone numbers |
| PII_ADDRESS | Physical addresses |
| PII_DATE_OF_BIRTH | Dates of birth |
| PII_FULL_NAME | Full names |
| PII_DRIVERS_LICENSE | Driver's license numbers |
| PII_PASSPORT | Passport numbers |
| PII_CREDIT_CARD | Credit card numbers |

PHI (Protected Health Information)

| Type | What it catches |
| --- | --- |
| PHI_MEDICAL_RECORD | Medical record numbers |
| PHI_HEALTH_INSURANCE | Health insurance IDs |
| PHI_PRESCRIPTION | Prescription information |
| PHI_DIAGNOSIS | Medical diagnoses |
| PHI_TREATMENT | Treatment details |

Financial

| Type | What it catches |
| --- | --- |
| FINANCIAL_BANK_ACCOUNT | Bank account numbers |
| FINANCIAL_ROUTING | Routing numbers |
| FINANCIAL_INVESTMENT | Investment account details |
| FINANCIAL_TAX_ID | Tax identification numbers |

Credentials & Secrets

| Type | What it catches |
| --- | --- |
| CREDENTIALS_API_KEY | API keys |
| CREDENTIALS_PASSWORD | Passwords |
| CREDENTIALS_TOKEN | Authentication tokens |
| CREDENTIALS_SSH_KEY | SSH keys |
| CREDENTIALS_AWS_KEY | AWS access keys |

Intellectual Property

| Type | What it catches |
| --- | --- |
| IP_TRADEMARK | Trademark content |
| IP_COPYRIGHT | Copyrighted material |
| IP_PATENT | Patent information |
| IP_TRADE_SECRET | Trade secrets |

Security

| Type | What it catches |
| --- | --- |
| SECURITY_PROMPT_INJECTION | Prompt injection attacks |
| SECURITY_JAILBREAK | Jailbreak attempts |
| SECURITY_DATA_EXFILTRATION | Data exfiltration patterns |
| SECURITY_DLP_BYPASS | DLP bypass attempts |
| SECURITY_AUTHENTICATION_BYPASS | Authentication bypass |
| SECURITY_RESOURCE_ABUSE | Resource abuse patterns |
| SECURITY_TOXIC_CONTENT | Toxic or harmful content |
| SECURITY_HARMFUL_INSTRUCTIONS | Instructions for harmful activities |
| SECURITY_SYSTEM_INPUT_LEAKAGE | System prompt leakage |

Agent Tool Use

| Type | What it catches |
| --- | --- |
| AGENT_TOOL_CALL | Generic tool invocation |
| AGENT_TOOL_SHELL_EXEC | Shell command execution |
| AGENT_TOOL_FILE_WRITE | File write operations |
| AGENT_TOOL_FILE_DELETE | File deletion |
| AGENT_TOOL_FILE_READ | File read operations |
| AGENT_TOOL_FILE_LIST | Directory listing |
| AGENT_TOOL_DATABASE_WRITE | Database write operations |
| AGENT_TOOL_DATABASE_QUERY | Database queries |
| AGENT_TOOL_CODE_EDIT | Code modifications |
| AGENT_TOOL_CODE_SEARCH | Code search operations |
| AGENT_TOOL_WEB_FETCH | HTTP fetch operations |
| AGENT_TOOL_WEB_SEARCH | Web search |
| AGENT_TOOL_WEB_BROWSE | Web browsing |
| AGENT_TOOL_MCP | MCP tool invocations |

Behavioral

| Type | What it catches |
| --- | --- |
| BEHAVIORAL_LATENCY_DRIFT | Unusual latency patterns |
| BEHAVIORAL_TOKEN_DRIFT | Unusual token usage patterns |
| BEHAVIORAL_RESPONSE_LENGTH_DRIFT | Response length anomalies |
| BEHAVIORAL_RATE_LIMIT_EXCEEDED | Rate limit violations |
| BEHAVIORAL_VOLUME_SPIKE | Unusual request volume |
| BEHAVIORAL_TOOL_ABUSE | Repeated or escalating tool use |
| BEHAVIORAL_ATTACK_CHAIN | Coordinated attack patterns |
| BEHAVIORAL_NOVEL_ATTACK_PATTERN | Previously unseen attack patterns |

System

| Type | What it catches |
| --- | --- |
| SYSTEM_RATE_LIMIT_EXCEEDED | System rate limit hit |
| SYSTEM_CONCURRENT_LIMIT_EXCEEDED | Concurrent request limit |
| SYSTEM_PAYLOAD_SIZE_EXCEEDED | Payload too large |
| SYSTEM_ACTOR_QUARANTINED_ACCESS | Quarantined actor attempting access |
| SYSTEM_ACTOR_TERMINATED_ACCESS | Terminated actor attempting access |
| SYSTEM_INVALID_KEY | Invalid detection key used |
| SYSTEM_ORG_SUSPENDED | Suspended org attempting access |

Severity Levels

| Level | Meaning | Examples |
| --- | --- | --- |
| LOW | Minor finding, informational | Email address detected, file read operation |
| MEDIUM | Notable finding, may require attention | Phone number detected, database query |
| HIGH | Significant risk, likely needs action | SSN detected, shell execution, prompt injection |
| CRITICAL | Severe risk, immediate action needed | Credential exfiltration, jailbreak, attack chain |
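Because the four levels form an ordered scale, consumers of detection data often want to compare them or filter by a minimum level. The Python sketch below shows one way to do that. The numeric ranks, the `at_least` helper, and the dictionary shape of a finding are assumptions for illustration; only the four level names come from this page.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Numeric ranks are an assumption; only the ordering LOW < ... < CRITICAL matters.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def at_least(findings: list[dict], threshold: Severity) -> list[dict]:
    """Keep findings whose 'severity' name ranks at or above the threshold."""
    return [f for f in findings if Severity[f["severity"]] >= threshold]


# Hypothetical findings, using the example pairings from the table above.
findings = [
    {"type": "PII_EMAIL", "severity": "LOW"},
    {"type": "PII_PHONE", "severity": "MEDIUM"},
    {"type": "PII_SSN", "severity": "HIGH"},
    {"type": "SECURITY_JAILBREAK", "severity": "CRITICAL"},
]

urgent = at_least(findings, Severity.HIGH)
```

`IntEnum` is used so that `>=` comparisons work directly on the members; a plain `Enum` would need an explicit rank lookup.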

Lifecycle Stages

| Stage | When | What's scanned |
| --- | --- | --- |
| INGRESS | Before the request is sent to the AI provider | User/agent prompts, input content |
| EGRESS | After the AI provider responds | AI responses, tool call results |
| DEPLOYMENT | During infrastructure scanning | Cloud configs, model registrations, shadow AI |
| TRAINING | During training data pipelines | Training data, fine-tuning inputs |

Most enforcement happens at INGRESS (block bad inputs) and EGRESS (catch sensitive data in outputs). DEPLOYMENT and TRAINING are primarily for discovery and compliance scanning.
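The split between enforcement stages and discovery stages can be expressed as a small lookup. This Python sketch encodes only the primary role described in the paragraph above; the `stage_role` function and the role labels `"enforcement"` and `"discovery"` are hypothetical names, and a real deployment may enforce or scan at other stages as well.

```python
# Primary role per stage, as described in the text above (an illustrative simplification).
ENFORCEMENT_STAGES = frozenset({"INGRESS", "EGRESS"})
DISCOVERY_STAGES = frozenset({"DEPLOYMENT", "TRAINING"})


def stage_role(stage: str) -> str:
    """Return the primary role of a lifecycle stage."""
    if stage in ENFORCEMENT_STAGES:
        return "enforcement"
    if stage in DISCOVERY_STAGES:
        return "discovery"
    raise ValueError(f"unknown lifecycle stage: {stage!r}")
```

Raising on an unknown stage, rather than returning a default, keeps a misspelled stage name from silently being treated as discovery-only.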

Data Classifications

| Classification | Description | Sensitive |
| --- | --- | --- |
| PII | Personally Identifiable Information: SSN, email, phone, name, DOB | Yes |
| PHI | Protected Health Information: medical records, insurance, prescriptions | Yes |
| FINANCIAL | Financial Data: bank accounts, routing numbers, tax IDs | Yes |
| CREDENTIALS | Credentials & Secrets: API keys, passwords, SSH keys, tokens | Yes |
| INTELLECTUAL_PROPERTY | Intellectual Property: trademarks, copyrights, patents, trade secrets | Yes |
| NONE | Not data-classified: prompt injection, tool calls, infrastructure findings | No |
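The detection type names on this page share prefixes that line up with the data classifications, which suggests a simple prefix-based lookup. The Python sketch below assumes that mapping holds for every type (for instance, that `PII_CREDIT_CARD` is classified as PII because of its prefix); Rivaro may assign classifications differently, and `classify` and `is_sensitive` are hypothetical helper names.

```python
# Assumed mapping from detection-type prefix to data classification.
PREFIX_TO_CLASSIFICATION = {
    "PII_": "PII",
    "PHI_": "PHI",
    "FINANCIAL_": "FINANCIAL",
    "CREDENTIALS_": "CREDENTIALS",
    "IP_": "INTELLECTUAL_PROPERTY",
}

# Classifications marked "Yes" in the Sensitive column above.
SENSITIVE = frozenset(PREFIX_TO_CLASSIFICATION.values())


def classify(detection_type: str) -> str:
    """Infer a data classification from the type prefix; default to NONE."""
    for prefix, classification in PREFIX_TO_CLASSIFICATION.items():
        if detection_type.startswith(prefix):
            return classification
    return "NONE"


def is_sensitive(detection_type: str) -> bool:
    """True when the inferred classification is marked sensitive."""
    return classify(detection_type) in SENSITIVE
```

Types from the Security, Agent Tool Use, Behavioral, and System groups match none of the prefixes and therefore fall through to `NONE`, consistent with the last row of the table.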

Next steps