Discovery & Shadow AI

Automatically map your entire AI attack surface, including cloud services, running agents, MCP servers, source code, and direct browser-based AI usage. You cannot govern what you have not discovered.

Overview

Discovery runs continuously across your infrastructure, finding AI assets you may not know exist. Every discovered asset enters an approval workflow before it can be used by governed agents. Shadow AI detection catches direct AI usage (ChatGPT, Claude, etc.) happening outside your proxy.

Discovery works through channels — configured integrations with your infrastructure. Each channel type uses a different collection method and targets a different part of your environment.

Discovery Channels

Rivaro supports 10 discovery channel types. Configure them in Settings > Discovery Sources.

Channel Type	Mode	What it finds
`CLOUD_AI_SERVICES`	Scheduled	AWS SageMaker/Bedrock, GCP Vertex AI, Azure ML — endpoints, models, and AI-specific security risks
`SOURCE_CODE`	Scheduled	GitHub, GitLab, Bitbucket — AI libraries, hardcoded API keys, risky code patterns
`CONTAINER_REGISTRY`	Scheduled	Docker Hub, ECR, GCR, ACR — AI and MCP containers, vulnerability findings
`API_GATEWAY`	Scheduled	AWS API Gateway, Kong, Nginx — AI and MCP API endpoints
`COLLABORATION_PLATFORM`	Scheduled	Slack, Teams, Google Workspace — unauthorized AI bots, plugins, integrations
`IDENTITY_ACCESS`	Scheduled	Okta, Azure AD, AWS IAM — AI access patterns, service accounts with AI permissions
`LOG_ANALYSIS`	Hybrid	CloudWatch, Stackdriver, Splunk — shadow AI usage patterns from logs (API query or log forwarding agent)
`NETWORK_ENDPOINT`	Agent callback	Running MCP servers, shadow AI agents, exposed endpoints — requires a deployed network scanner agent
`AGENT_DATA`	Agent callback	Pre-collected data from client-side agents
`MANUAL_ENTRY`	Manual	Admin-created assets — auto-approved on creation

Channel configuration

Each channel has these common fields:

Field	Description
`name`	Display name for this channel
`channelType`	One of the types above
`active`	Whether the channel runs on its schedule
`pollingIntervalSeconds`	How often to run (scheduled channels)
`configuration`	Channel-specific settings (non-sensitive)
`lastRunAt`	Timestamp of most recent scan
`lastRunStatus`	SUCCESS, FAILED, or RUNNING
`lastRunAssetCount`	Assets found in last run
`lastRunRiskCount`	Risk findings in last run

Note: Sensitive credentials (API keys, tokens, secrets) are stored separately in an encrypted credential store, never in the channel configuration JSON.
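As a rough illustration of the fields above, a channel definition could be modeled as follows. The field names and channel types come from the tables in this section; the `DiscoveryChannel` class, the `validate` helper, the secret-name hints, and the example values are hypothetical and are not part of Rivaro's API.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Channel types are copied from the table above.
CHANNEL_TYPES = {
    "CLOUD_AI_SERVICES", "SOURCE_CODE", "CONTAINER_REGISTRY", "API_GATEWAY",
    "COLLABORATION_PLATFORM", "IDENTITY_ACCESS", "LOG_ANALYSIS",
    "NETWORK_ENDPOINT", "AGENT_DATA", "MANUAL_ENTRY",
}

# ASSUMPTION: a simple heuristic for spotting secrets that were placed in
# `configuration` by mistake. The real product may check this differently.
SENSITIVE_KEY_HINTS = ("key", "token", "secret", "password")


@dataclass
class DiscoveryChannel:
    name: str
    channelType: str
    active: bool = True
    pollingIntervalSeconds: Optional[int] = None  # scheduled channels only
    configuration: dict[str, Any] = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means nothing was flagged."""
        problems = []
        if self.channelType not in CHANNEL_TYPES:
            problems.append(f"unknown channelType: {self.channelType}")
        for key in self.configuration:
            if any(hint in key.lower() for hint in SENSITIVE_KEY_HINTS):
                problems.append(
                    f"'{key}' looks like a credential; keep it out of configuration"
                )
        return problems


# Example values are made up for illustration.
channel = DiscoveryChannel(
    name="GitHub org scan",
    channelType="SOURCE_CODE",
    pollingIntervalSeconds=3600,
    configuration={"provider": "github", "organization": "example-org"},
)
```

The point of the check is the separation described in the note: `configuration` holds only non-sensitive settings, so anything resembling a secret is reported rather than stored there.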

Network scanner agent

For NETWORK_ENDPOINT channels, Rivaro generates a downloadable Python agent. The agent embeds your detection key, scans your internal network, and POSTs results back to /api/admin/discovery/agent/results. Deploy it anywhere with network access to your internal AI infrastructure.
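The generated agent's internals are not documented here, so the following is only a sketch of what the callback might look like. The results path comes from the text above; the payload shape, the `X-Detection-Key` header name, the base URL, and the example asset are all assumptions.

```python
import json
import urllib.request

RESULTS_PATH = "/api/admin/discovery/agent/results"  # path from the text above


def build_results_request(base_url: str, detection_key: str,
                          assets: list[dict]) -> urllib.request.Request:
    """Build (but do not send) the POST request carrying scan results."""
    # ASSUMPTION: the JSON body shape and the header name are hypothetical.
    body = json.dumps({"assets": assets}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + RESULTS_PATH,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Detection-Key": detection_key,
        },
    )


# Hypothetical host, key, and finding, for illustration only.
req = build_results_request(
    "https://rivaro.example.com/",
    "dk_example",
    [{"host": "10.0.0.12", "port": 8080, "kind": "mcp_server"}],
)
# Sending is left to the caller, e.g.:
# urllib.request.urlopen(req, timeout=10)
```

Building the request separately from sending it keeps the sketch testable without network access.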

What Gets Discovered

Each discovered asset is classified by type and category:

Category	Examples
AI_SERVICE	OpenAI, Anthropic, Vertex AI endpoints in use
AI_MODEL	Deployed models, fine-tuned versions, model registries
DATA_STORAGE	Vector databases, embedding stores, training data repositories
ML_PIPELINE	Training pipelines, fine-tuning jobs, MLflow experiments
SOURCE_CODE	Repositories with AI dependencies or hardcoded keys
CONTAINER	Docker images with AI/MCP packages
IDENTITY_ACCESS	Service accounts, roles with AI service permissions
USAGE_PATTERN	Patterns of AI API calls detected in logs

Asset risk findings

Each discovered asset can have associated findings — specific security or compliance issues detected during scanning:

Finding field	Description
`detectionType`	e.g. `CREDENTIAL_EXPOSURE`, `MISCONFIGURATION`, `INFRASTRUCTURE_MCP_PUBLIC_ENDPOINT`
`severity`	CRITICAL, HIGH, MEDIUM, LOW
`status`	ACTIVE, RESOLVED, IGNORED
`description`	Human-readable description of the finding
`detectedContent`	What was found (masked in UI)
`remediationStatus`	NONE → PLAN_AVAILABLE → EXECUTION_IN_PROGRESS → SUCCESS
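To make the finding fields concrete, here is a minimal sketch of a finding record. The field names and status values come from the table above. Treating `remediationStatus` as a strict forward-only sequence, and the masking rule in `masked_content`, are assumptions made for illustration.

```python
from dataclasses import dataclass

# Order taken from the remediationStatus row above.
# ASSUMPTION: statuses only ever move forward through this list.
REMEDIATION_ORDER = ["NONE", "PLAN_AVAILABLE", "EXECUTION_IN_PROGRESS", "SUCCESS"]


@dataclass
class Finding:
    detectionType: str
    severity: str                 # CRITICAL, HIGH, MEDIUM, or LOW
    status: str = "ACTIVE"        # ACTIVE, RESOLVED, or IGNORED
    description: str = ""
    detectedContent: str = ""
    remediationStatus: str = "NONE"

    def masked_content(self, visible: int = 4) -> str:
        """Hypothetical masking rule: keep the first `visible` characters."""
        head = self.detectedContent[:visible]
        return head + "*" * max(len(self.detectedContent) - visible, 0)

    def advance_remediation(self) -> str:
        """Move to the next remediation status; stay put once at SUCCESS."""
        idx = REMEDIATION_ORDER.index(self.remediationStatus)
        if idx + 1 < len(REMEDIATION_ORDER):
            self.remediationStatus = REMEDIATION_ORDER[idx + 1]
        return self.remediationStatus


finding = Finding(
    detectionType="CREDENTIAL_EXPOSURE",
    severity="CRITICAL",
    description="Hardcoded API key in repository",
    detectedContent="sk-abc12345",  # made-up value
)
```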

Asset Approval Workflow

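Rivaro defaults to zero-trust, default-deny: every newly discovered asset starts as PENDING_APPROVAL, and no agent can use an asset until it is approved. The one exception is MANUAL_ENTRY assets, which are auto-approved on creation.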

Approval lifecycle

Status	Meaning
`PENDING_APPROVAL`	Discovered, awaiting security team review
`APPROVED`	Reviewed and explicitly approved for use
`BLOCKED`	Reviewed and denied — agents cannot access
`ACTIVE`	Approved and currently in use by governed agents
`PROMOTED`	Graduated to a governed entity (agent, data source, model)
`REMOVED`	Asset no longer detected in environment
`ARCHIVED`	Deprecated, kept for audit history
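The default-deny rule can be sketched as an allow-list check over the statuses above. Which statuses permit use is an assumption here: the sketch allows only APPROVED and ACTIVE. The text does not say whether PROMOTED assets are reached through the same path, so they are denied along with any unrecognized status.

```python
# ASSUMPTION: only these lifecycle statuses permit use by governed agents.
USABLE_STATUSES = {"APPROVED", "ACTIVE"}


def agent_may_use(asset_status: str) -> bool:
    """Default-deny: anything not explicitly allow-listed is refused."""
    return asset_status in USABLE_STATUSES
```

Using an allow-list rather than a deny-list means a status added later is refused until someone deliberately permits it.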

The approval request includes a riskScore (0–100) calculated from the asset's findings. Reviewers can add notes before approving or denying.
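The formula behind riskScore is not documented here. The sketch below shows one plausible way a 0 to 100 score could be derived from findings. The severity weights, the choice to count only ACTIVE findings, and the cap at 100 are all assumptions.

```python
# ASSUMPTION: illustrative weights, not Rivaro's actual scoring model.
SEVERITY_WEIGHTS = {"CRITICAL": 40, "HIGH": 25, "MEDIUM": 10, "LOW": 3}


def risk_score(findings: list[dict]) -> int:
    """Sum the weights of ACTIVE findings and clamp the result to 0-100."""
    total = sum(
        SEVERITY_WEIGHTS.get(f["severity"], 0)
        for f in findings
        if f.get("status", "ACTIVE") == "ACTIVE"
    )
    return min(total, 100)
```

Under these weights, an asset with one CRITICAL and one HIGH active finding scores 65, and three CRITICAL findings saturate at 100.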

Promoting an asset

Approved assets can be promoted — graduated into a fully governed entity with an AppContext, detection key, and full enforcement. This is how shadow infrastructure becomes official, monitored infrastructure.

Promoted entity type	What it becomes
AGENT	A registered agent identity with trust score tracking
DATA_SOURCE	A governed data source with access controls
MODEL	An approved model with allowed-model list enforcement
INTEGRATION	A governed integration with policy enforcement
SERVICE	An approved AI service endpoint

Multi-Source Correlation

The same asset may be discovered by multiple channels. Rivaro deduplicates using an externalId fingerprint: when two channels report the same fingerprint, both observations link to a single asset, and confidence in that asset increases with each additional source.

Observation type	Confidence	How it's detected
DISCOVERED	SUSPECTED → INFERRED	Found by a discovery channel scan
RUNTIME_USAGE	CONFIRMED	Seen in live agent traffic through the proxy
CODE_REFERENCE	INFERRED	Found in source code as an import or API call
IAM_POLICY	INFERRED	Service account has permission to access it
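The correlation described above can be sketched as grouping observations by externalId and keeping the strongest confidence level. The observation types and confidence levels come from the table. The assumptions are the rule that a DISCOVERED asset moves from SUSPECTED to INFERRED once a second distinct source reports it, and the dictionary shapes used for input and output.

```python
CONFIDENCE_RANK = {"SUSPECTED": 0, "INFERRED": 1, "CONFIRMED": 2}

# Fixed confidence per observation type, from the table above.
BASE_CONFIDENCE = {
    "RUNTIME_USAGE": "CONFIRMED",
    "CODE_REFERENCE": "INFERRED",
    "IAM_POLICY": "INFERRED",
}


def correlate(observations: list[dict]) -> dict[str, dict]:
    """Merge observations that share an externalId into one asset record."""
    grouped: dict[str, dict] = {}
    for obs in observations:
        asset = grouped.setdefault(obs["externalId"], {"sources": set(), "types": set()})
        asset["sources"].add(obs["source"])
        asset["types"].add(obs["type"])

    result = {}
    for external_id, asset in grouped.items():
        levels = []
        for obs_type in asset["types"]:
            if obs_type == "DISCOVERED":
                # ASSUMPTION: a second distinct source upgrades SUSPECTED to INFERRED.
                levels.append("INFERRED" if len(asset["sources"]) > 1 else "SUSPECTED")
            else:
                levels.append(BASE_CONFIDENCE[obs_type])
        result[external_id] = {
            "sources": sorted(asset["sources"]),
            "confidence": max(levels, key=CONFIDENCE_RANK.__getitem__),
        }
    return result


# Hypothetical observations for illustration.
observations = [
    {"externalId": "a", "source": "CLOUD_AI_SERVICES", "type": "DISCOVERED"},
    {"externalId": "a", "source": "API_GATEWAY", "type": "DISCOVERED"},
    {"externalId": "b", "source": "SOURCE_CODE", "type": "DISCOVERED"},
    {"externalId": "c", "source": "LOG_ANALYSIS", "type": "DISCOVERED"},
    {"externalId": "c", "source": "proxy", "type": "RUNTIME_USAGE"},
]
assets = correlate(observations)
```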

Shadow AI Detection

Shadow AI is direct use of AI services (ChatGPT, Claude, Perplexity, etc.) that bypasses your proxy — typically via a browser. The Rivaro Shadow AI browser extension monitors this activity and applies your policies in real time.

How it works

1. Install the Chrome extension and configure it with your organization's detection key.
2. The extension monitors supported AI domains: chatgpt.com, claude.ai, bard.google.com, bing.com/chat, poe.com, perplexity.ai, and more.
3. When a user types a prompt and submits it, the extension captures the content and sends it to Rivaro's detection engine (/v1/shadow).
4. Rivaro runs the same detection pipeline as the proxy: PII, PHI, credentials, prompt injection, etc.
5. The response action is applied directly in the browser.

Shadow AI policy actions

Action	What the user sees
BLOCK	Modal appears, submission is prevented
REDACT	Modal shows sanitized version; user can copy and resubmit
LOG	Submission proceeds, violation is logged in the dashboard
ALLOW	No action, submission proceeds normally
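The actions in the table can be read as a small dispatch over the policy result. The sketch below models that logic in Python for illustration only; the actual extension runs in the browser. The `Outcome` type, the modal wording, and the fail-closed handling of unrecognized actions are assumptions. REDACT is modeled as stopping the original submission, since the table says the user resubmits the sanitized version.

```python
from typing import NamedTuple, Optional


class Outcome(NamedTuple):
    proceeds: bool              # does the original submission go through?
    modal_text: Optional[str]   # text shown to the user, if any


def apply_shadow_action(action: str, sanitized_prompt: Optional[str] = None) -> Outcome:
    if action == "BLOCK":
        return Outcome(False, "This submission was blocked by policy.")
    if action == "REDACT":
        # The user copies the sanitized text and resubmits it manually.
        return Outcome(False, f"Sanitized version (copy and resubmit): {sanitized_prompt}")
    if action in ("LOG", "ALLOW"):
        # LOG also records a violation in the dashboard; that side effect is not modeled here.
        return Outcome(True, None)
    # ASSUMPTION: fail closed on anything unrecognized.
    return Outcome(False, f"Unrecognized policy action: {action}")
```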

Shadow AI analytics

The Shadow AI dashboard tracks:

Session trends — daily session counts and week-over-week change
Violations by severity — CRITICAL / HIGH / MEDIUM / LOW breakdown
Compliance rate — percentage of sessions with no violations
Risk users — top users by risk score and violation count
Cost exposure — estimated API cost of shadow usage, productivity hours
Compliance by framework — HIPAA, GDPR, and other framework-level metrics
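Two of the metrics above are simple ratios, sketched here. The compliance rate definition (percentage of sessions with no violations) comes from the list above. The session record shape and the handling of empty inputs are assumptions.

```python
from typing import Optional


def compliance_rate(sessions: list[dict]) -> Optional[float]:
    """Percentage of sessions with zero violations; None if there are no sessions."""
    if not sessions:
        return None
    clean = sum(1 for s in sessions if s["violations"] == 0)
    return 100.0 * clean / len(sessions)


def week_over_week_change(this_week: int, last_week: int) -> Optional[float]:
    """Percentage change in session count; None when last week had no sessions."""
    if last_week == 0:
        return None
    return 100.0 * (this_week - last_week) / last_week


# Hypothetical data: four sessions, one of which had violations.
sessions = [{"violations": 0}, {"violations": 2}, {"violations": 0}, {"violations": 0}]
```

With the sample data, three of four sessions are clean, giving a compliance rate of 75.0. Going from 100 sessions last week to 120 this week is a change of 20.0 percent.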

Zero Trust inventory

Shadow AI detection surfaces an unverified asset inventory including:

Agent runtimes — LangChain, AutoGen, CrewAI instances running without governance
MCP servers — unauthenticated or public MCP endpoints
AI bots — Slack/Teams bots with excessive AI access
Public endpoints — ML infrastructure exposed to the internet

Dashboard

The Discovery dashboard (Discover in the nav) has seven tabs:

Tab	What's there
Explore	AI attack surface visualization graph
Findings	Grouped risk findings across all assets, filterable by severity and source
Recommended Actions	AI-generated remediation strategies prioritized by risk
Assets	Full asset inventory, filterable by category, status, and risk level
Sources	Discovery channels with status, last run, and asset counts
Scan History	Past discovery runs with results
Asset Risk Posture	Charts and analytics across the asset inventory

Next steps

Asset Management — Manage the approved asset inventory
Agent Management — Register and govern agents
Remediation — Fix discovery findings with AI-powered plans
Compliance Reporting — Use discovery data in compliance reports
