IT Service & Operations Manual

Ransomware Detection

Behavioral signals, containment-oriented response, and the workflow teams need when destructive activity is suspected.

Audience: Security and operations teams
Focus: Destructive activity detection and response
Status: Public manual

Scope

Ransomware response depends on speed, context, and confidence in the operating picture. This public guide keeps the workflow model and removes private detection and execution detail.

Overview

Ransomware Detection provides continuous monitoring of managed hosts for ransomware-indicative activity. The system uses six signal sources to compute a composite risk score per host, and can automatically contain threats when configured to do so.

Navigate to the Ransomware Detection page from the main navigation menu.

Activating a Policy

Activating a policy starts monitoring on all hosts in the policy’s scope.

  1. Select the policy from the list.
  2. Click Activate.

What happens on activation:

  • If entropy monitoring is enabled: an entropy_monitor daemon job is dispatched to each covered host. The agent begins scanning the configured directories at the specified interval.
  • If canary monitoring is enabled: a canary_deploy job is dispatched to each covered host. The agent creates realistic-looking decoy files (spreadsheets, documents, backups) in the configured locations.
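
The activation behavior above can be sketched as a small function that lists the jobs a policy would dispatch. This is an illustrative sketch only: the Policy class and its field names are hypothetical, and only the job type names entropy_monitor and canary_deploy come from this manual.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    # Hypothetical policy shape; field names are illustrative only.
    entropy_enabled: bool
    canary_enabled: bool
    hosts: list[str]


def jobs_on_activation(policy: Policy) -> list[tuple[str, str]]:
    """Return (host, job_type) pairs that activation would dispatch."""
    jobs = []
    for host in policy.hosts:
        if policy.entropy_enabled:
            jobs.append((host, "entropy_monitor"))
        if policy.canary_enabled:
            jobs.append((host, "canary_deploy"))
    return jobs


if __name__ == "__main__":
    policy = Policy(entropy_enabled=True, canary_enabled=True, hosts=["web-01", "db-01"])
    for host, job_type in jobs_on_activation(policy):
        print(host, job_type)
```

A policy with both monitors enabled yields two jobs per covered host, which is the kind of count the activation response reports.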

The activation response shows how many jobs were dispatched and their types.

Deactivating a Policy

Click Deactivate to stop monitoring:

  • Canary removal jobs are dispatched to clean up deployed canary files.
  • Daemon stop jobs are dispatched to halt running entropy monitors.

Monitoring the Dashboard and Risk Scores

Dashboard

The dashboard provides an at-a-glance view of your ransomware detection posture:

  • Risk Distribution – count of hosts at each risk level (none, low, medium, high, critical)
  • Containment Status – count of hosts currently monitoring, isolating, or isolated
  • Active Policies – number of active vs total policies
  • Top Risk Hosts – the 10 hosts with the highest risk scores
  • Recent Activity – signal count and containment actions in the last 24 hours

Risk Scores

Each monitored host has a composite risk score (0.0-1.0) built from six signals:

Signal               Weight   What It Measures
File Entropy         30%      High-entropy file writes (encryption indicator)
Canary Triggered     25%      Canary file tampering (very high confidence)
Drift Indicators     15%      System fingerprint changes correlated with ransomware (disabled AV, new admin accounts, firewall changes)
Exfiltration Risk    10%      Network exfiltration alerts and risk scores
Process Anomaly      10%      Deviations from process baselines
Health Degradation   10%      Host health score drops
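
As a rough illustration of how the weights in the table could combine, the sketch below computes a weighted sum of per-signal scores. It assumes each signal is already normalized to 0.0-1.0 and that the composite is a plain weighted sum. The manual does not state the exact formula, and the dictionary keys are hypothetical names for the six signals.

```python
# Default weights from the table above; key names are illustrative.
WEIGHTS = {
    "file_entropy": 0.30,
    "canary_triggered": 0.25,
    "drift_indicators": 0.15,
    "exfiltration_risk": 0.10,
    "process_anomaly": 0.10,
    "health_degradation": 0.10,
}


def composite_score(signals: dict[str, float]) -> float:
    """Weighted sum of signal scores, each clamped to the 0.0-1.0 range."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(signals.get(name, 0.0), 0.0), 1.0)
        total += weight * value
    return round(total, 4)


if __name__ == "__main__":
    # A host with a triggered canary and strong entropy evidence only.
    print(composite_score({"file_entropy": 1.0, "canary_triggered": 1.0}))
```

Under these assumptions, file entropy and canary evidence alone can lift a host to 0.55, which lands in the medium band.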

Risk levels:

  • None (< 0.20) – no indicators detected
  • Low (0.20 - 0.39) – minor indicators, worth monitoring
  • Medium (0.40 - 0.59) – multiple indicators, investigate promptly
  • High (0.60 - 0.79) – strong indicators, active investigation needed
  • Critical (>= 0.80) – very high confidence, containment recommended
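
The bands above can be expressed as a simple lookup. This sketch assumes each band's lower bound is the cut-off, so a score such as 0.395 that falls between the listed ranges is treated as low. The function name is illustrative.

```python
def risk_level(score: float) -> str:
    """Map a composite score (0.0-1.0) to a risk level using default thresholds."""
    if score >= 0.80:
        return "critical"
    if score >= 0.60:
        return "high"
    if score >= 0.40:
        return "medium"
    if score >= 0.20:
        return "low"
    return "none"


if __name__ == "__main__":
    for value in (0.05, 0.25, 0.45, 0.65, 0.85):
        print(value, risk_level(value))
```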

Scores decay exponentially with a 30-minute half-life when signals stop arriving.
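
A 30-minute half-life means the score halves for every 30 minutes without new signals. The sketch below shows the standard exponential half-life formula. The function and parameter names are illustrative, and how the product applies decay internally is not documented here.

```python
HALF_LIFE_MINUTES = 30.0


def decayed_score(score: float, minutes_since_last_signal: float) -> float:
    """Apply exponential decay: the score halves every HALF_LIFE_MINUTES."""
    if minutes_since_last_signal <= 0:
        return score
    return score * 0.5 ** (minutes_since_last_signal / HALF_LIFE_MINUTES)


if __name__ == "__main__":
    # A critical score of 0.8 with no further signals.
    for minutes in (0, 30, 60, 90):
        print(minutes, round(decayed_score(0.8, minutes), 3))
```

Under this formula a host at 0.80 drops to 0.40 after 30 quiet minutes and to 0.20 after an hour.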

Per-Host Risk Score Detail

In the current repo UI, expanding a host row shows:

  • Organization and host-group scoped filtering before you open the host detail
  • Full signal breakdown (all 6 individual scores)
  • Risk score history (recent score changes and trigger signal)
  • Forensic timeline entries for that host
  • Recent entropy events and deployed canary state
  • Signal details (what triggered each score)
  • Containment status

The current page now surfaces the backend history/timeline data inline in the expanded host view. It still does not ship a standalone trend chart or separate history/timeline route.

Forensic Timeline

The backend/API supports host timeline reconstruction and incident timeline refresh. The current frontend page surfaces host timeline entries inline from the risk-score expansion flow and offers incident-level timeline refresh. A dedicated standalone host timeline browser is not yet wired into this repo UI.

Threshold Overrides

Customize risk level thresholds at different scopes (most specific wins):

  1. Individual host
  2. Host group
  3. Organization
  4. Account-wide

You can also override signal weights to emphasize certain signals for specific environments (e.g., increase canary weight for file servers).
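
The "most specific wins" rule can be sketched as a lookup that walks scopes from most to least specific. This is a sketch under assumptions: the scope keys, the threshold dictionary shape, and the idea that a more specific override replaces the whole threshold set (rather than merging field by field) are not confirmed by this manual.

```python
# Scopes ordered from most specific to least specific.
SCOPE_ORDER = ["host", "host_group", "organization", "account"]

# Default lower bounds for each risk level, taken from the risk level list.
DEFAULT_THRESHOLDS = {"low": 0.20, "medium": 0.40, "high": 0.60, "critical": 0.80}


def effective_thresholds(overrides: dict[str, dict[str, float]]) -> dict[str, float]:
    """Return the threshold set from the most specific scope that defines one."""
    for scope in SCOPE_ORDER:
        if scope in overrides:
            return overrides[scope]
    return DEFAULT_THRESHOLDS


if __name__ == "__main__":
    overrides = {
        "organization": {"low": 0.25, "medium": 0.45, "high": 0.65, "critical": 0.85},
        "host": {"low": 0.10, "medium": 0.30, "high": 0.50, "critical": 0.70},
    }
    print(effective_thresholds(overrides))  # the host-level set wins
```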

To manage overrides from the shipped UI:

  1. Go to the Containment tab.
  2. Review the existing override list and scope labels.
  3. Click Add Override to create a new scope-specific threshold set.
  4. Set the four score thresholds in ascending order and optionally override individual signal weights.
  5. Click Create Override or Save Changes.

The same tab also supports:

  • Editing an existing override in place
  • Deleting an override after an explicit in-app confirmation
  • Refreshing the list without leaving the current tab

Dedicated effective-threshold inspection for a single host is still API-backed rather than shown as a separate operator route.

Canary Inventory and Events

The shipped UI now includes a dedicated Canaries tab, and its direct route opens the same page with that tab selected.

Use this workflow to:

  • Filter deployed canary files by organization, host, and status
  • Review last verification time and deployment time
  • Trigger Verify Host for a host that should re-check canary integrity
  • Review canary trigger events independently from the risk-score expansion flow
  • Expand an event row to inspect process, PID, user, and event details

The Canaries tab has explicit loading, empty, and retryable error states, so a failed fetch does not silently collapse the workflow.

Entropy Events

The shipped UI now includes a dedicated Entropy Events tab, and its direct route opens the same page with that tab selected.

Use this workflow to:

  • Filter entropy events by host, severity, and time window
  • Review affected-path, extension-change, and triggering-process summaries
  • Expand an event row to inspect the detailed affected paths, process list, and extension changes

If you enter an invalid time range (start after end), the page rejects it inline instead of sending a bad request.

Responding to Incidents

Ransomware Incidents

An incident is created when a host crosses into high or critical risk. Incidents track:

  • Kill chain phase: reconnaissance, lateral_movement, pre_encryption, active_encryption, post_encryption
  • Affected hosts: which hosts are involved
  • Blast radius: assessment of at-risk hosts (same org, same group, credential overlap, network flows)
  • Timeline: chronological events from detection through containment
  • Containment actions: what actions were taken and their status

Manual Host Isolation

To immediately isolate a suspect host:

  1. Go to the Containment section or the incident detail.
  2. Click Isolate on the target host.

This dispatches, in order:

  1. Evidence gather job (collects forensic data before isolation)
  2. VM snapshot (if the host is a virtual machine)
  3. Network isolation job (applies firewall rules to block all traffic except management and whitelisted IPs)

Manual isolation bypasses exclusion lists – it works on any host regardless of policy exclusions.

Releasing Isolation

After investigation is complete:

  1. Click Release on the isolated host.
  2. A release job is dispatched to remove the firewall rules.

Active Isolations

View all currently isolated hosts in the Active Isolations list, showing hostname, risk level, composite score, and when containment started.

Circuit Breaker

The circuit breaker prevents runaway auto-containment. If more than max_auto_actions_per_hour auto-isolation actions occur within an hour, the breaker trips:

  • All auto-containment pauses
  • Operators are notified
  • Manual isolation still works
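
The breaker logic can be sketched as a sliding one-hour window over auto-isolation timestamps. This is a minimal sketch, not the product's implementation: the class and method names are hypothetical, only max_auto_actions_per_hour comes from this manual, and the sketch assumes the action that exceeds the limit is itself blocked.

```python
from collections import deque


class CircuitBreaker:
    """Pause auto-containment when too many auto-isolations occur in one hour."""

    def __init__(self, max_auto_actions_per_hour: int):
        self.max_auto_actions_per_hour = max_auto_actions_per_hour
        self.tripped = False
        self._actions = deque()  # timestamps (seconds) of recent auto-isolations

    def allow_auto_action(self, now: float) -> bool:
        """Record an attempted auto-isolation; return False if it is blocked."""
        if self.tripped:
            return False
        # Drop recorded actions that are an hour old or older.
        while self._actions and now - self._actions[0] >= 3600:
            self._actions.popleft()
        self._actions.append(now)
        if len(self._actions) > self.max_auto_actions_per_hour:
            self.tripped = True  # operators would be notified at this point
            return False
        return True

    def reset(self) -> None:
        """Operator reset after review; auto-containment resumes."""
        self.tripped = False
        self._actions.clear()


if __name__ == "__main__":
    breaker = CircuitBreaker(max_auto_actions_per_hour=2)
    print([breaker.allow_auto_action(t) for t in (0, 10, 20, 30)])
    breaker.reset()
    print(breaker.allow_auto_action(40))
```

Manual isolation would bypass this check entirely, matching the behavior described above.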

To resume auto-containment after investigation:

  1. Review the situation.
  2. Click Reset Circuit Breaker.

Refreshing Incident Data

The incident detail panel provides two refresh actions for active investigations:

  1. Refresh Timeline – Rebuilds the forensic timeline by querying live risk score changes, entropy events, canary triggers, and containment actions across ALL affected hosts. Updates the kill chain phase assessment.
  2. Refresh Blast Radius – Reassesses the blast radius by checking organizational, host-group, credential, and network flow overlap for ALL affected hosts. Shows total hosts at risk.

These buttons appear in the expanded incident detail row.

Recovery from Backup

When an incident is contained and you need to restore:

  1. Open the incident detail.
  2. Click Clean Restore Points to see available backups.
    • Only backups completed BEFORE the first indicator timestamp are shown.
    • Verified (integrity-passed) backups are preferred.
  3. Select a restore point and choose the restore type:
    • Files – restore specific files/directories
    • Image – full image restore
  4. Click Recover to dispatch the restore job.

The incident status then automatically transitions to “recovering”, and the recovery start timestamp is set.
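
The clean restore point rule (only backups completed before the first indicator, verified ones preferred) can be sketched as a filter plus a sort. The Backup class and its fields are hypothetical, and ordering newest-first within each group is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Backup:
    # Hypothetical backup record; field names are illustrative only.
    backup_id: str
    completed_at: datetime
    verified: bool


def clean_restore_points(backups: list[Backup], first_indicator_at: datetime) -> list[Backup]:
    """Keep backups completed before the first indicator; verified ones first."""
    clean = [b for b in backups if b.completed_at < first_indicator_at]
    # Two stable sorts: newest first, then verified ahead of unverified.
    clean.sort(key=lambda b: b.completed_at, reverse=True)
    clean.sort(key=lambda b: not b.verified)
    return clean


if __name__ == "__main__":
    first_indicator = datetime(2024, 1, 10, 12, 0)
    candidates = [
        Backup("nightly-09", datetime(2024, 1, 9), verified=False),
        Backup("nightly-08", datetime(2024, 1, 8), verified=True),
        Backup("nightly-11", datetime(2024, 1, 11), verified=True),  # after indicator
    ]
    for backup in clean_restore_points(candidates, first_indicator):
        print(backup.backup_id, backup.verified)
```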

Updating Incident Status

Update incident fields as the investigation progresses:

  • Status: active -> contained -> recovering -> resolved (or false_positive)
  • Entry Point Hypothesis: document how the attack started
  • Kill Chain Phase: track progression through the kill chain

Status transitions auto-set timestamps (containment_completed_at, recovery_started_at, recovery_completed_at).

Running Simulations

Simulations validate your detection pipeline without triggering real containment.

Create a Simulation

  1. Go to the Simulations tab.
  2. Click Create Simulation.
  3. Configure:
    • Name – descriptive name
    • Scenario Type:
      • encryption_only – simulates pure encryption activity
      • double_extortion – simulates data exfiltration followed by encryption
      • wiper – simulates destructive (non-ransom) attack
      • targeted – simulates targeted attack on specific files/hosts
    • Target Hosts – select hosts to inject simulated signals into
    • Expected Actions – what you expect the system to do (for result comparison)
  4. Click Start.

Scope rules:

  • You can only target hosts in organizations you can access.
  • If you set an explicit organization, every selected host must belong to that same organization.

What Happens During a Simulation

  1. Simulated entropy events and canary events are injected into the target hosts (all tagged is_simulation=True).
  2. The scoring engine processes the simulated data.
  3. The engine evaluates what containment actions it would take (but does NOT dispatch them).
  4. Results are recorded: actual_actions vs expected_actions.
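
The final step, comparing actual_actions with expected_actions, can be sketched as a set comparison. The action names in the example are hypothetical placeholders, and the result fields (matched, missing, unexpected, passed) are an illustrative shape rather than the product's result format.

```python
def compare_actions(expected_actions: list[str], actual_actions: list[str]) -> dict:
    """Compare what the engine would have done against what was expected."""
    expected, actual = set(expected_actions), set(actual_actions)
    return {
        "matched": sorted(expected & actual),
        "missing": sorted(expected - actual),      # expected but not proposed
        "unexpected": sorted(actual - expected),   # proposed but not expected
        "passed": expected == actual,
    }


if __name__ == "__main__":
    # Action names below are placeholders, not documented identifiers.
    result = compare_actions(
        expected_actions=["isolate_host", "notify_operators"],
        actual_actions=["notify_operators"],
    )
    print(result)
```

A non-empty missing list is the case worth investigating first, since it means the pipeline would have done less than the drill expected.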

Reviewing Results

Open a completed simulation to see:

  • What actions the engine would have taken
  • Whether they matched your expectations
  • A results summary

Simulation status values:

  • completed – simulation ran successfully and results are available
  • failed – simulation encountered an internal error and did not produce reliable results. Check server logs for details. Do NOT treat a failed simulation as a passing drill.
  • cancelled – simulation was manually stopped before completion
  • pending / running – simulation is queued or in progress

Cancelling a Simulation

Cancel a pending or running simulation to stop it and clean up all simulation-tagged data.

Understanding Readiness Metrics

The Readiness Report assesses your detection posture:

Detection Coverage

  • Total hosts in your account
  • Hosts with active policy coverage (have a risk score record)
  • Hosts with entropy monitoring running
  • Hosts with canary files deployed
  • Overall coverage percentage

Policy Summary

  • Total policies
  • Active policies
  • Policies with auto-isolate enabled

Historical Incidents

  • Total incidents
  • Resolved incidents
  • False positive rate
  • Average detection time (first indicator to detection)
  • Average containment time (first indicator to containment complete)
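
The historical incident metrics can be sketched as simple aggregates over incident records. This sketch assumes the false positive rate is the share of all incidents marked false_positive. The first_indicator_at and detected_at field names are hypothetical, while containment_completed_at appears elsewhere in this manual.

```python
from datetime import datetime, timedelta
from statistics import mean


def incident_metrics(incidents: list[dict]) -> dict:
    """Summarize incident history: counts, false positive rate, average times."""
    total = len(incidents)
    false_positives = sum(1 for i in incidents if i["status"] == "false_positive")
    detection = [
        (i["detected_at"] - i["first_indicator_at"]).total_seconds()
        for i in incidents if i.get("detected_at")
    ]
    containment = [
        (i["containment_completed_at"] - i["first_indicator_at"]).total_seconds()
        for i in incidents if i.get("containment_completed_at")
    ]
    return {
        "total": total,
        "resolved": sum(1 for i in incidents if i["status"] == "resolved"),
        "false_positive_rate": false_positives / total if total else 0.0,
        "avg_detection_seconds": mean(detection) if detection else None,
        "avg_containment_seconds": mean(containment) if containment else None,
    }


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    history = [{
        "status": "resolved",
        "first_indicator_at": start,
        "detected_at": start + timedelta(minutes=4),
        "containment_completed_at": start + timedelta(minutes=20),
    }]
    print(incident_metrics(history))
```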

Recommendations

The system generates actionable recommendations based on your current state:

  • Low coverage: expand policy scope
  • No active policies: create and activate one
  • No auto-isolate: consider enabling for critical hosts
  • No canaries: deploy canary files for high-confidence detection
  • High false positive rate: review thresholds and whitelists
  • Good coverage: consider running a simulation to validate response procedures

Permissions

Action                                                                               Required Permission
View dashboard, risk scores, policies, events, reports                               ransomware.view
Create/edit/delete policies, manage thresholds, verify canaries, run simulations     ransomware.manage
Isolate/release hosts, update incidents, reset circuit breaker, initiate recovery    ransomware.respond