Prerequisites
- Role with patches.view permission for read-only access
- Role with patches.manage permission for creating policies, ring sets, deployments, and approving/rejecting patches
- Role with patches.deploy permission for skipping individual deployment hosts
- Role with hosts.view permission to view service groups, or hosts.manage permission to create, update, and delete service groups and tiers
- At least one organization with managed hosts that have agents installed and reporting heartbeats
- Hosts must be sending pending patch data via host-info collection (automatic with agent v1.7.0+)
Creating a patch policy
Patch policies control auto-approval rules, safety thresholds, and pre-flight behavior. Policies are org-scoped -- one active policy per organization.
- Navigate to Patch Management > Policies
- Click Create Policy
- Select the target organization
- Configure the fields below
- Click Save
Policy fields
Policy overrides
Overrides let you customize policy settings for specific locations or host groups without creating entirely separate policies. Overrides use sparse merge -- only fields you specify override the base policy. All other fields inherit from the base.
- Open the policy detail page
- Click Add Override
- Select the target: a location or host group
- Set only the fields you want to override (leave others blank to inherit)
- Set the priority number -- higher priority wins when multiple overrides apply to the same host
- Click Save
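The sparse-merge behavior described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the field names (`min_disk_free_gb`, `block_untested_patches`, `auto_approve_security`) and the dictionary layout are assumptions made for the example, and `None` stands in for a field left blank.

```python
from typing import Any

def effective_policy(base: dict[str, Any], overrides: list[dict[str, Any]]) -> dict[str, Any]:
    """Apply the overrides that match a host on top of the base policy."""
    result = dict(base)
    # Sort ascending so the highest-priority override is applied last and wins.
    for override in sorted(overrides, key=lambda o: o["priority"]):
        for field, value in override["fields"].items():
            if value is not None:  # None models a blank field, which inherits
                result[field] = value
    return result

# Hypothetical base policy and two overrides that both apply to the same host.
base = {"auto_approve_security": True, "min_disk_free_gb": 5, "block_untested_patches": True}
overrides = [
    {"priority": 10, "fields": {"min_disk_free_gb": 10}},
    {"priority": 20, "fields": {"min_disk_free_gb": 20, "block_untested_patches": None}},
]
policy = effective_policy(base, overrides)
print(policy)
```

Only `min_disk_free_gb` changes here: the priority-20 override wins over the priority-10 one, and every field that no override sets keeps its base value.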
Creating maintenance windows
Maintenance windows define when patching and reboots are permitted. Hosts configured with the "Maintenance Window" install schedule will only begin patching inside an active window.
- Navigate to Maintenance Windows
- Click Create Window
- Enter a name and select the organization
- Choose the schedule type
- Set the start and end times (24-hour format)
- Set the timezone (e.g., America/New_York) -- all MW times are evaluated in this timezone
- Add targets: select host groups or individual hosts
- Click Preview Schedule to verify the next 3 occurrences
- Click Save
Schedule types
| Type | Description | Example |
|---|---|---|
| Nth Day | The Nth occurrence of a weekday each month | "3rd Sunday of every month" |
| Relative | A weekday relative to another weekday occurrence | "First Saturday after the 2nd Sunday" |
| Last Day | The last occurrence of a weekday each month | "Last Friday of every month" |
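To make the three schedule types concrete, the sketch below computes one occurrence of each using only the Python standard library. The function names are hypothetical, and the reading of "Relative" as "the first matching weekday strictly after the anchor date" is an assumption; the product may resolve edge cases differently.

```python
import calendar
from datetime import date, timedelta

SAT, SUN, FRI = 5, 6, 4  # Python weekday numbers: Monday=0 ... Sunday=6

def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Nth Day: the nth (1-based) occurrence of a weekday in a month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    day = first + timedelta(days=offset + 7 * (n - 1))
    if day.month != month:
        raise ValueError("the month has no such occurrence")
    return day

def last_weekday(year: int, month: int, weekday: int) -> date:
    """Last Day: the last occurrence of a weekday in a month."""
    last = date(year, month, calendar.monthrange(year, month)[1])
    return last - timedelta(days=(last.weekday() - weekday) % 7)

def first_weekday_after(anchor: date, weekday: int) -> date:
    """Relative: the first occurrence of a weekday strictly after the anchor."""
    offset = (weekday - anchor.weekday()) % 7 or 7
    return anchor + timedelta(days=offset)

third_sunday = nth_weekday(2025, 6, SUN, 3)                            # "3rd Sunday"
sat_after_2nd_sun = first_weekday_after(nth_weekday(2025, 6, SUN, 2), SAT)
last_friday = last_weekday(2025, 6, FRI)                               # "Last Friday"
print(third_sunday, sat_after_2nd_sun, last_friday)
```

For June 2025 this yields 15 June, 14 June, and 27 June respectively, which can be compared against the output of Preview Schedule for a window with the same settings.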
Creating ring sets and rings
Ring sets define the phased rollout structure for patch deployments. Each ring set contains ordered rings that deploy sequentially.
- Navigate to Patch Management > Ring Sets
- Click Create Ring Set
- Enter a name (e.g., "Linux Servers" or "Windows Workstations")
- Toggle Auto-deploy if approved patches should automatically create deployments
- Optionally set a classification filter to limit to security-only, critical-only, etc.
- Optionally configure script hooks: pre-script, post-script, synthetic test script
- Click Save
Adding rings
Add rings in deployment order (ring 0 deploys first). A typical three-ring setup:
| Ring | Name | Canary | Wait | Success Gate |
|---|---|---|---|---|
| 0 | Canary | count=2 | 4 hours | 100% |
| 1 | Early Adopters | -- | 24 hours (cooloff) | 95% |
| 2 | Production | -- | -- | 95% |
For each ring, add members: select host groups, individual hosts, or service groups.
Ring configuration
Install schedule modes
Each ring has an install schedule that controls when patching begins for hosts in that ring.
| Mode | Behavior |
|---|---|
| Immediate | Patching begins immediately when the ring activates. No maintenance window check. |
| Delay from Approval | Waits N days from deployment creation. After the delay, patching is optionally gated on a maintenance window, if enabled. |
| Delay from Prior Ring | Waits N days from the prior ring's completion. After the delay, patching is optionally gated on a maintenance window, if enabled. |
| Maintenance Window | Waits until the host is within an active maintenance window. The scheduled time is set to the next window start time. This is the default. |
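The four modes in the table can be read as a small decision function. The sketch below is one possible interpretation: the mode strings, the parameter names, and the `next_window_after` callback are all hypothetical, and timestamps are naive datetimes for brevity (a real system would evaluate them in the maintenance window's timezone).

```python
from datetime import datetime, timedelta

def scheduled_start(mode, *, now, created_at, prior_ring_done_at=None,
                    delay_days=0, gate_on_window=False, next_window_after=None):
    """Return when patching may begin, or None if it cannot be scheduled yet."""
    def gate(t):
        # Without a window lookup there is nothing to schedule against.
        return next_window_after(t) if next_window_after else None

    if mode == "immediate":
        return now                       # no maintenance window check
    if mode == "maintenance_window":
        return gate(now)                 # next window start time
    if mode == "delay_from_approval":
        earliest = created_at + timedelta(days=delay_days)
    elif mode == "delay_from_prior_ring":
        if prior_ring_done_at is None:
            return None                  # prior ring has not completed yet
        earliest = prior_ring_done_at + timedelta(days=delay_days)
    else:
        raise ValueError(f"unknown install schedule mode: {mode}")
    return gate(earliest) if gate_on_window else earliest

def next_2am(t):
    """Toy window lookup for the example: a window opens daily at 02:00."""
    candidate = t.replace(hour=2, minute=0, second=0, microsecond=0)
    return candidate if candidate >= t else candidate + timedelta(days=1)

created = datetime(2025, 6, 1, 12, 0)
plain = scheduled_start("delay_from_approval", now=created, created_at=created, delay_days=3)
gated = scheduled_start("delay_from_approval", now=created, created_at=created,
                        delay_days=3, gate_on_window=True, next_window_after=next_2am)
print(plain, gated)
```

With a 3-day delay, the ungated host becomes eligible at noon on 4 June, while the gated host waits for the next window start after that point.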
Reboot policy
Each ring has its own reboot policy controlling what happens after patches install:
Canary settings
Service groups
Service groups enable coordinated patching of dependent servers in multi-tier applications (e.g., Web → App → DB). Tiers patch in dependency order to avoid breaking service dependencies.
- Navigate to Service Groups in the Policy sidebar section
- Click Create Service Group
- Enter a name (e.g., "Production ERP Stack") and select the organization
- Add tiers in dependency order (tier 0 patches first)
Tier configuration
Each tier references a host group -- the hosts in that group become the tier's members.
Example tier setup
| Tier | Name | Host Group | Max Concurrent | Success Gate |
|---|---|---|---|---|
| 0 | Web Servers | Web Servers | 2 | 100% |
| 1 | App Servers | App Servers | 1 | 100% |
| 2 | Database | DB Servers | 1 | 100% |
Add the service group as a ring member (same as adding a host group). During deployment:
- Service group hosts are never canary -- tier ordering IS the validation mechanism
- Tier 0 completes first, then tier 1, etc.
- If a tier fails its success gate, the deployment pauses
- The Max Concurrent setting controls how many hosts patch simultaneously within a tier
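The tier behavior listed above can be modeled roughly as shown below. This is a simplified sketch, not the engine's code: `patch_host` is a hypothetical callback that returns whether a host patched successfully, hosts within a batch are processed one after another here (a real engine would run them in parallel), and skipping empty tiers is an assumption.

```python
def plan_batches(hosts, max_concurrent):
    """Split a tier's hosts into batches of at most max_concurrent hosts."""
    return [hosts[i:i + max_concurrent] for i in range(0, len(hosts), max_concurrent)]

def run_service_group(tiers, patch_host):
    """Patch tiers in order; pause as soon as a tier misses its success gate."""
    results = {}
    for tier in tiers:
        hosts = tier["hosts"]
        if not hosts:
            continue  # assumption: an empty tier is skipped
        succeeded = 0
        for batch in plan_batches(hosts, tier["max_concurrent"]):
            for host in batch:
                ok = patch_host(host)
                results[host] = ok
                if ok:
                    succeeded += 1
        # Integer comparison avoids floating-point rounding at the gate boundary.
        if succeeded * 100 < tier["success_gate"] * len(hosts):
            return "paused", results
    return "completed", results

tiers = [
    {"name": "Web Servers", "hosts": ["web1", "web2", "web3"], "max_concurrent": 2, "success_gate": 100},
    {"name": "App Servers", "hosts": ["app1"], "max_concurrent": 1, "success_gate": 100},
    {"name": "Database", "hosts": ["db1"], "max_concurrent": 1, "success_gate": 100},
]
status, results = run_service_group(tiers, patch_host=lambda h: h != "app1")
print(status, results)
```

Because `app1` fails and tier 1 has a 100% gate, the run pauses before the database tier is touched, which is the protection that tier ordering is meant to provide.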
Managing available patches
As agents report pending patches, they appear in Patch Management > Available Patches. Patches matching auto-approve rules are approved automatically (shown as "System" in the approved-by column). Others stay in "Pending" status for manual review.
Manual actions
| Action | Effect |
|---|---|
| Approve | Marks the patch as approved for deployment |
| Reject | Rejects the patch with a required reason. It will not be deployed. |
| Defer | Defers the patch until a specified date. It returns to pending after the deferral expires. |
| Bulk Approve | Approve multiple patches at once |
| Bulk Reject | Reject multiple patches at once |
Creating a deployment
If auto-deploy is enabled on a ring set, deployments create automatically when patches are approved. For manual deployments:
- Navigate to Patch Management > Deployments
- Click Create Deployment
- Select the ring set
- Choose patches to include
- Configure overrides if needed (max host retries, max duration, pre-download, script hooks)
- Click Save -- deployment starts in Draft status
- Click Start Deployment
Deployment options
Deployment lifecycle
When you start a deployment, the engine processes it through the following phases:
- Ring 0 expansion: The engine expands ring 0 members into individual host rows, skipping empty rings automatically
- Canary selection: The first N hosts are marked as canaries
- Canary patching: Canary hosts begin the per-host pipeline immediately
- Canary wait: After all canaries complete, the system waits for the configured canary wait period
- Remaining ring hosts: Non-canary hosts begin patching (respecting install schedule mode)
- Success gate check: When all ring hosts finish, the success rate is evaluated against the configured success gate percentage
- Cooloff: The system waits for the configured cooloff period before advancing
- Variance detection: For ring N > 0, if patches exist that were not tested in any previous ring and "Block Untested Patches" is enabled, deployment enters "Variance Approval Required" status
- Next ring: The eight phases above (ring expansion through variance detection) repeat for each subsequent ring
- Completion: When all rings finish, deployment moves to "Completed" (or "Completed with Failures" if any hosts failed/timed out/were skipped)
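Two of the phases above, canary selection and the success gate check, reduce to simple arithmetic. The sketch below illustrates them under stated assumptions: "first N hosts" is taken to mean the first N in expansion order, and an empty ring is treated as passing because the lifecycle skips empty rings. The function names are hypothetical.

```python
def select_canaries(hosts, count):
    """Mark the first `count` hosts as canaries; the rest wait for the canary gate."""
    return hosts[:count], hosts[count:]

def passes_success_gate(succeeded, total, gate_percent):
    """True when the ring's success rate meets the configured gate percentage."""
    if total == 0:
        return True  # assumption: empty rings are skipped, so nothing can fail
    # Compare as integers to avoid floating-point rounding at the boundary.
    return succeeded * 100 >= gate_percent * total

canaries, remaining = select_canaries(["h1", "h2", "h3", "h4", "h5"], count=2)
print(canaries, remaining)
print(passes_success_gate(19, 20, 95))  # 95% meets a 95% gate
print(passes_success_gate(18, 20, 95))  # 90% does not
```

With a 95% gate, a 20-host ring tolerates exactly one failed host; a second failure stops the ring from advancing.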
Per-host pipeline
Each host goes through this sequence (phases are driven by job completion callbacks):
| Phase | Host Status | Job Type | Notes |
|---|---|---|---|
| Pre-download | Downloading | Patch download | Optional. Downloads patches before MW. |
| MW enforcement | Scheduled | -- | Waits for MW if required by ring schedule |
| Pre-flight | Pre-flight | Pre-flight check | Captures system state, checks disk space |
| Snapshot | Snapshot | VM snapshot | Optional. VM snapshot before patching. |
| Pre-script | Pre-script | Script | Optional. Custom pre-patch script. |
| Install | Installing | Patch install | OS-specific patch installation |
| Post-script | Post-script | Script | Optional. Custom post-patch script. |
| Post-flight | Validating | Post-flight check | Captures post-install state for diff |
| State diff | -- | -- | Compares pre/post fingerprints |
| Synthetic test | -- | Script | Optional. Application validation. |
| Done | Completed | -- | Triggers ring advancement check |
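As a reading aid for the table, the sketch below derives the phases a given host would pass through from the options that are enabled. The option keys (`pre_download`, `mw_required`, and so on) are hypothetical names chosen for the example; the phase names and their order come from the table.

```python
PIPELINE = [
    # (phase, option that enables it; None means the phase always runs)
    ("Pre-download", "pre_download"),
    ("MW enforcement", "mw_required"),
    ("Pre-flight", None),
    ("Snapshot", "snapshot"),
    ("Pre-script", "pre_script"),
    ("Install", None),
    ("Post-script", "post_script"),
    ("Post-flight", None),
    ("State diff", None),
    ("Synthetic test", "synthetic_test"),
    ("Done", None),
]

def phases_for(options: set[str]) -> list[str]:
    """Return the ordered phases for a host, dropping optional phases that are off."""
    return [phase for phase, flag in PIPELINE if flag is None or flag in options]

print(phases_for(set()))
print(phases_for({"mw_required", "snapshot"}))
```

A host with no optional features enabled passes through only Pre-flight, Install, Post-flight, State diff, and Done.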
Monitoring deployment progress
The deployment detail page shows real-time status across all rings and hosts.
Deployment statuses
Host statuses to watch for
Variance detection
If a higher ring has patches that were not tested in any previous ring and the policy has "Block Untested Patches" enabled, the deployment enters "Variance Approval Required" status. Click Approve Variance to continue, or disable "Block Untested Patches" in the policy.
Deployment actions
| Action | When available | Effect |
|---|---|---|
| Pause | Running | Pauses the deployment. In-progress hosts finish their current phase. No new hosts start. |
| Resume | Paused | Resumes processing from where it stopped. |
| Cancel | Running, Paused | Cancels the deployment with optional reason. In-progress hosts finish their current phase. Pending hosts are skipped. |
| Redeploy | Cancelled | Resets to Draft status. All host rows are deleted and recreated on start. |
| Approve Variance | Variance required | Approves untested patches for the current ring and resumes deployment. |
| Skip Host | Any non-terminal host | Skips a single host with optional reason. Requires patches.deploy. Allows the ring to advance. |
| Retry Host | Failed host | Resets the host to "Pending" and restarts the full pipeline from scratch. |
| Rollback Host | Completed or failed host | Creates a patch uninstall job that reverses patches in reverse install order. Host transitions through "Rolling Back" to "Rolled Back". |
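The "When available" column for the deployment-level actions amounts to a lookup from status to permitted actions. The sketch below encodes it that way; the lowercase status and action identifiers are assumptions, and host-level actions (Skip, Retry, Rollback) are omitted because they depend on per-host status rather than deployment status.

```python
# Deployment-level actions and the deployment statuses in which each is offered.
DEPLOYMENT_ACTIONS = {
    "pause": {"running"},
    "resume": {"paused"},
    "cancel": {"running", "paused"},
    "redeploy": {"cancelled"},
    "approve_variance": {"variance_approval_required"},
}

def available_actions(status: str) -> list[str]:
    """Return the actions permitted for a deployment in the given status."""
    return sorted(a for a, statuses in DEPLOYMENT_ACTIONS.items() if status in statuses)

for status in ("running", "paused", "cancelled", "draft"):
    print(status, available_actions(status))
```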
Auto-retry on failure
When a host fails during any pipeline phase, the system automatically retries up to the configured Max Host Retries (default: 1).
- Retry resets the host to "Pending" and restarts the entire pipeline from scratch (download, preflight, install, etc.)
- All per-run state (fingerprints, jobs, error messages) is cleared
- Retries are tracked and visible on the host detail
- Auto-retry does NOT apply to rollback failures (prevents infinite loops)
- The circuit breaker still fires for install-phase failures even during retries
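The retry rules above can be sketched as a single failure handler. The `HostRun` record and its field names are hypothetical, and setting the status to "failed" once retries are exhausted is an assumption about how the final state is represented.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class HostRun:
    status: str = "installing"
    retries: int = 0                      # kept across retries so it stays visible
    error: Optional[str] = None
    jobs: list = field(default_factory=list)
    fingerprints: dict = field(default_factory=dict)

def handle_failure(host: HostRun, phase: str, error: str, max_host_retries: int = 1) -> bool:
    """Reset the host for a retry if allowed; return True when a retry was scheduled."""
    if phase != "rollback" and host.retries < max_host_retries:
        host.retries += 1
        host.status = "pending"           # the full pipeline restarts from scratch
        host.error = None                 # per-run state is cleared
        host.jobs.clear()
        host.fingerprints.clear()
        return True
    host.status = "failed"                # rollback failure or retries exhausted
    host.error = error
    return False

host = HostRun(jobs=["install-job-1"])
print(handle_failure(host, "install", "exit code 1"), host.status, host.retries)
print(handle_failure(host, "install", "exit code 1"), host.status, host.retries)
```

With the default of one retry, the first failure sends the host back to "Pending" and the second leaves it failed.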
Circuit breaker
The circuit breaker protects against widespread failures from a bad patch. When the same patch (KB) fails on 3 or more hosts across the entire account (the default threshold), the system auto-rejects that KB in all organizations.
- Threshold is configurable per policy via the Global Failure Threshold setting
- Counts failed and rolled-back hosts for the same KB across all account organizations
- Once triggered, the KB is rejected and no further deployment attempts are made
- To re-deploy after fixing the root cause, manually re-approve the KB
Deployment auto-cancellation
Two automatic cancellation mechanisms protect against stuck deployments:
| Mechanism | Default | Behavior |
|---|---|---|
| Max duration | 72 hours | Deployments running longer than the configured Max Duration are auto-cancelled. Leave blank to disable. |
| Paused escalation | 24 hours | Deployments paused longer than the configured Paused Escalation time trigger an alert and a notification, and are then auto-cancelled. Leave blank to disable. |
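The two limits in the table can be expressed as one check, sketched below. The function and parameter names are hypothetical, `None` models a blank (disabled) limit, and applying Max Duration only while the deployment is running is an assumption, since the table does not say whether paused time counts toward it.

```python
from datetime import datetime, timedelta
from typing import Optional

def auto_cancel_reason(status: str, now: datetime, started_at: datetime,
                       paused_at: Optional[datetime] = None,
                       max_duration_hours: Optional[float] = 72,
                       paused_escalation_hours: Optional[float] = 24) -> Optional[str]:
    """Return which limit was exceeded, or None if the deployment may continue."""
    if (status == "running" and max_duration_hours is not None
            and now - started_at > timedelta(hours=max_duration_hours)):
        return "max_duration"
    if (status == "paused" and paused_at is not None
            and paused_escalation_hours is not None
            and now - paused_at > timedelta(hours=paused_escalation_hours)):
        return "paused_escalation"
    return None

start = datetime(2025, 6, 1, 0, 0)
print(auto_cancel_reason("running", start + timedelta(hours=73), start))
print(auto_cancel_reason("paused", start + timedelta(hours=35), start,
                         paused_at=start + timedelta(hours=10)))
```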
MW auto-skip
Hosts using the "Maintenance Window" install schedule that have been in "Pending" status for over 48 hours with no maintenance window configured are automatically skipped. This prevents misconfigured hosts from blocking an entire ring indefinitely.
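Expressed as a predicate, the auto-skip rule looks roughly like this. The parameter names and the lowercase mode and status strings are assumptions made for the sketch; the 48-hour limit is the one stated above.

```python
from datetime import datetime, timedelta

def should_auto_skip(install_mode: str, status: str, pending_since: datetime,
                     has_window: bool, now: datetime, limit_hours: int = 48) -> bool:
    """True when a pending host has waited too long with no maintenance window."""
    return (install_mode == "maintenance_window"
            and status == "pending"
            and not has_window
            and now - pending_since > timedelta(hours=limit_hours))

since = datetime(2025, 6, 1, 8, 0)
print(should_auto_skip("maintenance_window", "pending", since, False, since + timedelta(hours=49)))
print(should_auto_skip("maintenance_window", "pending", since, True, since + timedelta(hours=49)))
```

A host that does have a window configured is never auto-skipped by this rule, no matter how long it waits for the window to open.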
Post-deployment
Completion summary
When all rings complete, the deployment builds a completion summary including per-ring success rates, failed host details, and total patches installed. The deployment status is:
- "Completed" -- all hosts succeeded
- "Completed with Failures" -- some hosts failed, timed out, or were skipped
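A completion summary of this kind could be assembled as sketched below. The input shape (ring name mapped to per-host final status) and the output keys are hypothetical, and the total of installed patches is left out because the example carries no per-patch data.

```python
def build_summary(rings: dict[str, dict[str, str]]) -> dict:
    """Compute per-ring success rates, problem hosts, and the final status."""
    per_ring = {}
    problem_hosts = []
    for ring, hosts in rings.items():
        ok = sum(1 for s in hosts.values() if s == "completed")
        per_ring[ring] = round(100 * ok / len(hosts), 1) if hosts else None
        # Failed, timed-out, and skipped hosts all count against a clean completion.
        problem_hosts += [(ring, h, s) for h, s in hosts.items() if s != "completed"]
    status = "Completed with Failures" if problem_hosts else "Completed"
    return {"status": status, "success_rates": per_ring, "problem_hosts": problem_hosts}

summary = build_summary({
    "Canary": {"h1": "completed", "h2": "completed"},
    "Production": {"h3": "completed", "h4": "failed", "h5": "skipped", "h6": "completed"},
})
print(summary)
```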
Reboot handling
Reboot behavior is per-ring. After patches install, the reboot policy determines when the host reboots. If set to "Maintenance Window", the reboot is scheduled for the next MW. If set to "Manual", the operator is responsible for rebooting.
Deployment lifecycle flow
Permissions reference
| Permission | Grants |
|---|---|
| patches.view | View policies, ring sets, available patches, deployments, maintenance windows |
| patches.manage | Create/update/delete policies, ring sets, deployments. Approve/reject patches. Start/pause/cancel deployments. Manage maintenance windows. |
| patches.deploy | Skip individual deployment hosts |
| hosts.view | View service groups |
| hosts.manage | Create/update/delete service groups and tiers |
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Patches not appearing | Agent not sending pending patch data | Check agent version is 1.7.0 or later, verify heartbeat is active |
| Auto-approve not working | No active policy, or classification mismatch | Verify the policy is enabled, check classification match |
| Host stuck in "Pending" | Not in maintenance window, or canary gate not cleared | Check MW targeting. Check canary status on deployment detail. |
| Host stuck in "Scheduled" | MW has not opened yet | Check the scheduled time on the host detail. Verify MW schedule with preview. |
| Deployment stuck | Ring hosts in transitional state | The engine runs every 300 seconds, so allow up to 5 minutes for state changes. Check host error messages. Orphan recovery runs automatically. |
| "Variance Approval Required" | Higher ring has untested patches | Approve variance on deployment detail, or disable "Block Untested Patches" in the policy |
| Circuit breaker triggered | 3+ failures for same KB across account | Review failure reasons. Manually re-approve KB after fixing root cause. |
| Host blocked | Pre-flight check detected a blocking condition (disk space) | Resolve the condition. The system retries automatically once the alert resolves. |
| MW times wrong | Timezone mismatch | Verify the maintenance window timezone matches the expected timezone |
| Host stuck in "Downloading" | Download job completed but callback missed | Orphan recovery detects this. Wait for the next engine cycle (approximately 5 minutes). |
| Host auto-skipped after 48h | No maintenance window configured | Assign a maintenance window to the host's group, or use the "Immediate" install schedule |
| "Completed with Failures" | Some hosts failed/timed out/skipped | Review failed hosts on deployment detail. Retry or investigate. |
| Service group tier not advancing | Prior tier incomplete or failed success gate | Check prior tier hosts. All must be in terminal state. |
| "Deployment auto-cancelled" | Exceeded Max Duration or Paused Escalation time limit | Increase limits, or investigate why it is taking too long / why it is paused. |