Prerequisites
- User role with agent_updates.view (read) or agent_updates.deploy (deploy/cancel/rollback)
- User role with software.view (read) or software.manage (write/deploy) for software library
- Target hosts must be running the Go agent (not legacy), have an active agent, and not be in decommissioned status
- Agent version must have an active or completed publication for your account
Understanding Agent Versions
SPOG runs two agent types. Only the Go agent supports automated updates.
| Agent | Language | Auto-Update | Notes |
|---|---|---|---|
| Go Agent | Go 1.21+ | Yes | Primary agent. Ed25519 signed. Zero-downtime update via dual-process switchover. |
| Legacy Agent | C | No | Reports as legacy-X.Y.Z. Must be manually replaced. Excluded from rollouts. |
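
As a rough illustration of how a script might separate the two agent types from reported version strings, here is a minimal sketch. It assumes that X, Y, and Z in legacy-X.Y.Z are plain integers; the function names are hypothetical and are not part of SPOG.

```python
import re

# Assumption: the legacy agent reports "legacy-" followed by three
# dot-separated integers. Only the "legacy-X.Y.Z" shape comes from the table above.
LEGACY_VERSION = re.compile(r"^legacy-(\d+)\.(\d+)\.(\d+)$")


def is_legacy_agent(reported_version: str) -> bool:
    """Return True if the reported version string has the legacy-X.Y.Z shape."""
    return LEGACY_VERSION.match(reported_version) is not None


def supports_auto_update(reported_version: str) -> bool:
    """Only non-legacy (Go) agents are eligible for automated updates."""
    return not is_legacy_agent(reported_version)
```

A call such as supports_auto_update("legacy-1.4.2") returns False, which mirrors the exclusion of legacy agents from rollouts.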
Agent Update Lifecycle
Updates use a dual-process switchover to avoid downtime. The old agent launches the new binary as a separate process, the backend verifies that the new agent is healthy, and only then does the switchover take place.
Update State Machine
What Happens at Each Stage
Deploying Agent Updates (Manual)
- Navigate to Agent Updates > Versions to view available versions. Filter to stable versions only for production environments.
- Select a version and click Deploy. Choose the target hosts and set the priority.
- The version must have an active or completed publication for your account.
- Each host receives an update job. A host is automatically skipped if any of the following is true:
  - The host is already on the target version.
  - The host has a pending update in progress.
  - No binary is available for the host's OS/architecture.
- The deployment summary shows which hosts received update jobs and which were skipped (with reasons).
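
The skip rules above can be sketched as a small planning function. This is an illustration only: the Host fields, the reason strings, and the plan_deployment name are assumptions, not the backend's actual data model or API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Host:
    # Hypothetical host record; field names are assumptions.
    name: str
    agent_version: str
    os_type: str
    arch: str
    has_pending_update: bool = False


def skip_reason(host: Host, target_version: str,
                binaries: Set[Tuple[str, str]]) -> Optional[str]:
    """Return why a host would be skipped, or None if it should get a job."""
    if host.agent_version == target_version:
        return "already on target version"
    if host.has_pending_update:
        return "pending update in progress"
    if (host.os_type, host.arch) not in binaries:
        return "no binary for OS/architecture"
    return None


def plan_deployment(hosts: List[Host], target_version: str,
                    binaries: Set[Tuple[str, str]]):
    """Split hosts into those receiving a job and those skipped (with reasons)."""
    jobs: List[str] = []
    skipped: Dict[str, str] = {}
    for host in hosts:
        reason = skip_reason(host, target_version, binaries)
        if reason is None:
            jobs.append(host.name)
        else:
            skipped[host.name] = reason
    return jobs, skipped
```

The returned pair corresponds to the two halves of the deployment summary: hosts that received update jobs, and skipped hosts with their reasons.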
Wave-Based Rollout (Publications)
Platform operators create publications that roll out a version across accounts in waves. Publications are read-only for tenants: you can view progress but cannot create or modify them.
Rollout Waves
The automated rollout engine runs every 60 seconds and processes active publications:
- Canary wave — small batch of hosts (typically 1-5%) to validate the update.
- Early adopter wave — broader batch, still limited.
- Broad wave — majority of hosts.
- Full rollout — remaining hosts.
Between waves, the engine evaluates the failure rate. If it exceeds the configured threshold, the publication is paused and requires operator intervention to resume.
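
The between-wave check can be pictured with the sketch below. It assumes the failure rate is the number of failed hosts divided by the number of finished hosts (updated plus failed) and that the threshold is a fraction between 0 and 1. The engine's real formula and threshold format are not documented here, so treat both as assumptions.

```python
def failure_rate(updated: int, failed: int) -> float:
    """Fraction of finished hosts that failed; 0.0 when nothing has finished."""
    finished = updated + failed
    return failed / finished if finished else 0.0


def next_action(updated: int, failed: int, threshold: float) -> str:
    """Return "pause" if the failure rate exceeds the threshold, else "advance"."""
    if failure_rate(updated, failed) > threshold:
        return "pause"
    return "advance"
```

With a threshold of 0.10, a wave with 95 updated and 5 failed hosts would advance, while one with 80 updated and 20 failed would pause. In this sketch a rate exactly equal to the threshold does not pause, because the text says the rate must exceed it.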
Publication State Machine
Viewing Publications
Navigate to Agent Updates > Publications to view rollouts targeting your account. Each publication shows its version, status, and progress (total targets, updated, failed, in progress, and current wave).
Monitoring Rollout Progress
Update Jobs
- Navigate to Agent Updates > Update Jobs to view all update jobs. Filter by status or host.
- Each job shows: source version, target version, current status, timestamps for each phase, and success/failure result.
Aggregate Statistics
- The Update Statistics view shows success rate and counts by status.
- The Version Distribution view shows host counts per agent version across your fleet.
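
If you export job and host data and want to reproduce these two views offline, the arithmetic might look like the sketch below. The status names "completed" and "failed", and the choice to compute success rate over finished jobs only, are assumptions; the product may define the rate differently.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def version_distribution(agent_versions: Iterable[str]) -> Dict[str, int]:
    """Count hosts per reported agent version."""
    return dict(Counter(agent_versions))


def success_rate(job_statuses: Iterable[str]) -> Optional[float]:
    """Completed jobs divided by finished jobs; None if no job has finished."""
    counts = Counter(job_statuses)
    finished = counts["completed"] + counts["failed"]
    if finished == 0:
        return None
    return counts["completed"] / finished
```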
Cancel or Rollback
| Action | How | Constraint |
|---|---|---|
| Cancel | Click Cancel on an update job | Only pending or queued updates. No agent-side action needed. |
| Force Rollback | Click Rollback on an update job | Cannot roll back a completed update (the old agent has already shut down). |
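
The constraints in the table can be expressed as two small guards, for example in a script that pre-checks a job before calling the UI or API. The status strings are taken from this page; the function names are hypothetical, and the rollback guard encodes only the one constraint stated above.

```python
# Statuses in which an update job can still be cancelled, per the table above.
CANCELLABLE_STATUSES = {"pending", "queued"}


def can_cancel(status: str) -> bool:
    """Cancel applies only to pending or queued update jobs."""
    return status in CANCELLABLE_STATUSES


def can_force_rollback(status: str) -> bool:
    """A completed update cannot be rolled back (old agent already shut down).

    Assumption: no other status blocks a rollback; the page states only
    the "completed" constraint.
    """
    return status != "completed"
```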
Software Inventory
The software library is a global catalog for managing third-party software deployed to hosts.
Catalog Management
- Navigate to Software Library > Create Entry. Enter vendor, name, description, license type, and product family. The vendor + name combination must be unique.
- Add versions to the software entry. Marking a version as "Latest" automatically clears the flag from the previous latest version.
- Add installers per version, specifying: OS type, architecture, installer URL, SHA hash, silent install arguments, and whether a reboot is required.
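
The "Latest" behavior described above amounts to keeping exactly one flagged version per software entry. A minimal sketch, assuming a hypothetical in-memory SoftwareVersion record rather than the catalog's real schema:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SoftwareVersion:
    # Hypothetical record; the real catalog schema is not shown in this guide.
    version: str
    is_latest: bool = False


def mark_latest(versions: List[SoftwareVersion], version: str) -> None:
    """Flag one version as latest and clear the flag on all others."""
    if not any(v.version == version for v in versions):
        raise ValueError(f"unknown version: {version}")
    for v in versions:
        v.is_latest = (v.version == version)
```

Validating before mutating means an unknown version leaves the existing flag untouched.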
Software Deployment
- Select a software version and click Deploy. Choose the target hosts and set the priority.
- Optionally select a specific installer to override auto-selection. Otherwise, the system automatically matches the installer to each host's operating system.
- Install jobs are created with automatic retry (2 retries, 5-minute delay between attempts).
- The agent downloads the installer and runs it silently with the configured install arguments.
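
The installer auto-selection and retry policy above can be sketched as follows. The Installer fields and function names are assumptions for illustration, matching is done on os_type only (as the troubleshooting table suggests), and the example.invalid URLs are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

MAX_RETRIES = 2                      # from the text: 2 retries
RETRY_DELAY = timedelta(minutes=5)   # from the text: 5-minute delay


@dataclass
class Installer:
    # Hypothetical record; field names are assumptions.
    os_type: str
    architecture: str
    url: str


def select_installer(installers: List[Installer], host_os_type: str,
                     override: Optional[Installer] = None) -> Optional[Installer]:
    """Use the explicit override if given, else the first installer matching the host OS."""
    if override is not None:
        return override
    for installer in installers:
        if installer.os_type == host_os_type:
            return installer
    return None  # no match: the host would be skipped


def next_attempt_at(failed_at: datetime, retries_used: int) -> Optional[datetime]:
    """Return when to retry after a failure, or None once retries are exhausted."""
    if retries_used >= MAX_RETRIES:
        return None
    return failed_at + RETRY_DELAY
```

Under this reading, an install job runs at most three times: the initial attempt plus two retries.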
License Tracking
- Navigate to the software entry and click Add License to create a license record (scoped to an organization).
- Track total licenses, used licenses, expiration date, and cost.
- Available licenses are computed automatically (total minus used). Expired status is derived from the expiration date.
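
The two derived values can be illustrated with a small record type. The field names are hypothetical, and the sketch assumes a license counts as expired only after its expiration date has passed; the product may treat the expiration day itself differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class License:
    # Hypothetical record; field names are assumptions.
    total: int
    used: int
    expires_on: Optional[date] = None

    @property
    def available(self) -> int:
        """Available licenses: total minus used."""
        return self.total - self.used

    def is_expired(self, today: date) -> bool:
        """Expired once today is later than the expiration date (assumption)."""
        return self.expires_on is not None and today > self.expires_on
```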
Permissions Reference
| Permission | Grants |
|---|---|
| agent_updates.view | List/get versions, binaries, update jobs, publications, statistics. |
| agent_updates.deploy | Deploy versions to hosts, cancel pending updates, force rollback. |
| software.view | List/get software catalog, versions, installers, licenses. |
| software.manage | Create/update/delete software entries, versions, installers, licenses. Deploy software to hosts. |
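
For scripting against these permissions, the table can be treated as a lookup from action to required permission. The action names below are hypothetical labels invented for this sketch. It also assumes that a deploy or manage permission does not implicitly grant the matching view permission, which this page does not specify.

```python
from typing import Set

# Hypothetical action labels mapped to the permissions from the table above.
REQUIRED_PERMISSION = {
    "list_versions": "agent_updates.view",
    "list_update_jobs": "agent_updates.view",
    "deploy_version": "agent_updates.deploy",
    "cancel_update": "agent_updates.deploy",
    "force_rollback": "agent_updates.deploy",
    "list_software": "software.view",
    "create_software_entry": "software.manage",
    "deploy_software": "software.manage",
}


def is_allowed(role_permissions: Set[str], action: str) -> bool:
    """Return True if the role holds the permission required for the action."""
    required = REQUIRED_PERMISSION.get(action)
    return required is not None and required in role_permissions
```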
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Version not visible | No publication for your account | Contact platform admin to publish the version to your account. |
| Deploy returns 403 | Version not published to account | Verify a publication exists and is active or completed for your account. |
| All hosts skipped | No binary for OS/architecture | Upload a binary matching the target hosts' platform. |
| Update stuck in downloading | Agent offline or binary URL unreachable | Verify host is online and can reach the backend. |
| Update stuck in health_checking | New agent not heartbeating | Check if the new agent process is running on the host. Review agent logs. |
| Publication paused | Wave failure rate exceeded threshold | Review failed hosts, fix issues, then resume the publication. |
| Rollback fails | Update already completed | Cannot roll back; the old agent has already shut down. |
| Software deploy returns 400 | No installers for version | Add an installer for the target OS before deploying. |
| Software deploy skips hosts | Host OS doesn't match any installer | Verify installer os_type matches host os_type. |
| Legacy agents not updating | Legacy agents excluded from rollout | Legacy agents must be manually replaced with the Go agent. |
| Agent registration 422 error | Stale agent ID file on host | Delete /var/lib/spog-agent/agent_id and restart the agent. |
| Agent heartbeat 401 | Wrong organization secret | Verify organization_secret in agent config matches the org. |