Host & Agent Management
Enrollment, host lifecycle, agent health, and the workflows teams rely on to keep managed endpoints trustworthy.
Scope
If enrollment and host ownership are fuzzy, every downstream operation becomes harder to trust. This page keeps the operator-facing lifecycle model and strips private API or setup references.
Single Source of Truth: Cadres Host & Agent Operator Manual
Covers: agent deployment, host viewing, host groups, remote operations, tags, key rotation, agent configuration
Deploying the Agent
Prerequisites
- Organization created with a secret
- Go agent binary for the target OS/architecture
Get the Organization Secret
Each organization has a unique secret used for agent authentication. Find it in the organization settings or via:
The secret field is the value the agent needs.
What Happens During Registration
The agent will:
3. Begin sending heartbeats every 60 seconds
4. Start collecting and reporting host information
Verifying Agent Health
After installation, verify the agent is communicating:
Viewing Hosts
List all hosts:
Get host details:
Returns comprehensive host information, including:
- System info (OS, kernel, architecture)
- Hardware (CPU, RAM, manufacturer, serial)
- Network interfaces and IPs
- Services and their status
- Local users and groups
- Drives and storage
- Security status (firewall, antivirus)
- Installed patches and software
Host details now open on the Metrics tab by default so current health status is immediately visible. The Refresh Health action is shown only to users with the hosts.manage permission.
Editing Host Metadata
You can update a host’s description, notes, and location assignment:
Only provided fields are updated (partial update). Omitted fields are left unchanged.
Updating location: The location must belong to the same organization as the host.
Requires hosts.manage permission.
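The partial-update and same-organization rules above can be sketched roughly as follows. This is an illustration only: the field names (`description`, `notes`, `location_id`, `org_id`), the `location_orgs` lookup, and the function name are assumptions, not the platform's actual schema or API.

```python
# Hypothetical editable fields; the manual only says description, notes,
# and location assignment can be updated.
EDITABLE_FIELDS = {"description", "notes", "location_id"}


def apply_host_patch(host, patch, location_orgs):
    """Return a copy of `host` with only the provided fields changed.

    `location_orgs` maps a location ID to the organization that owns it
    (an assumed lookup, used here to model the same-organization rule).
    """
    unknown = set(patch) - EDITABLE_FIELDS
    if unknown:
        raise ValueError(f"fields not editable: {sorted(unknown)}")

    new_location = patch.get("location_id")
    if new_location is not None:
        # The location must belong to the same organization as the host.
        if location_orgs.get(new_location) != host["org_id"]:
            raise ValueError("location belongs to a different organization")

    updated = dict(host)
    updated.update(patch)  # fields omitted from the patch stay unchanged
    return updated
```

Sending only `{"description": "new"}` changes the description and leaves notes and location as they were.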
Managing Host Groups
Host groups organize hosts for batch operations, monitoring, and RBAC scoping.
Create a host group:
Add hosts to group:
Remove hosts from group:
The dependency graph is mixed-mode: manual edges can link host-to-host, host-to-group, or group-to-group records. The topology view also surfaces standalone hosts that only appear through dependencies.
When creating a dependency, select “Host” or “Group” for source and target types. The selector dropdown switches between actual hosts and host groups accordingly. Use the search box above each dropdown to filter by name in large environments. The backend validates that all IDs reference the correct entity type before saving.
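As a rough sketch of the entity-type validation described above, assuming the backend can look up the sets of known host IDs and host group IDs (the function name and arguments are hypothetical):

```python
def validate_dependency(source_type, source_id, target_type, target_id,
                        host_ids, group_ids):
    """Return a list of validation errors; an empty list means the edge is valid.

    `host_ids` and `group_ids` are assumed sets of existing record IDs.
    """
    known = {"host": host_ids, "group": group_ids}
    errors = []
    for role, kind, ident in (("source", source_type, source_id),
                              ("target", target_type, target_id)):
        if kind not in known:
            errors.append(f"{role}: unknown type {kind!r}")
        elif ident not in known[kind]:
            # The ID exists under a different entity type, or not at all.
            errors.append(f"{role}: {ident!r} is not a {kind}")
    return errors
```

A host-to-group edge passes only if the source ID is a real host and the target ID is a real host group.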
Remote Host Operations
All remote operations are dispatched as agent jobs. The backend queues the job, and the agent picks it up on its next poll.
Service control:
Process control:
Actions: kill, force_kill
Network diagnostics:
Diagnostics polling automatically retries transient status-fetch failures. If polling still cannot recover, the page shows an inline error with a Retry Polling action.
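The retry behavior can be pictured with a small sketch like the one below. The retry count, delay, and the `TransientError` type are assumptions for illustration; the manual does not state the actual values.

```python
import time


class TransientError(Exception):
    """Stand-in for a recoverable status-fetch failure (hypothetical)."""


def poll_status(fetch, max_retries=3, delay_seconds=1.0, sleep=time.sleep):
    """Call `fetch` until it succeeds or the retry budget is used up.

    When the budget is exhausted the last error is re-raised, which is the
    point where a UI would show an inline error and a manual retry action.
    """
    attempts = 0
    while True:
        try:
            return fetch()
        except TransientError:
            attempts += 1
            if attempts > max_retries:
                raise
            sleep(delay_seconds)
```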
User management:
Requires hosts.manage_users permission.
Remote Access Sessions
The platform provides three remote access channels: terminal, file browser, and desktop. All use WebSocket connections with first-message JWT authentication (tokens are never passed in URL query parameters).
When the UI and API are hosted on different origins, clients must open these WebSockets against the API origin, not the current page origin.
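A client-side sketch of these two rules, deriving the WebSocket URL from the API origin and sending the token in the first message rather than in the URL. The path and the message shape (`type`, `token`) are assumptions; the real endpoint and payload are not documented on this page.

```python
import json
from urllib.parse import urlsplit, urlunsplit


def build_ws_url(api_origin, path):
    """Build a ws/wss URL on the API origin, with no query string."""
    parts = urlsplit(api_origin)
    scheme = "wss" if parts.scheme == "https" else "ws"
    return urlunsplit((scheme, parts.netloc, path, "", ""))


def first_auth_message(jwt_token):
    """Serialize the auth message sent first after the socket opens.

    The field names here are hypothetical.
    """
    return json.dumps({"type": "auth", "token": jwt_token})
```

Because the token travels in the message body, it does not end up in proxy logs or browser history the way a query parameter would.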
PAM checkouts can launch into these same Host Details and Remote Desktop flows. That handoff keeps the checkout token in in-memory navigation state rather than the URL, and the normal remote-access readiness contract still decides whether the session can start.
The Host Details page now uses the backend-provided remote_access_status contract to decide whether Terminal, Files, and Desktop are available. A host being online is not enough by itself. The UI shows explicit blocked reasons when permission, feature flags, host feature overrides, tunnel connectivity, or agent capability declaration prevent a channel from starting.
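To illustrate why an online host is not sufficient on its own, here is a simplified model of a readiness check. The real remote_access_status payload is produced by the backend and its shape is not shown on this page, so the `ChannelInputs` fields and reason strings below are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ChannelInputs:
    """Hypothetical inputs a readiness decision might depend on."""
    host_online: bool
    has_permission: bool
    org_feature_enabled: bool
    host_feature_enabled: bool
    tunnel_connected: bool
    agent_declares_capability: bool


def blocked_reasons(c):
    """Return every reason the channel cannot start; empty means available."""
    checks = [
        (c.host_online, "host offline"),
        (c.has_permission, "permission denied"),
        (c.org_feature_enabled, "feature disabled for the organization"),
        (c.host_feature_enabled, "feature disabled for the host"),
        (c.tunnel_connected, "tunnel disconnected"),
        (c.agent_declares_capability, "capability manifest missing"),
    ]
    return [reason for ok, reason in checks if not ok]
```

Collecting all failing checks, instead of stopping at the first, lets the UI show every explicit blocked reason at once.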
Terminal
Credential modes:
- agent_user (default): Run as the agent’s system user
- su: Switch to a specified user (credentials message sent after auth)
- pam_checkout: Use a checked-out PAM credential (include pam_session_token in auth message). The checked-out username/password is used to launch the shell under that identity on the agent, not as the agent service user.
The browser does not treat the terminal as connected on WebSocket open alone. It waits for explicit session readiness from the backend/agent path before showing the terminal as live. If startup, credential switch, timeout, or tunnel teardown fails, the operator-facing error identifies the real failing boundary instead of collapsing to a generic disconnect.
Sessions auto-close after 8 hours (max duration) or 30 minutes of inactivity.
File Browser
Capabilities: Directory listing, file read/write, upload/download, create/delete/rename. Blocks access to sensitive files (the relevant workflow, private keys).
Sessions auto-close after 4 hours (max duration) or 15 minutes of inactivity.
Desktop
The desktop path does not silently downgrade a requested rdp session to console, and it does not cosmetically rewrite desktop auth modes to another name. Unsupported or runtime-blocked requests fail closed with the agent-advertised reason surfaced to the operator.
The browser also keeps the session in connecting state until the agent confirms desktop_ready. If the agent reports a different desktop mode than the one requested, the UI treats that as a fatal contract violation and immediately closes the session instead of continuing under the wrong mode.
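The connecting-until-ready behavior and the mode check can be sketched as a tiny state machine. The class, method names, and state strings are illustrative assumptions; only desktop_ready and the fail-closed rule come from the text above.

```python
class DesktopSession:
    """Minimal client-side model of a desktop session's lifecycle."""

    def __init__(self, requested_mode):
        self.requested_mode = requested_mode
        self.state = "connecting"
        self.error = None

    def on_socket_open(self):
        # An open WebSocket alone does not make the session live.
        self.state = "connecting"

    def on_desktop_ready(self, reported_mode):
        if reported_mode != self.requested_mode:
            # Contract violation: close instead of continuing in the wrong mode.
            self.state = "closed"
            self.error = (
                f"agent reported mode {reported_mode!r}, "
                f"requested {self.requested_mode!r}"
            )
        else:
            self.state = "connected"
```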
Linux desktop console can also report non-fatal runtime warnings. When the readiness contract includes input_available = false, capture is available but keyboard or mouse injection is not; the Host Details flow warns before connect and the resulting session is view-only. When X11/display access itself is missing, desktop stays blocked with an explicit runtime-prerequisite message instead of falling back or pretending the session can start.
Linux rdp / New Session is now conditionally shipped for prepared Tier 1 hosts. The mode stays fail-closed unless org/account policy enables Linux multi-session prep, the host has completed the explicit prep/install workflow, and the agent proves both XRDP session primitives and helper launch readiness. When one of those prerequisites is missing, the platform surfaces explicit blockers:
- linux_multisession_toggle_disabled — org/account policy has not enabled Linux multi-session prep.
- linux_multisession_prep_not_installed — required prep/install workflow has not completed on the host.
- linux_multisession_helper_or_session_unavailable — XRDP helper/session primitives are not ready.
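The mapping from missing prerequisites to blocker codes can be sketched as below. The blocker strings are the ones listed above; the function name and its boolean parameters are assumptions about how the prerequisites might be represented.

```python
def linux_rdp_blockers(policy_enabled, prep_installed,
                       session_primitives_ready, helper_ready):
    """Return the blocker codes that keep Linux rdp / New Session closed."""
    blockers = []
    if not policy_enabled:
        blockers.append("linux_multisession_toggle_disabled")
    if not prep_installed:
        blockers.append("linux_multisession_prep_not_installed")
    # The agent must prove both XRDP session primitives and helper readiness.
    if not (session_primitives_ready and helper_ready):
        blockers.append("linux_multisession_helper_or_session_unavailable")
    return blockers
```

An empty result is the only case in which the mode would be offered; any blocker keeps it fail-closed.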
Sessions auto-close after 8 hours (max duration) or 30 minutes of inactivity.
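The auto-close limits for the three channels can be summarized in one small check. The durations come from this page; the channel keys, the function name, and the choice to treat a limit as reached at exactly the stated duration are assumptions.

```python
from datetime import datetime, timedelta

# (maximum duration, inactivity timeout) per channel, as documented above.
LIMITS = {
    "terminal": (timedelta(hours=8), timedelta(minutes=30)),
    "files": (timedelta(hours=4), timedelta(minutes=15)),
    "desktop": (timedelta(hours=8), timedelta(minutes=30)),
}


def close_reason(channel, started_at, last_activity_at, now):
    """Return why a session should auto-close, or None if it may continue."""
    max_duration, idle_timeout = LIMITS[channel]
    if now - started_at >= max_duration:
        return "max_duration"
    if now - last_activity_at >= idle_timeout:
        return "inactivity"
    return None
```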
Session Recordings
Terminal sessions produce asciicast v2 recordings. Desktop sessions produce binary .cadresdr recordings. Both are accessible via:
Host Tags
Tags are key-value pairs for custom classification.
Add a tag:
Search by tag:
Understanding Host Health
The system provides several health indicators at different levels. Here is what each one means and when to pay attention.
Status (Connectivity)
The status field on each host tells you whether the agent is currently communicating:
| Value | What It Means | Action Needed? |
|---|---|---|
| online | Agent sent a heartbeat within the last 5 minutes | No: operating normally |
| offline | No heartbeat for more than 5 minutes | Yes: check network connectivity, agent process, or host availability |
| maintenance | Host is in a scheduled maintenance window | No: expected downtime |
| warning | Agent reported a warning condition | Investigate: the agent flagged something unusual |
| decommissioned | Host removed from active management | No: intentionally retired |
Health Status (Heartbeat Freshness)
The health_status field gives a finer-grained view of heartbeat freshness:
| Value | What It Means | When to Worry |
|---|---|---|
| healthy | Heartbeat received within 5 minutes | Not at all |
| unhealthy | No heartbeat for 5 minutes to 7 days | Moderate: host may be down or disconnected |
| stale | No heartbeat for over 7 days | High: host is likely decommissioned or permanently unreachable |
| unknown | Agent has never sent a heartbeat | Check if the agent installed and started correctly |
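The heartbeat-freshness thresholds can be expressed as a short classification function. This is a sketch of the documented thresholds, not the platform's code; the function name and the handling of the exact boundary values are assumptions.

```python
from datetime import datetime, timedelta


def health_status(last_heartbeat, now):
    """Classify heartbeat freshness; `last_heartbeat` is None if never seen."""
    if last_heartbeat is None:
        return "unknown"
    age = now - last_heartbeat
    if age <= timedelta(minutes=5):
        return "healthy"
    if age <= timedelta(days=7):
        return "unhealthy"
    return "stale"
```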
Health Score and Tier (Composite Assessment)
Each host has a composite health_score (0–100) that combines five signals:
- Connectivity (25%): How recently the host heartbeated
- Disk health (20%): Free disk space across all drives
- Patch compliance (20%): Whether approved patches have been installed
- Service health (15%): Whether auto-start services are running
- Alert penalty (20%): Active alerts reduce the score (critical alerts deduct more than medium)
The score maps to a health_tier:
- healthy (80–100): Host is in good shape
- degraded (60–79): One or more signals need attention
- critical (0–59): Multiple problems detected — investigate immediately
Maintenance window awareness (G-04): Hosts in an active maintenance window are not penalized for a stale heartbeat. During planned maintenance the connectivity factor is automatically set to 100, so a host that is intentionally offline for patching or updates does not show a degraded health score. The window is auto-detected via core/maintenance_utils.host_is_in_maintenance_window(). For fleet-level batch scoring, a single query via get_hosts_in_maintenance_window() checks all hosts at once to avoid N+1 queries.
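A worked sketch of the composite score, using the documented weights and tier boundaries. It assumes each signal has already been normalized to a 0-100 sub-score (with the alert signal at 100 when no alerts are active); how the platform derives those sub-scores is not described here, and the function names are illustrative.

```python
# Documented weights, as integer percentages.
WEIGHTS = {
    "connectivity": 25,
    "disk": 20,
    "patch": 20,
    "service": 15,
    "alerts": 20,
}


def health_score(signals, in_maintenance=False):
    """Combine 0-100 sub-scores into a 0-100 composite score."""
    values = dict(signals)
    if in_maintenance:
        # A stale heartbeat is not penalized during planned maintenance.
        values["connectivity"] = 100
    return sum(WEIGHTS[name] * values[name] for name in WEIGHTS) / 100


def health_tier(score):
    """Map a composite score to its documented tier."""
    if score >= 80:
        return "healthy"
    if score >= 60:
        return "degraded"
    return "critical"
```

For example, a host whose only problem is a connectivity sub-score of 0 scores 75 (degraded); the same host inside a maintenance window scores 100 (healthy).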
Fleet Health Percentage
The fleet health percentage is a weighted combination of four signals:
- What percentage of agents are online (40% weight)
- How many critical/warning alerts are active (30% weight)
- How many drift events are open (20% weight)
- How many patch deployments have failed (10% weight)
A fleet health below 80% warrants investigation across your managed hosts.
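The weighting can be illustrated with a small calculation. The weights and the 80% threshold come from this section; the assumption that alerts, drift, and patch failures each arrive as a 0-100 sub-score (100 meaning no problems) is mine, since the page does not say how counts are converted.

```python
def fleet_health(online_agents, total_agents,
                 alert_score, drift_score, patch_score):
    """Return the weighted fleet health percentage (0-100)."""
    online_pct = 100.0 * online_agents / total_agents if total_agents else 0.0
    weighted = (40 * online_pct      # agents online
                + 30 * alert_score   # active critical/warning alerts
                + 20 * drift_score   # open drift events
                + 10 * patch_score)  # failed patch deployments
    return weighted / 100


def needs_investigation(fleet_pct):
    """Fleet health below 80% warrants investigation."""
    return fleet_pct < 80
```

With half the agents online and no other issues, the result is 80, right at the threshold; with 4 of 10 online it drops to 76.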
Bulk Operations
The Hosts page supports multi-select for batch operations across multiple hosts. Select hosts using the checkboxes, then choose an action:
Available Bulk Actions:
- Run Script: Execute a saved script on all selected hosts
- Run Command: Execute an ad-hoc command on all selected hosts
- Service Control: Start, stop, restart, enable, or disable a service on selected hosts
- Install Software: Install a package on selected hosts
- Assign Group: Add selected hosts to a host group
- Assign Fingerprint Policy: Apply a fingerprint baseline policy
- Delete Selected: Permanently remove selected hosts (requires hosts.manage permission)
Remote Access Readiness
Operator-visible blocked states include:
- permission denied
- feature disabled for the organization or host
- tunnel disconnected
- capability manifest missing
- unsupported channel, mode, or credential mode
- runtime prerequisite missing
- Linux multi-session toggle/prep/session blockers: linux_multisession_toggle_disabled, linux_multisession_prep_not_installed, linux_multisession_unsupported_distro_package_manager, linux_multisession_helper_or_session_unavailable
When the agent can provide a concrete runtime blocker, the UI now shows that detail directly instead of a generic connection error.
Desktop availability is now driven by the agent-advertised mode/auth matrix inside remote_access_status, not just by backend OS inference. If the agent says a desktop mode or credential mode is unsupported, the Host Details page and the backend both fail closed on that exact path.
For example, when no X11 display is reachable, the runtime blocker message reads: "Linux desktop console requires an accessible X11 display. The agent could not find /tmp/.X11-unix/X0 on this host."
If the display is present but input helpers are unavailable, desktop may remain available in view-only mode. In that case the connection modal warns before connect and the desktop session itself repeats the runtime warning so the operator knows capture works but input injection does not.
XRDP teardown is best-effort. The agent stops the desktop helper and verifies whether the XRDP session still exists, but upstream xrdp-sesadmin kill:sid remains unimplemented, so lingering XRDP sessions should be treated as an operator-visible runtime follow-up rather than a guaranteed automatic cleanup.
If you see the missing X11 display message, use terminal or file browser instead, or start/restore the host’s graphical session before retrying desktop.
Agent Key Rotation
If an agent’s Ed25519 signing key needs to be rotated (compromise, periodic rotation):
This updates the stored public key. The agent must already be using the new key for subsequent requests.
Plan Limit — Agent Registration
If agent registration is blocked by the organization's plan limit:
- Contact the account admin to upgrade the subscription plan
- Or decommission unused hosts to free up capacity
See saas-portal.md for full plan limit details.
PAM Enrollment from Host Detail
Discovered local user accounts can be enrolled into the PAM vault directly from the host detail view:
This creates a PAM identity for the local account and links it to the specified vault (identity group). Requires both hosts.manage and pam_vaults.manage permissions.
Software Reconciliation
View software reconciliation analysis for a host (compare installed software against the software library):
Returns authorized, unauthorized, and untracked software on the host. Requires software.view permission.
On-Demand Compliance Check
Trigger a compliance check for a specific host without waiting for the scheduled scan:
Requires compliance.manage permission. Returns the scan results immediately.
OOB Auto-Detection
When an agent reports host information, the system automatically detects out-of-band management interfaces based on the hardware manufacturer:
- Dell servers get iDRAC assigned
- HP/HPE servers get iLO assigned
- Supermicro servers get assigned
- Lenovo servers get XCC assigned
- Cisco servers get assigned
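A minimal sketch of manufacturer-based detection, assuming the agent reports a free-text manufacturer string and that matching on its first word is good enough. Supermicro and Cisco are left out of the table because this page does not name the interface assigned to them; the function name and matching rule are illustrative.

```python
# Only the mappings this page names explicitly.
OOB_BY_MANUFACTURER = {
    "dell": "iDRAC",
    "hp": "iLO",
    "hpe": "iLO",
    "lenovo": "XCC",
}


def detect_oob(manufacturer):
    """Return the out-of-band interface type for a manufacturer, or None."""
    words = manufacturer.strip().lower().split()
    if not words:
        return None
    # Match on the first word so "Dell Inc." resolves like "Dell".
    return OOB_BY_MANUFACTURER.get(words[0])
```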
Cross-References
| Topic | Document |
|---|---|
| Getting started | getting-started.md |
| Organization management | organization-management.md |
| Roles & permissions | roles-permissions.md |
| Troubleshooting | troubleshooting-core.md |
| Agent migration (legacy to Go) | agent-migration.md |
| Host & agent architecture | docs/architecture/host-agent-management.md |
| Host & agent functional specs | docs/functional/host-agent-management.md |