IT Service & Operations Manual

ITSM

Incidents, problems, changes, service workflows, SLAs, and the operating structure teams use to run support and operations together.

Audience: Service desk and IT operations teams
Focus: Service management workflow
Status: Public manual

Scope

ITSM inside RMM is about making service work operationally legible, not just opening and closing tickets. This public guide keeps the workflow model and removes private endpoint and route detail.

Domain: Incidents, Problems, Changes, Service Catalog, SLA, Escalation, Assignment Groups

See also: Automation Manual | Jobs & Scripts Manual

Incident Management

Creating Incidents

Automatic creation: Incidents are created automatically by:

  • Alert correlation (metric alerts from alert_engine.py)
  • Event alerts (from event_alert_adapter.py)
  • Cross-host storm detection (3+ hosts, same rule, within 5 minutes)
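The storm rule above (3+ hosts, same rule, within 5 minutes) can be illustrated with a sliding window. This is a minimal sketch, not the product's detection code: the function name detect_storms and the (rule_id, host_id, fired_at) tuple shape are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

STORM_MIN_HOSTS = 3                  # "3+ hosts" from the rule above
STORM_WINDOW = timedelta(minutes=5)  # "within 5 minutes"


def detect_storms(alerts):
    """Return the rule IDs that fired on 3+ distinct hosts within 5 minutes.

    `alerts` is an iterable of (rule_id, host_id, fired_at) tuples
    (a hypothetical shape chosen for this sketch).
    """
    by_rule = defaultdict(list)
    for rule_id, host_id, fired_at in alerts:
        by_rule[rule_id].append((fired_at, host_id))

    storms = set()
    for rule_id, events in by_rule.items():
        events.sort()
        start = 0
        for end in range(len(events)):
            # Advance the left edge until the window spans at most 5 minutes.
            while events[end][0] - events[start][0] > STORM_WINDOW:
                start += 1
            hosts = {host for _, host in events[start:end + 1]}
            if len(hosts) >= STORM_MIN_HOSTS:
                storms.add(rule_id)
                break
    return storms
```

Counting distinct hosts, rather than raw alerts, keeps one noisy host from being mistaken for a cross-host storm.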

Manual creation:

Incident Fields

All incidents are automatically enriched with:

  • Category/subcategory: Based on event source or metric type.
  • Impact: Derived from severity, escalated by host count and criticality.
  • Urgency: Mapped from severity.
  • Priority: P1-P4 from the ITIL matrix.
  • SLA: Assigned from host group SLA or org default.
  • Escalation policy: Assigned from org default or account-wide default.

Assigning Incidents

Via the assign endpoint:

You can assign to a user, an IDP user, an assignment group, or any combination. Send all fields as null to unassign.

Incidents can also be assigned:

  • Automatically, by escalation policy stages.
  • By updating fields directly on the incident.

Escalating Incidents

Automatic escalation:

  • SLA breach triggers auto-escalation (increments escalation_level).
  • Escalation policy sweep auto-advances stages on timeout.

Manual escalation:

Acknowledging Escalation Stages

  • Acknowledges the current escalation stage.
  • Prevents auto-advancement for stages with requires_acknowledgment = True.
  • Assigns the acknowledging user.
  • Rejects terminal incidents (resolved or closed).

Bulk Incident Actions

Actions: acknowledge or resolve. The comment is appended to each incident’s notes. The response reports how many incidents were affected against the total number requested.

Incident Notes

Add notes to track investigation progress and communications:

Only the note author or an account admin can edit/delete notes.

Time-tracking and CSAT metadata are stored separately from operator notes and do not appear in this list endpoint.

Incident Time Tracking

Log technician effort directly on incidents:

Example create request:

The incident-detail page now includes a dedicated Time tab with:

  • Logged-entry totals and billable vs non-billable rollups
  • A structured create form for operators with incidents.manage
  • Read-only review for operators who can view incidents but cannot manage them
  • Inline loading, empty, and retryable error states on the tab itself

Incident CSAT

Collect customer feedback after closure:

CSAT submission is allowed only when incident status is resolved or closed. Submitting again from the same actor updates that actor’s previous response.
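The two CSAT rules above (submission only on resolved or closed incidents, and one response per actor) can be sketched as follows. This is an illustrative model under stated assumptions: incidents are plain dicts, the function names are hypothetical, and the rating scale is not given in this guide, so the example simply uses integers.

```python
CSAT_ALLOWED_STATUSES = {"resolved", "closed"}


def submit_csat(incident, actor_id, rating, comment=""):
    """Record or update one actor's CSAT response on an incident dict."""
    if incident["status"] not in CSAT_ALLOWED_STATUSES:
        raise ValueError("CSAT can only be submitted for resolved or closed incidents")
    # Responses are keyed by actor, so a second submission replaces the first.
    responses = incident.setdefault("csat", {})
    responses[actor_id] = {"rating": rating, "comment": comment}
    return responses[actor_id]


def csat_summary(incident):
    """Return the response count and average rating (None when empty)."""
    ratings = [r["rating"] for r in incident.get("csat", {}).values()]
    average = sum(ratings) / len(ratings) if ratings else None
    return {"count": len(ratings), "average": average}
```

Keying responses by actor is one simple way to get the update-on-resubmit behavior without a separate lookup step.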

The incident-detail page now includes a dedicated tab with:

  • Average rating, response-count, and breakdown summary cards
  • Response history directly on the incident
  • A submit/update form that only appears after the incident is resolved or closed
  • Inline loading, empty, and retryable error states on the tab itself

Incident Timeline

Promoting Incidents to Knowledge Base

After resolving an incident, promote the resolution to a KB article:

Creating Emergency Changes from Incidents

When an active incident requires an immediate change:

This creates an auto-approved emergency change record linked to the incident. PIR is mandatory and due within 72 hours. An event alert is fired for the emergency change creation.

Resolving Incidents

  • resolution_code: resolved, workaround, duplicate, not_reproducible, by_design.
  • resolution_notes: Free text explanation.

Re-opening Resolved Incidents (G-08)

Behavior:

  • Transitions resolved -> open.
  • If an SLA definition is attached, re-activates timers (recalculated from reopen time).
  • Appends [Reopened] <notes> to the incident notes if notes are provided.
  • Requires incidents.manage permission.
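The reopen behavior above can be modeled in a few lines. This is a sketch under assumptions, not the backend logic: the incident is a plain dict, the sla field with resolution_minutes is a hypothetical shape, and the permission check is reduced to a boolean.

```python
from datetime import datetime, timedelta


def reopen_incident(incident, now, notes=None, can_manage=True):
    """Apply the reopen rules described above to an incident dict."""
    if not can_manage:
        raise PermissionError("incidents.manage is required")
    if incident["status"] != "resolved":
        raise ValueError("only resolved incidents can be reopened")

    incident["status"] = "open"
    sla = incident.get("sla")  # hypothetical shape: {"resolution_minutes": int}
    if sla:
        # Timers restart from the reopen time, not the original creation time.
        incident["sla_due_at"] = now + timedelta(minutes=sla["resolution_minutes"])
    if notes:
        existing = incident.get("notes", "")
        separator = "\n" if existing else ""
        incident["notes"] = f"{existing}{separator}[Reopened] {notes}"
    return incident
```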

Auto-Closing Stale Resolved Incidents (A-02)

The default ITSM policy auto-closes incidents that have remained resolved for 48 hours with no reopen.

You can trigger or preview policy runs explicitly with:

Optional fields:

  • stale_hours: policy window in hours (1-720).
  • dry_run: returns eligible IDs without mutating incidents.
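The interaction between stale_hours and dry_run can be sketched as below. The function name auto_close_sweep and the incident dict fields (id, status, resolved_at) are assumptions for illustration; only the 48-hour default, the 1-720 range, and the dry-run semantics come from the text above.

```python
from datetime import datetime, timedelta


def auto_close_sweep(incidents, now, stale_hours=48, dry_run=False):
    """Close incidents that have stayed resolved for at least `stale_hours`.

    Returns the IDs of eligible incidents. With dry_run=True nothing is mutated.
    """
    if not 1 <= stale_hours <= 720:
        raise ValueError("stale_hours must be between 1 and 720")
    cutoff = now - timedelta(hours=stale_hours)
    eligible = [
        inc for inc in incidents
        if inc["status"] == "resolved" and inc["resolved_at"] <= cutoff
    ]
    if not dry_run:
        for inc in eligible:
            inc["status"] = "closed"
            inc["closed_at"] = now
    return [inc["id"] for inc in eligible]
```

Running the sweep once with dry_run=True and comparing the returned IDs against expectations is a low-risk way to tune stale_hours before letting the policy mutate anything.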

Dry-run previews stay read-only even if the backend cannot map the current session to a mirrored local audit user; in that case, the system still writes a fallback actor-only audit entry with the Portal IDP metadata instead of blocking the sweep.

Problem Management

Creating Problems

Updating Problems

RCA Process

  1. Submit RCA with:
     • rca_summary: Root cause analysis text (min 10 chars).
     • rca_confidence: Confidence level (0.0-1.0, optional).
     • corrective_plan: List of corrective action items (optional).

  2. Approve: Creates a ChangeRecord for each corrective plan item. Sets is_known_error = True. Transitions to change_created.

  3. Reject: Records review notes. Problem can be amended.

  4. Verify with:
     • verification_notes: Description of verification (min 10 chars).
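The RCA field constraints and the approve step above can be sketched as plain functions. This is a hedged illustration: validate_rca and approve_rca are hypothetical names, the problem is modeled as a dict, the change records are simplified dicts rather than real ChangeRecord objects, and trimming whitespace before the length check is an assumption.

```python
def validate_rca(rca_summary, rca_confidence=None, corrective_plan=None):
    """Return a list of validation errors for an RCA submission (empty if valid)."""
    errors = []
    if len(rca_summary.strip()) < 10:
        errors.append("rca_summary must be at least 10 characters")
    if rca_confidence is not None and not 0.0 <= rca_confidence <= 1.0:
        errors.append("rca_confidence must be between 0.0 and 1.0")
    if corrective_plan is not None and not isinstance(corrective_plan, list):
        errors.append("corrective_plan must be a list")
    return errors


def approve_rca(problem):
    """Approve an RCA: one change per corrective item, mark as known error."""
    changes = [
        {"title": item, "source": "problem"}  # simplified stand-in for ChangeRecord
        for item in problem.get("corrective_plan") or []
    ]
    problem["is_known_error"] = True
    problem["status"] = "change_created"
    return changes
```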

Linking Incidents to Problems

  • Links propagate workaround availability to incidents.
  • Counters are updated automatically (incident_count, affected_host_count).
  • Max 100 incident IDs per request.

Searching the Known Error Database (KEDB)

Assigning Problems to Groups

Change Management

Creating Changes

Standard changes default to auto-approval at creation, but governance policy gates can force CAB review when policy conditions are not met. Normal and emergency changes start in draft.

Frontend: The create form shows a “Governance Template” dropdown when templates exist. Selecting a template displays the maturity level badge, minimum notice hours, and required/optional gate chips below the dropdown.

CAB Approval Process

  1. Submit: sends notifications to users with changes.approve permission.
  2. Review: CAB members review the change plan.
  3. Approve: checks governance gates if a template is set.
  4. Reject: with cab_notes.
  5. Rework (G-04): returns a rejected change to draft.
     • Clears CAB decision, risk score, risk factors, and gate status.
     • Reopens the linked ticket if it was resolved on rejection.
     • After rework edits, re-submit via the relevant workflow.

Implementation

  1. Start: sets actual_start.
  2. Complete:
     • If PIR required: transitions to review.
     • If PIR not required: transitions to closed.

Post-Implementation Review (PIR)

with:

Governance Templates

When a governance template is assigned to a change:

  • Required gates must be completed before approval.
  • Notice hours must be satisfied (default 48 hours).
  • To override notice requirements, set notice_override_reason on the change record.
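The approval preconditions above can be expressed as a single check that lists what is still blocking a change. This is a sketch under assumptions: completed_gates and planned_start are hypothetical field names, and notice is measured here from the approval time to the planned start, which this guide does not confirm. Only notice_override_reason and the 48-hour default come from the text.

```python
from datetime import datetime, timedelta

DEFAULT_NOTICE_HOURS = 48


def approval_blockers(change, now, required_gates, notice_hours=DEFAULT_NOTICE_HOURS):
    """Return human-readable reasons why the change cannot be approved yet."""
    blockers = []
    completed = set(change.get("completed_gates", []))
    for gate in required_gates:
        if gate not in completed:
            blockers.append(f"required gate not completed: {gate}")

    # Assumption: notice is the lead time between now and the planned start.
    notice_given = change["planned_start"] - now
    overridden = bool(change.get("notice_override_reason"))
    if notice_given < timedelta(hours=notice_hours) and not overridden:
        blockers.append(f"less than {notice_hours} hours notice")
    return blockers
```

Returning every blocker at once, instead of failing on the first, lets a requester fix all outstanding gates in one pass.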

For standard changes, governance gates can also carry auto-approval policy rules (for example max risk score or allowed risk levels). If policy rejects auto-approval, the change is created in submitted state and routed to CAB.

Impact Analysis

  • Uses BFS traversal of the HostGroupDependency graph.
  • Returns affected hosts, groups, dependency chains, and risk level.
  • Stores results in impact_summary on the change record.
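A breadth-first traversal of a dependency graph, as described above, can be sketched like this. The adjacency-dict representation and the function name impacted_groups are assumptions for illustration; the real HostGroupDependency model and the risk-level calculation are not shown in this guide.

```python
from collections import deque


def impacted_groups(dependents, changed_group):
    """Breadth-first walk of the groups that depend on `changed_group`.

    `dependents` maps a group to the groups that depend on it directly.
    Returns {group: chain}, where chain is the path from the changed group.
    """
    chains = {changed_group: [changed_group]}
    queue = deque([changed_group])
    while queue:
        group = queue.popleft()
        for dependent in dependents.get(group, []):
            if dependent not in chains:  # visited check, also guards against cycles
                chains[dependent] = chains[group] + [dependent]
                queue.append(dependent)
    del chains[changed_group]  # the changed group itself is not "downstream"
    return chains
```

Because BFS visits nodes in order of distance, each recorded chain is a shortest dependency path from the changed group.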

Risk Auto-Scoring (ITSM-11)

Every change record is automatically scored on a 1-25 risk scale using a 5x5 Impact x Likelihood matrix. The score is computed at creation and recalculated on update.

Impact factors (1-5):

  • Number of affected hosts: 0-1 hosts=1, 2-5=2, 6-20=3, 21-50=4, 51+=5
  • Service tier criticality of host groups: critical=+2, high=+1
  • Business hours (weekday 08:00-18:00 UTC): +1

Likelihood factors (1-5):

  • Change type: standard=1, normal=2, emergency=4
  • Repeat change (same source_type + change_type closed before): -1

Score = Impact x Likelihood, clamped to [1, 25].
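The scoring rules above can be worked through in code. This is a sketch, not the scoring engine itself, and it makes three assumptions that the guide does not state: the tier bonus is taken from the most critical host group rather than summed, each factor is clamped to 1-5 before multiplying, and business hours are evaluated against a UTC start time passed in by the caller.

```python
from datetime import datetime, timezone

LIKELIHOOD_BY_TYPE = {"standard": 1, "normal": 2, "emergency": 4}
TIER_BONUS = {"critical": 2, "high": 1}


def clamp(value, low, high):
    return max(low, min(high, value))


def host_count_points(host_count):
    """Map the affected-host count to the base impact value from the text."""
    for upper_bound, points in ((1, 1), (5, 2), (20, 3), (50, 4)):
        if host_count <= upper_bound:
            return points
    return 5


def risk_score(host_count, tiers, start_utc, change_type, is_repeat):
    """Compute Impact x Likelihood, clamped to [1, 25]."""
    impact = host_count_points(host_count)
    # Assumption: only the highest tier bonus among the host groups applies.
    impact += max((TIER_BONUS.get(tier, 0) for tier in tiers), default=0)
    if start_utc.weekday() < 5 and 8 <= start_utc.hour < 18:
        impact += 1  # weekday business hours, 08:00-18:00 UTC

    likelihood = LIKELIHOOD_BY_TYPE[change_type] - (1 if is_repeat else 0)

    impact = clamp(impact, 1, 5)
    likelihood = clamp(likelihood, 1, 5)
    return clamp(impact * likelihood, 1, 25)
```

For example, an emergency change touching 30 hosts in a critical group on a weekday morning yields impact 4 + 2 + 1 = 7, clamped to 5, times likelihood 4, for a score of 20.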

The response includes:

  • risk_score (integer 1-25)

Change Calendar

  • Returns unified view of scheduled changes, maintenance windows, and change freezes.
  • Color-coded: blue (changes), green (maintenance), red (freezes).
  • Max range: 366 days.

Service Catalog

Setting Up Catalog Items

Supported Field Types

Type     | Description               | Special Properties
---------|---------------------------|--------------------------------------------------
text     | Single-line text input    | -
textarea | Multi-line text area      | -
number   | Numeric input (validated) | -
dropdown | Select menu               | options: array of allowed values
radio    | Radio button group        | options: array of allowed values
checkbox | Boolean toggle            | false is a valid value for required fields
hidden   | Not shown to user         | default_value: auto-injected into form_data
date     | Date picker (YYYY-MM-DD)  | -
email    | Email input (validated)   | -

Each field definition can include:

  • name (string, required): Machine-readable key used in form_data
  • label (string): Display label shown to users
  • type (string): One of the types above
  • required (boolean): Whether the field must be filled
  • default_value (string): Pre-populated value (critical for hidden type)
  • help_text (string): Explanatory text shown below the field
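The field types and properties above imply a validation pass over submitted form_data. The sketch below shows one way that could look; validate_form is a hypothetical helper, the email pattern is deliberately loose, and the exact error messages and server-side rules are assumptions. It does reflect two behaviors stated above: false satisfies a required checkbox, and hidden fields are injected from default_value.

```python
import re
from datetime import datetime

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately loose check


def validate_form(fields, form_data):
    """Validate form_data against field definitions.

    Returns (cleaned_data, errors). Hidden fields always take default_value.
    """
    cleaned, errors = {}, {}
    for field in fields:
        name, ftype = field["name"], field.get("type", "text")
        if ftype == "hidden":
            cleaned[name] = field.get("default_value")
            continue
        value = form_data.get(name, field.get("default_value"))
        if value is None or value == "":
            if field.get("required"):
                errors[name] = "required"
            continue
        if ftype == "number":
            try:
                value = float(value)
            except (TypeError, ValueError):
                errors[name] = "not a number"
                continue
        elif ftype in ("dropdown", "radio") and value not in field.get("options", []):
            errors[name] = "not an allowed option"
            continue
        elif ftype == "checkbox" and not isinstance(value, bool):
            errors[name] = "must be true or false"
            continue
        elif ftype == "date":
            try:
                datetime.strptime(value, "%Y-%m-%d")
            except (TypeError, ValueError):
                errors[name] = "expected YYYY-MM-DD"
                continue
        elif ftype == "email" and not EMAIL_RE.match(str(value)):
            errors[name] = "invalid email"
            continue
        cleaned[name] = value
    return cleaned, errors
```

Ignoring any client-supplied value for hidden fields keeps a requester from overriding data the catalog owner intended to inject.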

Using the Form Builder UI: When creating/editing catalog items, switch to the “Form Fields” tab to add fields visually. Use the type selector, configure options for dropdown/radio fields, and check the “Preview” tab to see the form as users will experience it.

Submitting Service Requests

Processing Requests

  • Approve:
  • Reject: with optional reason

Service Catalog UI Tabs

The Service Catalog page has four tabs:

  1. Browse Catalog — Browse available services grouped by category. Users can search and submit requests.
  2. My Requests — Shows only requests submitted by the current user (requester-scoped). Supports both legacy and IDP-backed identities via the mine=true query parameter. Includes status filter.
  3. Approval Queue — Visible only to users with service_requests.manage. Shows pending approval requests with approve/reject actions.
  4. Manage Items — Visible only to users with service_catalog.manage. Create, edit, and deactivate catalog items.

Permission gating in the request detail modal:

  • Mark Complete button: Visible only to users with service_requests.manage, and only when status is approved.
  • Read-only users see only a Close button.

Escalation Policies

Creating an Escalation Policy

Default Policy Assignment

The default escalation policy for an org (or account-wide) is automatically assigned to new incidents during SLA enrichment. Lookup order:

  1. Org-specific default policy.
  2. Account-wide default policy.

Assignment Groups

Creating an Assignment Group

Adding Members

Roles:

  • member: Standard team member.
  • lead: Receives escalation notifications and can reassign within the group.

Using Assignment Groups

Assignment groups are referenced by:

On-Call Schedule Management

Creating a Schedule

  • name: Schedule name (e.g., “Tier 1 Support On-Call”)
  • handoff_day: 0-6 (Mon-Sun) — only for weekly rotations
  • custom_rotation_hours: hours per shift — only for custom rotations

  • Permission required: incidents.manage

Managing Participants

  • Order determines who is on-call first (index 0 = first rotation slot)

Viewing Who’s On-Call

  • Optional ?at=2026-03-15T14:00:00 to query any point in time
  • Response includes source (“rotation” or “override”) and participant details
  • Query params are start and end (ISO datetime strings)
  • Returns list of blocks with start/end times, assigned user, and source
  • Maximum range: 90 days

Creating Overrides

When someone needs to cover another person’s shift:

  • Overrides always take precedence over the rotation.
  • If multiple overrides overlap, the most recently created one wins.
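The precedence rules for overrides, combined with the ordered participant list (index 0 is the first rotation slot), can be sketched as a single lookup. The function name, the rotation_start and shift_hours parameters, and the override dict fields are assumptions for illustration; the returned source values mirror the "rotation" and "override" labels mentioned for the on-call response.

```python
from datetime import datetime, timedelta


def who_is_on_call(participants, rotation_start, shift_hours, overrides, at):
    """Resolve the on-call user at time `at`. Returns (user, source).

    Overrides beat the rotation; among overlapping overrides the most
    recently created one wins.
    """
    active = [o for o in overrides if o["start"] <= at < o["end"]]
    if active:
        winner = max(active, key=lambda o: o["created_at"])
        return winner["user"], "override"

    # Number of whole shifts elapsed since the rotation began.
    shifts_elapsed = (at - rotation_start) // timedelta(hours=shift_hours)
    slot = shifts_elapsed % len(participants)
    return participants[slot], "rotation"
```

A weekly rotation corresponds to shift_hours=168 in this model; a custom rotation would pass its own shift length.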

Linking to Assignment Groups

  1. When incidents route to that assignment group, the on-call person is automatically determined

Knowledge Base

Creating Articles

Articles start in draft status. Use markdown for content formatting.

Publishing Articles

  1. Create article (draft)

Published articles appear in search results and auto-suggestions.

Searching the Knowledge Base

Auto-Suggestion for Incidents

Article Feedback

Version History

SLA Holidays

Adding Holidays

Recurring holidays automatically repeat annually (same month/day).
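The annual-repeat rule can be illustrated with a small matcher. This is a sketch only: the holiday dict shape (a date plus a recurring flag) and the function name is_holiday are assumptions, and edge cases such as a recurring 29 February are not addressed in this guide.

```python
from datetime import date


def is_holiday(day, holidays):
    """Return True if `day` matches a one-off or recurring holiday.

    Each holiday is a dict with `date` (datetime.date) and `recurring` (bool).
    """
    for holiday in holidays:
        if holiday["recurring"]:
            # Recurring holidays match on month and day, in any year.
            if (holiday["date"].month, holiday["date"].day) == (day.month, day.day):
                return True
        elif holiday["date"] == day:
            return True
    return False
```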

Managing Holidays

Cross-References

Related Domain         | Manual
-----------------------|----------------
Automation / Workflows | automation.md
Jobs & Scripts         | jobs-scripts.md

2026-04-12 Batch 06 Remediation Update

  • Problem-management mutations follow one permission contract: problems.manage.
  • Added incident time-entry APIs and report endpoint for technician effort tracking.
  • Added incident CSAT submit/summary/report APIs for post-resolution feedback.
  • Added requester-scoped customer portal request list/detail/summary endpoints.
  • Added SLA compliance reporting endpoint for operational KPI visibility.
  • Added on-demand incident-to-known-error auto-match endpoint and documented auto-match behavior.
  • Standard-change auto-approval is now policy-aware and can route to CAB review when governance policy requires it.