Alerts

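Use this skill when you need to work with Atera alerts: viewing, acknowledging, resolving, or managing alerts from monitored devices. It covers alert types, severity levels, alert sources, and automatic alert-to-ticket conversion, and is essential for MSPs running monitoring operations through Atera.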

SKILL.md
--- frontmatter
description: >
  Use this skill when working with Atera alerts - viewing, acknowledging,
  resolving, or managing alerts from monitored devices. Covers alert types,
  severity levels, alert sources, and alert-to-ticket conversion.
  Essential for MSP monitoring operations through Atera.
triggers:
  - atera alert
  - rmm alert
  - monitoring alert
  - alert severity
  - alert acknowledge
  - alert resolve
  - device alert
  - threshold alert
  - atera monitoring

Atera Alert Management

Overview

Alerts in Atera are notifications generated when monitored systems exceed defined thresholds or encounter issues. They serve as the early warning system for MSPs, enabling proactive response to client issues before they become critical problems.

Alert Severity Levels

| Severity | Description | Typical Response |
|----------|-------------|------------------|
| Critical | Immediate action required | Respond within 15 minutes |
| Warning | Attention needed soon | Respond within 1 hour |
| Information | FYI, no action required | Review during normal hours |
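
The response targets above can be turned into concrete deadlines. The following is a minimal sketch; `RESPONSE_TARGETS` and `response_deadline` are illustrative names, not part of Atera, and the targets are simply the values from the table.

```python
from datetime import datetime, timedelta, timezone

# Response targets copied from the severity table; None means no fixed deadline.
RESPONSE_TARGETS = {
    "Critical": timedelta(minutes=15),
    "Warning": timedelta(hours=1),
    "Information": None,  # reviewed during normal hours
}

def response_deadline(severity, created):
    """Return the time by which a first response is due, or None if no target applies."""
    target = RESPONSE_TARGETS[severity]  # raises KeyError for an unknown severity
    return None if target is None else created + target

created = datetime(2024, 2, 15, 10, 30, tzinfo=timezone.utc)
print(response_deadline("Critical", created))  # 2024-02-15 10:45:00+00:00
```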

Alert Sources

| Source | Description |
|--------|-------------|
| Agent | Alerts from RMM agent monitoring |
| Device | Alerts from HTTP/SNMP/TCP monitors |
| Threshold | Alerts when metrics exceed limits |
| Custom | User-defined or API-created alerts |

Alert Types

| Type | Description | Common Triggers |
|------|-------------|-----------------|
| Availability | Device/service up/down | Agent offline, ping failure |
| Performance | Resource utilization | High CPU, low memory, disk full |
| Hardware | Physical component issues | SMART errors, temperature |
| Security | Security-related events | Failed logins, malware detected |
| Application | Software issues | Service stopped, event log errors |
| Patch | Update status | Missing patches, update failures |
| Backup | Backup status | Backup failed, missed schedule |

Alert Fields

Core Fields

| Field | Type | Description |
|-------|------|-------------|
| AlertID | int | Unique alert identifier |
| Code | int | Alert type code |
| Source | string | Alert source (Agent, Device, etc.) |
| Title | string | Alert title/summary |
| Severity | string | Critical, Warning, Information |
| Created | datetime | When alert was generated |
| SnoozedEndDate | datetime | Snooze expiration (if snoozed) |
| DeviceGuid | string | Associated device GUID |
| AdditionalInfo | string | Extra context/details |
| Archived | boolean | Whether alert is archived |
| AlertCategoryID | string | Category classification |
| ArchivedDate | datetime | When alert was archived |
| TicketID | int | Linked ticket (if converted) |
| AlertMessage | string | Detailed alert message |
| FolderID | int | Folder/group reference |

Device/Customer Fields

| Field | Type | Description |
|-------|------|-------------|
| CustomerID | int | Associated customer ID |
| CustomerName | string | Customer display name |
| DeviceName | string | Device hostname |

API Patterns

List All Alerts (Paginated)

http
GET /api/v3/alerts?page=1&itemsInPage=50
X-API-KEY: {api_key}

Response:

json
{
  "items": [
    {
      "AlertID": 111111,
      "Code": 205,
      "Source": "Agent",
      "Title": "High CPU Usage",
      "Severity": "Warning",
      "Created": "2024-02-15T10:30:00Z",
      "CustomerID": 12345,
      "CustomerName": "Acme Corporation",
      "DeviceName": "SERVER-DC01",
      "DeviceGuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "AlertMessage": "CPU usage exceeded 90% for 5 minutes",
      "Archived": false,
      "TicketID": null
    }
  ],
  "totalItems": 250,
  "page": 1,
  "itemsInPage": 50,
  "totalPages": 5
}
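
To read every alert rather than a single page, a client can keep requesting pages until `page` reaches `totalPages`. The sketch below assumes the response shape shown above. `BASE_URL`, `fetch_page`, and `iter_alerts` are illustrative names, and the host is an assumption because this document gives only the request path.

```python
import json
import urllib.request

BASE_URL = "https://app.atera.com"  # assumption: host is not stated in this document

def fetch_page(page, items_in_page, api_key):
    """Fetch one page of alerts over HTTP (this performs a network call)."""
    url = f"{BASE_URL}/api/v3/alerts?page={page}&itemsInPage={items_in_page}"
    request = urllib.request.Request(url, headers={"X-API-KEY": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)

def iter_alerts(get_page):
    """Yield every alert, using the totalPages field of each response to stop."""
    page = 1
    while True:
        body = get_page(page)
        yield from body.get("items", [])
        if page >= body.get("totalPages", 1):
            break
        page += 1

# Usage (needs a real key and network access):
#   for alert in iter_alerts(lambda p: fetch_page(p, 50, api_key)):
#       print(alert["AlertID"], alert["Title"])
```

Passing the page fetcher in as a callable keeps the loop independent of the HTTP layer, so it can be exercised with canned responses.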

Get Alert by ID

http
GET /api/v3/alerts/{alertId}
X-API-KEY: {api_key}

Response:

json
{
  "AlertID": 111111,
  "Code": 205,
  "Source": "Agent",
  "Title": "High CPU Usage",
  "Severity": "Warning",
  "Created": "2024-02-15T10:30:00Z",
  "CustomerID": 12345,
  "CustomerName": "Acme Corporation",
  "DeviceName": "SERVER-DC01",
  "DeviceGuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "AlertMessage": "CPU usage has exceeded 90% threshold\n\nCurrent Value: 95%\nThreshold: 90%\nDuration: 5 minutes",
  "AdditionalInfo": "Process: sqlservr.exe consuming 85% CPU",
  "Archived": false,
  "ArchivedDate": null,
  "TicketID": null,
  "SnoozedEndDate": null,
  "AlertCategoryID": "performance"
}
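
A response like the one above can be loaded into a typed record before further processing. This is a sketch only: the `Alert` class and `parse_timestamp` helper are hypothetical names, and treating every field except `AlertID`, `Title`, and `Severity` as optional is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as 2024-02-15T10:30:00Z; None passes through."""
    if value is None:
        return None
    # Older Python versions reject a trailing "Z", so normalize it to an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

@dataclass
class Alert:
    alert_id: int
    title: str
    severity: str
    created: Optional[datetime]
    device_guid: Optional[str]
    archived: bool
    ticket_id: Optional[int]
    snoozed_until: Optional[datetime]

    @classmethod
    def from_api(cls, data):
        """Build an Alert from a response body using the field names shown above."""
        return cls(
            alert_id=data["AlertID"],
            title=data["Title"],
            severity=data["Severity"],
            created=parse_timestamp(data.get("Created")),
            device_guid=data.get("DeviceGuid"),
            archived=bool(data.get("Archived", False)),
            ticket_id=data.get("TicketID"),
            snoozed_until=parse_timestamp(data.get("SnoozedEndDate")),
        )
```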

Create Alert (API-Generated)

http
POST /api/v3/alerts
X-API-KEY: {api_key}
Content-Type: application/json

json
{
  "DeviceGuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "Title": "Custom Alert - Database Connection Pool Exhausted",
  "Severity": "Critical",
  "AlertMessage": "Application database connection pool at 100% capacity. New connections being rejected.",
  "AlertCategoryID": "application"
}

Response:

json
{
  "ActionID": 111112,
  "AlertID": 111112
}
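
Validating the request body before sending it avoids the "Invalid severity" and "Missing required field" errors listed under Error Handling. The sketch below builds the body shown above; `build_alert_payload` is a hypothetical helper, and which fields the API actually requires is an assumption.

```python
VALID_SEVERITIES = ("Critical", "Warning", "Information")

def build_alert_payload(device_guid, title, severity, message, category_id=None):
    """Return a JSON-serializable body for POST /api/v3/alerts."""
    if severity not in VALID_SEVERITIES:
        raise ValueError(f"severity must be one of {VALID_SEVERITIES}, got {severity!r}")
    # Assumption: DeviceGuid and Title are treated as required here.
    if not device_guid or not title:
        raise ValueError("device_guid and title are required")
    payload = {
        "DeviceGuid": device_guid,
        "Title": title,
        "Severity": severity,
        "AlertMessage": message,
    }
    if category_id is not None:
        payload["AlertCategoryID"] = category_id
    return payload
```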

Delete/Resolve Alert

http
DELETE /api/v3/alerts/{alertId}
X-API-KEY: {api_key}

Response:

json
{
  "ActionID": 111111,
  "Success": true
}

Alert to Ticket Conversion

Workflow

  1. Review alert - Understand the issue
  2. Create ticket - Link to alert for context
  3. Assign technician - Route for resolution
  4. Resolve issue - Fix the underlying problem
  5. Close alert - Delete or archive when resolved

Create Ticket from Alert

http
POST /api/v3/tickets
X-API-KEY: {api_key}
Content-Type: application/json

json
{
  "TicketTitle": "Alert: High CPU Usage on SERVER-DC01",
  "Description": "Alert ID: 111111\n\nCPU usage has exceeded 90% threshold\n\nCurrent Value: 95%\nThreshold: 90%\nDuration: 5 minutes\n\nProcess: sqlservr.exe consuming 85% CPU",
  "CustomerID": 12345,
  "TicketPriority": "High",
  "TicketType": "Problem"
}
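
The ticket body above can be derived from an alert record instead of being written by hand. In this sketch `ticket_from_alert` and `PRIORITY_BY_SEVERITY` are hypothetical names. Only the Warning to High mapping comes from the example above; the other priority values are assumptions.

```python
# Assumption: only "Warning" -> "High" is taken from the example; the rest is illustrative.
PRIORITY_BY_SEVERITY = {"Critical": "Critical", "Warning": "High", "Information": "Low"}

def ticket_from_alert(alert):
    """Build a POST /api/v3/tickets body from an alert dictionary."""
    parts = [f"Alert ID: {alert['AlertID']}", alert.get("AlertMessage", "")]
    if alert.get("AdditionalInfo"):
        parts.append(alert["AdditionalInfo"])
    return {
        "TicketTitle": f"Alert: {alert['Title']} on {alert['DeviceName']}",
        # Blank lines separate the alert ID, the message, and any extra context.
        "Description": "\n\n".join(part for part in parts if part),
        "CustomerID": alert["CustomerID"],
        "TicketPriority": PRIORITY_BY_SEVERITY.get(alert["Severity"], "Medium"),
        "TicketType": "Problem",
    }
```

Keeping the alert ID in the description preserves the link back to the alert even if the ticket is later edited.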

Alert Categorization

Performance Alerts

| Alert | Threshold | Severity | Action |
|-------|-----------|----------|--------|
| High CPU | > 90% for 5 min | Warning | Investigate processes |
| High Memory | > 95% | Warning | Check for leaks |
| Disk Space Low | < 10% free | Critical | Clean or expand |
| Disk Space Warning | < 20% free | Warning | Plan cleanup |
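
The thresholds in this table can be expressed as a small rule check. The sketch below is illustrative and not how Atera evaluates metrics internally; `evaluate_performance` is a hypothetical function, and it assumes one CPU sample per minute so that five consecutive samples stand for "5 min".

```python
def evaluate_performance(cpu_samples, memory_percent, disk_free_percent):
    """Return (alert name, severity) pairs for the thresholds in the table above."""
    findings = []
    recent = cpu_samples[-5:]  # assumption: one sample per minute
    if len(recent) == 5 and all(value > 90 for value in recent):
        findings.append(("High CPU", "Warning"))
    if memory_percent > 95:
        findings.append(("High Memory", "Warning"))
    # Check the stricter disk threshold first so only one disk alert is raised.
    if disk_free_percent < 10:
        findings.append(("Disk Space Low", "Critical"))
    elif disk_free_percent < 20:
        findings.append(("Disk Space Warning", "Warning"))
    return findings

print(evaluate_performance([95, 96, 97, 98, 99], 50, 15))
# [('High CPU', 'Warning'), ('Disk Space Warning', 'Warning')]
```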

Availability Alerts

| Alert | Condition | Severity | Action |
|-------|-----------|----------|--------|
| Agent Offline | No heartbeat 10 min | Critical | Check connectivity |
| Service Stopped | Critical service down | Critical | Restart service |
| Ping Failure | Host unreachable | Critical | Check network |

Security Alerts

| Alert | Trigger | Severity | Action |
|-------|---------|----------|--------|
| Failed Logins | > 5 failures | Warning | Investigate |
| Malware Detected | AV detection | Critical | Quarantine |
| Firewall Disabled | Windows Firewall off | Warning | Re-enable |

Common Workflows

Alert Triage Process

  1. Review new alerts - Sort by severity
  2. Assess impact - Determine business effect
  3. Prioritize response - Critical first
  4. Take action - Resolve or escalate
  5. Document - Create ticket if needed
  6. Close - Delete resolved alerts
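
The first and third triage steps amount to an ordering over open alerts. The following sketch uses the alert fields described earlier; `triage_order` and `SEVERITY_RANK` are hypothetical names, and putting the oldest alert first within a severity is an assumption.

```python
SEVERITY_RANK = {"Critical": 0, "Warning": 1, "Information": 2}

def triage_order(alerts):
    """Return non-archived alerts, most severe first, oldest first within a severity."""
    open_alerts = [alert for alert in alerts if not alert.get("Archived")]
    # ISO 8601 strings in the same UTC format sort chronologically as plain text.
    return sorted(
        open_alerts,
        key=lambda alert: (
            SEVERITY_RANK.get(alert["Severity"], len(SEVERITY_RANK)),
            alert["Created"],
        ),
    )
```

Alerts with an unrecognized severity sort last rather than raising an error, so a new severity value would not break the triage queue.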

Alert Suppression

When an alert is expected (maintenance, known issue):

  1. Snooze alert - Temporarily suppress
  2. Set duration - Define snooze period
  3. Document reason - Note why suppressed
  4. Review afterward - Verify the issue is resolved
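
When processing alerts in a script, snoozed alerts can be skipped by checking the `SnoozedEndDate` field. This is a sketch: `is_snoozed` is a hypothetical helper, and treating a timestamp without an offset as UTC is an assumption.

```python
from datetime import datetime, timezone

def is_snoozed(alert, now=None):
    """Return True while the alert's snooze period has not yet expired."""
    end = alert.get("SnoozedEndDate")
    if end is None:
        return False
    snoozed_until = datetime.fromisoformat(end.replace("Z", "+00:00"))
    if snoozed_until.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        snoozed_until = snoozed_until.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return snoozed_until > now
```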

Bulk Alert Management

For multiple alerts from the same issue:

  1. Identify root cause - Find common source
  2. Create single ticket - Link all related alerts
  3. Resolve root cause - Fix underlying issue
  4. Bulk delete - Remove all related alerts
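
A first pass at finding alerts that share a root cause is to group them by device and title. This is a heuristic sketch, not an Atera feature: `group_related_alerts` is a hypothetical name and the grouping key is an assumption.

```python
from collections import defaultdict

def group_related_alerts(alerts):
    """Map (DeviceGuid, Title) to the list of alert IDs sharing that pair."""
    groups = defaultdict(list)
    for alert in alerts:
        key = (alert.get("DeviceGuid"), alert.get("Title"))
        groups[key].append(alert["AlertID"])
    return dict(groups)
```

Each group's ID list can then be referenced in a single ticket and used to drive the per-alert `DELETE` calls once the root cause is fixed.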

Error Handling

Common API Errors

| Code | Message | Resolution |
|------|---------|------------|
| 400 | Invalid alert ID | Verify alert exists |
| 401 | Unauthorized | Check API key |
| 403 | Forbidden | Verify permissions |
| 404 | Alert not found | Confirm alert ID |
| 429 | Rate limited | Wait and retry (700 req/min) |
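
The "wait and retry" resolution for 429 responses can be sketched as exponential backoff. `call_with_retry` is a hypothetical helper; it assumes the caller supplies a `send` function returning a `(status_code, body)` pair, and the delay values are illustrative rather than taken from Atera's documentation.

```python
import time

def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429 or attempts run out."""
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Double the wait after each rate-limited attempt: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** attempt)
    return status, body  # still rate limited after the final attempt
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.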

Alert Processing Errors

| Error | Cause | Resolution |
|-------|-------|------------|
| Device not found | Invalid DeviceGuid | Verify device exists |
| Invalid severity | Typo in severity | Use Critical, Warning, Information |
| Missing required field | Incomplete request | Add required fields |

Best Practices

  1. Respond quickly to critical alerts - Time is critical
  2. Set appropriate thresholds - Avoid alert fatigue
  3. Review alert patterns - Identify recurring issues
  4. Convert to tickets - Track resolution formally
  5. Document resolutions - Build knowledge base
  6. Tune alert profiles - Reduce false positives
  7. Use severity appropriately - Reserve Critical for urgent issues
  8. Archive rather than delete - Maintain history for trends

Alert Monitoring Dashboard

Key Metrics to Track

| Metric | Purpose |
|--------|---------|
| Open alerts by severity | Current workload |
| Alert volume trend | Identify patterns |
| Mean time to acknowledge | Response efficiency |
| Mean time to resolve | Resolution efficiency |
| Top alerting devices | Problem systems |
| Alert to ticket ratio | Conversion rate |
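
Several of these metrics can be computed directly from a list of alert records. The sketch below covers three of them; `dashboard_metrics` is a hypothetical function, and defining the alert to ticket ratio as the share of alerts with a non-null `TicketID` is an assumption.

```python
from collections import Counter

def dashboard_metrics(alerts, top_n=3):
    """Summarize open alerts by severity, top alerting devices, and ticket conversion."""
    open_alerts = [alert for alert in alerts if not alert.get("Archived")]
    with_ticket = sum(1 for alert in alerts if alert.get("TicketID") is not None)
    return {
        "open_by_severity": dict(Counter(alert["Severity"] for alert in open_alerts)),
        "top_devices": Counter(alert["DeviceName"] for alert in alerts).most_common(top_n),
        # Guard against division by zero when there are no alerts at all.
        "alert_to_ticket_ratio": with_ticket / len(alerts) if alerts else 0.0,
    }
```

The time-based metrics (mean time to acknowledge and to resolve) need acknowledgment and resolution timestamps, which are not among the fields listed in this document.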

Related Skills