sla-sli-tracker

提供服务可靠性保障的核心能力,通过定义可用性、延迟、错误率等关键指标,设定目标值并持续追踪实际表现,支持误差预算计算与服务水平协议的制定,帮助团队量化评估和主动优化系统稳定性。

快捷安装

在终端运行此命令,即可一键安装该 Skill 到您的 Claude 中

npx skills add jeremylongshore/claude-code-plugins-plus-skills --skill "sla-sli-tracker"

Overview

This skill provides a structured approach to defining and tracking SLAs, SLIs, and SLOs, which are essential for ensuring service reliability. It automates the process of setting performance targets and monitoring actual performance, enabling proactive identification and resolution of potential issues.

How It Works

  1. SLI Definition: The skill guides the user to define Service Level Indicators (SLIs) such as availability, latency, error rate, and throughput.
  2. SLO Target Setting: The skill assists in setting Service Level Objectives (SLOs) by establishing target values for the defined SLIs (e.g., 99.9% availability).
  3. SLA Establishment: The skill helps in formalizing Service Level Agreements (SLAs), which are customer-facing commitments based on the defined SLOs.

When to Use This Skill

This skill activates when you need to:

  • Define SLAs, SLIs, and SLOs for a service.
  • Track service performance against defined objectives.
  • Calculate error budgets based on SLOs.

Examples

Example 1: Defining SLOs for a New Service

User request: “Create SLOs for our new payment processing service.”

The skill will:

  1. Prompt the user to define SLIs (e.g., latency, error rate).
  2. Assist in setting target values for each SLI (e.g., p99 latency < 100ms, error rate < 0.01%).

Example 2: Tracking Availability

User request: “Track the availability SLI for the database service.”

The skill will:

  1. Guide the user in setting up the tracking of the availability SLI.
  2. Visualize availability performance against the defined SLO.

Best Practices

  • Granularity: Define SLIs that are specific and measurable.
  • Realism: Set SLOs that are challenging but achievable.
  • Alignment: Ensure SLAs align with the defined SLOs and business requirements.

Integration

This skill can be integrated with monitoring tools to automatically collect SLI data and track performance against SLOs. It can also be used in conjunction with alerting systems to trigger notifications when SLO violations occur.