Site Reliability Engineering

A Site Reliability Engineering (SRE) discipline agent for reliability, monitoring, and incident response.

Installation

First, install the Han CLI tool:

npm install -g @thebushidocollective/han

Then install the plugin:

han plugin install do-site-reliability-engineering

Overview

Discipline agent for site reliability engineering practices.

The SRE discipline embodies the principles of reliability, scalability, and operational excellence. This agent helps apply SRE practices to infrastructure and application management.

Agent

sre

An expert in site reliability engineering who helps with:

  • Designing reliable systems
  • Implementing monitoring and observability
  • Incident response and postmortems
  • Capacity planning and performance optimization
  • SLIs, SLOs, and error budgets
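
To make the last item concrete, here is a minimal sketch of how an error budget follows from an SLO. It is an illustration of the general concept, not code from the plugin: the function name `error_budget`, the 99.9% target, and the request counts are all assumptions chosen for the example.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize error-budget consumption for one measurement window.

    slo_target is the fraction of requests that must succeed (e.g. 0.999).
    All names here are hypothetical; they do not come from the plugin.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")

    # SLI: the measured fraction of successful requests.
    sli = (total_requests - failed_requests) / total_requests

    # Error budget: the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests

    # Fraction of the budget still unspent (negative once the SLO is breached).
    budget_remaining = 1.0 - failed_requests / allowed_failures

    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "budget_remaining": budget_remaining,
    }


# Example with assumed numbers: a 99.9% SLO over one million requests.
report = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
print(report)
```

With these assumed inputs the SLO tolerates about 1,000 failed requests, so 250 failures leave roughly three quarters of the budget unspent. A negative `budget_remaining` would indicate that the SLO has been missed for the window.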

Skills Included

  • sre-monitoring: Monitoring, observability, and alerting practices
  • sre-incident-response: Incident management and postmortem processes
  • sre-reliability: Building reliable and scalable systems

When to Use

Use the SRE agent when:

  • Designing system reliability practices
  • Setting up monitoring and alerting
  • Responding to incidents
  • Conducting postmortems
  • Defining SLOs and error budgets
  • Planning capacity and performance
  • Implementing on-call practices
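
As one illustration of how SLOs and alerting connect, the sketch below shows a burn-rate check of the kind often used to decide when to page on-call. It is a generic example under stated assumptions, not the plugin's behavior: the function names, the two-window design, and the 14.4 threshold (a commonly cited value for a one-hour window against a 30-day SLO) are all assumptions.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how fast the error budget is being spent.

    A burn rate of 1.0 means errors arrive at exactly the rate the SLO
    allows; higher values exhaust the budget proportionally sooner.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if not 0.0 <= error_ratio <= 1.0:
        raise ValueError("error_ratio must be between 0 and 1")
    return error_ratio / (1.0 - slo_target)


def should_page(
    long_window_error_ratio: float,
    short_window_error_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # assumed threshold, tune for your own SLO window
) -> bool:
    """Page only when both windows burn faster than the threshold.

    The long window shows the problem is significant; the short window
    shows it is still happening, which avoids paging for resolved spikes.
    """
    return (
        burn_rate(long_window_error_ratio, slo_target) >= threshold
        and burn_rate(short_window_error_ratio, slo_target) >= threshold
    )


# Hypothetical readings for a 99.9% SLO: 2% errors over the last hour
# and 3% over the last five minutes.
print(should_page(0.02, 0.03, slo_target=0.999))
```

Requiring both windows to exceed the threshold is a design choice that trades a little detection speed for fewer false pages; a single-window check would be simpler but noisier.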
