🛤️
Site Reliability Engineering
Site Reliability Engineering discipline agent for reliability, monitoring, and incident response
Installation
First, install the Han CLI tool:
npm install -g @thebushidocollective/hanThen install the plugin:
han plugin install do-site-reliability-engineeringOverview
Discipline agent for site reliability engineering practices.
Overview
The SRE discipline embodies the principles of reliability, scalability, and operational excellence. This agent helps apply SRE practices to infrastructure and application management.
Agent
sre
An expert in site reliability engineering who helps with:
- Designing reliable systems
- Implementing monitoring and observability
- Incident response and postmortems
- Capacity planning and performance optimization
- SLIs, SLOs, and error budgets
Skills Included
- sre-monitoring: Monitoring, observability, and alerting practices
- sre-incident-response: Incident management and postmortem processes
- sre-reliability: Building reliable and scalable systems
When to Use
Use the SRE agent when:
- Designing system reliability practices
- Setting up monitoring and alerting
- Responding to incidents
- Conducting postmortems
- Defining SLOs and error budgets
- Planning capacity and performance
- Implementing on-call practices
Agents
Skills
📖
sre-incident-response
Use when responding to production incidents following SRE principles and best practices.
📖
sre-monitoring-and-observability
Use when building comprehensive monitoring and observability systems.
📖
sre-reliability-engineering
Use when building reliable and scalable distributed systems.