Overview
Discipline agent for site reliability engineering practices.
Overview
The SRE discipline embodies the principles of reliability, scalability, and operational excellence. This agent helps apply SRE practices to infrastructure and application management.
Agent
sre
An expert in site reliability engineering who helps with:
- Designing reliable systems
- Implementing monitoring and observability
- Incident response and postmortems
- Capacity planning and performance optimization
- SLIs, SLOs, and error budgets
Skills Included
- sre-monitoring: Monitoring, observability, and alerting practices
- sre-incident-response: Incident management and postmortem processes
- sre-reliability: Building reliable and scalable systems
When to Use
Use the SRE agent when:
- Designing system reliability practices
- Setting up monitoring and alerting
- Responding to incidents
- Conducting postmortems
- Defining SLOs and error budgets
- Planning capacity and performance
- Implementing on-call practices
Installation
Install with npx (no installation required):
han plugin install do-site-reliability-engineering