o11y-metrics-prometheus-ops
Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.
create-hat-collection
Generates new Ralph hat collection presets through guided conversation. Asks clarifying questions, validates against schema constraints, and outputs production-ready YAML files.
logging-observability
Logging and observability best practices — structured logging, log levels, correlation IDs, metrics, tracing, and alerting. Reference when implementing logging or monitoring.
grafana-prometheus
Observability and monitoring with Prometheus metrics and Grafana dashboards.
l10n-migration
Migrate from mcamara/laravel-localization to goodcat/laravel-l10n.
system-management
Use when designing observability, testing, debugging, and operational control for messaging systems based on Enterprise Integration Patterns (Hohpe & Woolf). USE FOR: messaging observability, wire tap, control bus, message history, message store, monitoring messaging systems, testing message flows, debugging async systems. DO NOT USE FOR: message routing (use message-routing), consumer patterns (use messaging-endpoints).
communitytoolkit-guard
USE FOR: Writing concise, consistent guard clauses using CommunityToolkit.Diagnostics for argument validation, null checks, range checks, and string validation in constructors, methods, and factory methods. DO NOT USE FOR: Complex business rule validation (use FluentValidation or Peasy), user-facing form validation with error messages (use DataAnnotations), or replacing domain-level parsing/validation (use Parse Don't Validate pattern).
logging-monitoring
Use when designing security logging, monitoring, and incident detection capabilities. Covers SIEM architecture, audit trail requirements, security event correlation, and compliance logging for GDPR, PCI DSS, HIPAA, and SOX. USE FOR: SIEM, security logging, audit trails, security monitoring, incident detection, log aggregation, security event correlation, compliance logging, intrusion detection. DO NOT USE FOR: application performance monitoring (use observability skills), general logging frameworks (use logging skills), incident response procedures (use secure-sdlc).
kubernetes
Kubernetes container orchestration with Helm, operators, and service mesh. Use for cluster management.
prometheus
Prometheus monitoring and alerting with PromQL. Use for metrics collection.
beeline-migration
Step-by-step guide for migrating from Honeycomb Beelines (End of Life) to OpenTelemetry instrumentation. Trigger phrases: "migrate from Beelines", "upgrade from Beeline to OpenTelemetry", "migrate to OTel", "replace Beelines", "Beeline end of life", "Beeline EOL", "switch from Beeline to OTel", "migrate Go Beeline", "migrate Python Beeline", "migrate Node Beeline", "migrate Java Beeline", "migrate Ruby Beeline", "W3C trace headers", "W3C propagation", "incremental migration to OpenTelemetry", or any request about migrating from Honeycomb Beelines to OpenTelemetry SDKs.
otel-instrumentation
Provides guidance on OpenTelemetry SDK setup, custom instrumentation, and sending data to Honeycomb. Trigger phrases: "instrument my app", "add tracing", "set up OpenTelemetry", "configure OTel", "add custom spans", "add attributes to spans", "send traces to Honeycomb", "set up OTLP", "configure sampling", "add span events", "add span links", "set up tracing for [any language]", "configure the OTel Collector", or any request about OpenTelemetry SDK setup, custom instrumentation, or sending data to Honeycomb.
prd-v05-risk-discovery-interview
Surface risks through guided questioning, helping users consider pivots, constraints, and prioritization during PRD v0.5 Red Team Review. Triggers on requests to identify risks, stress-test the idea, perform red team review, or when user asks "what could go wrong?", "identify risks", "red team", "risk assessment", "challenge assumptions", "stress test the idea". Consumes all prior IDs (CFD-, BR-, FEA-, PER-, UJ-, SCR-) as interview context. Outputs RISK- entries with owner decisions and mitigations. Feeds v0.5 Technical Stack Selection.
kpi-dashboard-design
Design effective KPI dashboards with metrics selection, visualization best practices, and real-time monitoring patterns. Use when building business dashboards, selecting metrics, or designing data visualization layouts.
github-agile
Diagnose GitHub-driven agile workflow problems and guide feature branch development.
reliability-strategy-builder
Implements reliability patterns including circuit breakers, retries, fallbacks, bulkheads, and SLO definitions. Provides failure mode analysis and incident response plans. Use for "SRE", "reliability", "resilience", or "failure handling".
pr-commit-workflow
Use when creating commits or pull requests. Enforces a human-written PR structure, intent capture, and evidence in agentic workflows.
sla-monitor-generator
Generate SLA/SLO/SLI monitoring configurations for reliability tracking and error budget management. Activates for SLO setup, reliability targets, and error budget configuration.