capture-repro
Internal specialist skill for establishing reproducible capture baselines and anchors. Use when `rdc-debugger` dispatches capture-repro work.
Instrument any external AI agent with OpenTelemetry to send traces, logs, and metrics to DataRobot for monitoring, observability, and governance.
Use when setting up monitoring systems, logging, metrics, tracing, or alerting. Invoke for dashboards, Prometheus/Grafana, load testing, profiling, capacity planning.
Set up Loki or ELK log aggregation for K8s workloads — structured logging, log routing, and log-based alerting.
Implement distributed tracing with OpenTelemetry, Tempo/Jaeger — instrumentation, sampling, and trace-to-log correlation.
Write production-quality Prometheus alert rules, recording rules, and Alertmanager routing configs.
Implement structured logging, distributed tracing, and metrics for production-ready backend services.
Define and track SLAs, SLIs, and SLOs for service reliability including availability, latency, and error rates. Use when establishing reliability targets or monitoring service health. Trigger with phrases like "define SLOs", "track SLI metrics", or "calculate error budget".
Implement Real User Monitoring (RUM) to capture actual user performance data including Core Web Vitals and page load times. Use when setting up user experience monitoring or tracking custom performance events. Trigger with phrases like "setup RUM", "track Core Web Vitals", or "monitor real user performance".
Set up synthetic monitoring for proactive performance tracking including uptime checks, transaction monitoring, and API health. Use when implementing availability monitoring or tracking critical user journeys. Trigger with phrases like "setup synthetic monitoring", "monitor uptime", or "configure health checks".
Automatically detect performance regressions in CI/CD pipelines by comparing metrics against baselines. Use when validating builds or analyzing performance trends. Trigger with phrases like "detect performance regression", "compare performance metrics", or "analyze performance degradation".
Analyze application logs for performance insights and issue detection including slow requests, error patterns, and resource usage. Use when troubleshooting performance issues or debugging errors. Trigger with phrases like "analyze logs", "find slow requests", or "detect error patterns".
Track and optimize application response times across API endpoints, database queries, and service calls. Use when monitoring performance or identifying bottlenecks. Trigger with phrases like "track response times", "monitor API performance", or "analyze latency".
Build real-time API monitoring dashboards with metrics, alerts, and health checks. Use when tracking API health and performance metrics. Trigger with phrases like "monitor the API", "add API metrics", or "setup API monitoring".
Aggregate and centralize performance metrics from applications, systems, databases, caches, and services. Use when consolidating monitoring data from multiple sources. Trigger with phrases like "aggregate metrics", "centralize monitoring", or "collect performance data".
Collect comprehensive infrastructure performance metrics across compute, storage, network, containers, load balancers, and databases. Use when monitoring system performance or troubleshooting infrastructure issues. Trigger with phrases like "collect infrastructure metrics", "monitor server performance", or "track system resources".
Provide health monitoring and alerting with guidance and automation. Use when working with monitoring and observability. Trigger with phrases like "monitor system health", "set up alerts", or "track metrics".
Track and optimize resource usage across the application stack, including CPU, memory, disk, and network I/O. Use when identifying bottlenecks or optimizing costs. Trigger with phrases like "track resource usage", "monitor CPU and memory", or "optimize resource allocation".