opentelemetry
OpenTelemetry observability patterns: traces, metrics, logs, context propagation, OTLP export, Collector pipelines, and troubleshooting
Production gRPC in Go: protobuf layout, codegen, interceptors, deadlines, error codes, streaming, health checks, TLS, and testing with bufconn
Full-stack observability with Datadog APM, logs, metrics, synthetics, and RUM. Use when implementing monitoring, tracing, alerting, or cost optimization for production systems.
Deploy and configure Zipkin for distributed tracing and request flow visualization. Use when a user needs to set up trace collection, instrument Java/Spring or other services with Zipkin, analyze service dependencies, or configure storage backends for trace data.
Deploy and configure Thanos for long-term Prometheus metric storage, global querying across multiple Prometheus instances, and data compaction. Use when a user needs durable metric storage in object storage, a unified query view across clusters, downsampling for historical data, or high-availability Prometheus with deduplication.
Configure Telegraf as a metrics collection agent for infrastructure and application monitoring. Use when a user needs to collect system metrics, set up input plugins for databases and services, configure output to InfluxDB or Prometheus, or build custom metric pipelines.
Configure Prometheus Alertmanager for alert routing, grouping, silencing, and notification delivery. Use when a user needs to set up alert receivers (Slack, PagerDuty, email), define routing trees, manage silences and inhibition rules, or troubleshoot alert delivery pipelines.
Expert guidance for SigNoz, the open-source observability platform that provides traces, metrics, and logs in a single UI. Built natively on OpenTelemetry, SigNoz is a self-hosted alternative to Datadog and New Relic. Helps developers set up distributed tracing, application performance monitoring, log management, and custom dashboards.
You are an expert in Traceloop and its OpenLLMetry SDK, the open-source observability framework that extends OpenTelemetry for LLM applications. You help developers instrument AI pipelines with automatic tracing for OpenAI, Anthropic, Cohere, LangChain, LlamaIndex, vector databases, and frameworks — exporting to any OpenTelemetry-compatible backend (Grafana Tempo, Jaeger, Datadog, Honeycomb, Traceloop Cloud).
Deploy and use Jaeger for distributed tracing across microservices. Use when a user needs to set up trace collection, instrument applications with OpenTelemetry, analyze trace data to find latency bottlenecks, or configure Jaeger storage backends and sampling strategies.
You are an expert in Langtrace, the open-source observability platform for LLM applications built on OpenTelemetry. You help developers trace LLM calls, RAG pipelines, agent tool use, and chain executions with automatic instrumentation for OpenAI, Anthropic, LangChain, LlamaIndex, and 20+ providers — providing cost tracking, latency analysis, token usage, and quality evaluation in a self-hostable dashboard.
Loki is a horizontally scalable log aggregation system built by Grafana Labs. Unlike traditional log platforms that index the full text of every log line, Loki indexes only metadata labels, making it significantly cheaper to operate. It integrates natively with Grafana for querying and visualization, and uses LogQL — a query language inspired by PromQL — to filter, parse, and aggregate log streams.
Set up and manage New Relic for full-stack observability including APM, browser monitoring, infrastructure monitoring, and alerting. Use when a user needs to instrument applications, write NRQL queries, create dashboards, configure alert policies, or integrate New Relic with their deployment pipeline.
Set up end-to-end observability for microservices. Use when someone asks to "add tracing", "set up monitoring", "configure OpenTelemetry", "build Grafana dashboards", "distributed tracing", "structured logging", "metrics collection", or "debug production issues". Covers OpenTelemetry instrumentation, collector configuration, Grafana LGTM stack deployment, dashboard provisioning, and alert rules.
Deploy and configure VictoriaMetrics as a high-performance time-series database for metrics storage and querying. Use when a user needs a Prometheus-compatible long-term storage backend, wants to write MetricsQL queries, configure vmagent for metrics scraping, or set up VictoriaMetrics cluster mode for horizontal scaling.
Grafana is an open-source visualization and dashboarding platform that connects to dozens of data sources including Prometheus, PostgreSQL, ClickHouse, and Elasticsearch. It lets you build interactive dashboards with panels, set up alerting rules, and manage everything as code through JSON dashboard definitions and provisioning configuration.
Define success criteria and tracking setup for launch during PRD v0.9 Go-to-Market. Triggers on requests to define launch metrics, set up tracking, or when user asks "how do we measure launch success?", "launch KPIs", "tracking setup", "success criteria", "analytics", "launch goals". Outputs KPI- entries specialized for launch measurement.
Approve a specification and export it to beads as an epic with tasks. Creates the epic, child tasks, dependencies, and agent assignment labels. Use after reviewing the spec from /start-discovery.
Define monitoring strategy, metrics collection, and alerting thresholds during PRD v0.8 Deployment & Ops. Triggers on requests to set up monitoring, define alerts, or when user asks "what should we monitor?", "alerting strategy", "observability", "metrics", "SLOs", "dashboards", "monitoring setup". Outputs MON- entries with monitoring rules and alert configurations.
Defines database performance monitoring strategy with slow query detection, resource usage alerts, query execution thresholds, and automated alerting. Use for "database monitoring", "performance alerts", "slow queries", or "DB metrics".
Creates SLO-based alerts and operational dashboards with key charts, alert thresholds, and runbook links. Use for "alerting", "dashboards", "SLO", or "monitoring".
PolicyEngine API v2 - Next-generation microservices architecture with monorepo structure
Correctly place directional blocks (doors, levers, torches, signs, banners, stairs) in Minecraft via RCON