ag-7-qualidade
Autonomous quality machine (MERIDIAN wrapper). Discovers the app, tests 5 dimensions (ALIVE/REAL/WORKS/LOOKS/FEELS), fixes issues, and re-tests until convergence at MQS >= 85. Produces a Quality Certificate.
Calibration testing for agents and skills. Generates synthetic problems with known outcomes (quasi-ground-truth), runs targets against them, and measures recall, precision, and confidence calibration — revealing whether self-reported confidence scores track actual quality.
Use Agriculture and Food Board infectious animal disease sources for control measures, surveillance context, and operational response references.
Assess what a ready Faber task produced and act on it. Use after faber watch returns to inspect the diff, form a judgment, and route the task to merge, done, continue, or delete.
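Workflow for creating, managing, and reviewing Jupyter experiment notebooks based on the SAENM model (Sequential Append-Only Experiment Notebook Model). Triggered by keywords such as "SAENM," "experiment notebook," "experiment log," "notebook management," and "research notebook." Supports starting a new experiment, continuing an existing one, and reviewing a series of experiments.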
This skill should be used when analyzing clock domain crossings for synchronizer coverage and metastability risks.
Diagnoses why tests pass inconsistently and suggests fixes for timing, ordering, and state isolation issues.
Use for LawnBerry Pi simulation vs real-hardware validation. Covers SIM_MODE selection, .env and config checks, simulation-safe test-first workflow, hardware preflight expectations, and avoiding false claims that local success proves on-device behavior.
Review regression-sensitive LawnBerry Pi manual control and camera paths: RoboHAT USB handoff, watchdog feeding, motor authorization, joystick responsiveness, MJPEG and snapshot fallback, camera ownership, and stream backpressure handling.
Convert research ideas into falsifiable hypotheses and experiment plans with metrics, controls, pass/fail criteria, and confounder mitigation.
When the user wants to plan, design, or implement an A/B test or experiment. Also use when the user mentions "A/B test," "split test," "experiment," "test this change," "variant copy," "multivariate test," or "hypothesis." For tracking implementation, see analytics-tracking.
Enforce experiment hygiene for this EDA repo. Utility skill for knowledge gate, tool reuse checks, and maintenance-log updates during scoped execution.
Perform post-experiment retrospective for EDA runs, classify failure/success mechanisms, propose high-confidence next actions, and decide whether to recursively trigger a new eda-loop iteration. Use after each experiment batch with monitor/summary/manifest artifacts.
Guide for sending document pages to Label Studio for human feedback on transcriptions. Use when: feedback, label studio, annotation, review transcription, quality check, send to label studio, import to label studio, human review, correct transcription, segmentation feedback, transcription feedback, ALTO to label studio, review pages, flag pages for review, quality assurance, QA, annotation task, annotate images, label images, send images for annotation.
Implement sample verification, docs-generation checks, CI wiring, and other repository tooling changes for the analyzer package.
A/B testing and experimentation workflow: hypothesis design, metric selection, sample size calculation, statistical significance, common pitfalls (peeking, SRM, novelty effect), and experiment lifecycle. Complements feature-flags (implementation) with statistical rigor.
Track medications with dosage schedules, log doses taken or skipped, monitor adherence rates, manage refill reminders, and check basic drug interactions. Use when a user needs to manage their medication schedule, log dose history, or review adherence.
Audit evidence collection and trail verification. Gathers artifacts, validates controls, generates audit reports, and maintains compliance documentation. Use when: "audit trail", "collect evidence", "audit report", "control testing", "compliance documentation"
Design statistically rigorous A/B tests and experiments. Formulate hypotheses, select metrics, calculate sample sizes. Discovers analytics and feature flag tools via capability detection. Use when: "design experiment", "A/B test", "hypothesis", "sample size", "what metrics", "test my feature", "should we experiment"
Use when reviewing current branch for bugs before pushing or merging, when wanting a thorough multi-agent review of local changes, or when preparing work for human review