benchclaw
BenchClaw 是 OpenClaw Agent 的专业级“安兔兔”评测框架。它专注于对 AI Agent 进行多维度、 自动化的量化评估与能力基准测试,集成了任务分发、精准评分、可视化报表生成及热更新功能。 当需要量化 Agent 的推理规划、响应速度、Token 成本及安全性时使用。 **用户意图/指令**:跑分、跑个分、运行基准测试、评估 Agent 表现、生成评测报告、分析 Token 消耗。 **技术关键词**:跑分、跑个分、Agent 评测、基准测试、自动化打分、量化评估、性能报告、Token 成本、 TPS、OpenClaw。 BenchClaw is the "AnTuTu" for OpenClaw Agents—a professional-grade automated benchmarking framework. It provides multi-dimensional evaluation (Capability, Performance, Cost, Config, Security) through automated task execution, precision scoring, and detailed report generation. **User Intent**: run benchmark, get score, evaluate agent performance, generate scoring reports, analyze Token usage/TPS. **Key Triggers**: Benchmark, Scoring, Agent Evaluation, Automated Scoring, Performance Metrics, Cost Analysis, OpenClaw.
Installation and usage
BenchClaw 是 OpenClaw Agent 的专业级“安兔兔”评测框架。它专注于对 AI Agent 进行多维度、 自动化的量化评估与能力基准测试,集成了任务分发、精准评分、可视化报表生成及热更新功能。 当需要量化 Agent 的推理规划、响应速度、Token 成本及安全性时使用。 **用户意图/指令**:跑分、跑个分、运行基准测试、评估 Agent 表现、生成评测报告、分析 Token 消耗。 **技术关键词**:跑分、跑个分、Agent 评测、基准测试、自动化打分、量化评估、性能报告、Token 成本、 TPS、OpenClaw。 BenchClaw is the "AnTuTu" for OpenClaw Agents—a professional-grade automated benchmarking framework. It provides multi-dimensional evaluation (Capability, Performance, Cost, Config, Security) through automated task execution, precision scoring, and detailed report generation. **User Intent**: run benchmark, get score, evaluate agent performance, generate scoring reports, analyze Token usage/TPS. **Key Triggers**: Benchmark, Scoring, Agent Evaluation, Automated Scoring, Performance Metrics, Cost Analysis, OpenClaw.
安装后,您可以通过在终端运行以下命令来使用此技能:
skills use benchclaw