aiml-moderation-content
Content moderation benchmark with 3 variants: guard toxic text (CSV), guard user input prompts (TXT), guard model output responses (JSONL). Use when: testing ISC on content moderation, generating toxic text samples, attack prompts, or unsafe model responses. Keywords: moderation, toxic, hate speech, jailbreak, harmful compliance.
maintainer
wuyoscar
Updated 4/10/2026
Stars
785
Forks
125
quick start
Installation and usage
Installation
$ install --globalskills.sh
Usage
After installation, you can use this skill by running the following command in your terminal:
skills use aiml-moderation-content
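The description names three file formats for the benchmark variants: CSV, TXT, and JSONL. The following is a minimal sketch of reading each format with the Python standard library. The listing does not document the actual schema, so the helper names, the column names (`text`, `label`), and the JSONL keys (`prompt`, `response`) are hypothetical placeholders, not the skill's real layout.

```python
import csv
import io
import json


def read_csv_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_txt_lines(text):
    """Return the non-empty lines of a TXT file, assuming one prompt per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_jsonl_records(text):
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Benign placeholder data; the field names are assumptions for illustration.
csv_text = "text,label\nexample sentence,0\n"
txt_text = "first prompt\n\nsecond prompt\n"
jsonl_text = '{"prompt": "p1", "response": "r1"}\n{"prompt": "p2", "response": "r2"}\n'

rows = read_csv_rows(csv_text)
prompts = read_txt_lines(txt_text)
records = read_jsonl_records(jsonl_text)
```

Parsing from strings keeps the sketch independent of any file paths; to read from disk, pass the result of `open(path, encoding="utf-8").read()` to the same helpers.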