home/categories/security/mukul975-anthropic-cybersecurity-skills-skills-implementing-llm-guardrails-for-security-skill-md
securitytesting-security

implementing-llm-guardrails-for-security

Implements input and output validation guardrails for LLM-powered applications to prevent prompt injection, data leakage, toxic content generation, and hallucinated outputs. Builds a security validation pipeline using NVIDIA NeMo Guardrails Colang definitions, custom Python validators for PII detection and content policy enforcement, and the Guardrails AI framework for structured output validation. The guardrails system intercepts both user inputs (blocking injection attempts, stripping PII, enforcing topic boundaries) and model outputs (detecting hallucinations, filtering toxic content, validating JSON schema compliance). Activates for requests involving LLM output validation, AI content filtering, guardrail implementation, or LLM safety enforcement.

mukul975
maintainer
mukul975
Actualizado 4/6/2026
Estrellas
4240
Forks
464
quick start

Installation and usage

Implements input and output validation guardrails for LLM-powered applications to prevent prompt injection, data leakage, toxic content generation, and hallucinated outputs. Builds a security validation pipeline using NVIDIA NeMo Guardrails Colang definitions, custom Python validators for PII detection and content policy enforcement, and the Guardrails AI framework for structured output validation. The guardrails system intercepts both user inputs (blocking injection attempts, stripping PII, enforcing topic boundaries) and model outputs (detecting hallucinations, filtering toxic content, validating JSON schema compliance). Activates for requests involving LLM output validation, AI content filtering, guardrail implementation, or LLM safety enforcement.

Instalación
$ install --globalskills.sh
Uso

Después de instalarlo, puedes usar este skill ejecutando el siguiente comando en tu terminal:

skills use implementing-llm-guardrails-for-security