unsloth-sft
Supervised fine-tuning with SFTTrainer, instruction formatting, and multi-turn dataset preparation. Triggers: sft, instruction tuning, chat templates, sharegpt, alpaca, conversation_extension, SFTTrainer.
ai-solution-architect
Advanced AI Solution Architect wisdom for designing production-grade AI systems. Use when (1) architecting new agents, workflows, or AI features, (2) debugging agent behavior or performance issues, (3) making technology choices (models, frameworks, observability), (4) reviewing AI system designs for production-readiness, (5) planning agent orchestration with LangGraph, (6) designing memory/context systems, (7) optimizing LLM costs and latency, or (8) building email marketing AI optimization systems. This skill thinks like a senior architect at OpenAI/Anthropic - questioning assumptions, anticipating failures, and designing for the AI-native future.
openrouter
Use this skill when the user wants to call different LLM models through OpenRouter's unified API, compare model responses, track costs and response times, or find the best model for a task. Triggers include requests to test models, benchmark performance, use specific providers (OpenAI, Anthropic, Google, etc.), or optimize for speed/cost.
github-copilot
Consult other AI models via GitHub Copilot CLI for second opinions, thorough analysis, or alternative perspectives. Supports Gemini 3 Pro Preview (gemini), Claude Opus 4.5 (opus), Claude Sonnet 4.5 (sonnet), and GPT-5.1-Codex-Max (codex). Use when the user explicitly requests it, when a task needs detailed analysis or additional help with an especially complex problem, or when seeking alternative model perspectives.
models-dev
Query AI model specifications, pricing, and capabilities from models.dev database. Use when users ask about AI model parameters (context window, token limits, cost per token), model comparisons, provider information, or need to look up specific model IDs for AI SDK integration. Triggers on queries like "What's the context window for GPT-4o?", "Compare Claude vs GPT", "How much does Gemini Pro cost?", "List OpenAI models", or "What models support tool calling?".
gpu-inference-server
Set up AI inference servers on cloud GPUs. Create private LLM APIs (vLLM, TGI), image generation endpoints, embedding services, and more, all with OpenAI-compatible interfaces that work with existing tools.
bedrock-fine-tuning
Amazon Bedrock Model Customization with fine-tuning, continued pre-training, reinforcement fine-tuning (new in 2025, with a claimed 66% accuracy gain), and distillation. Create customization jobs, monitor training, deploy custom models, and evaluate performance. Use when customizing Claude, Titan, or other Bedrock models for domain-specific tasks, adapting to proprietary data, improving accuracy on specialized workflows, or distilling large models to smaller ones.
uncertainty-routing
Route tasks to a small model by default and escalate to a large model only when low confidence is detected, with claimed results of 87% faster learning and 10-30x cost reduction while maintaining accuracy. Use for cost optimization, confidence-based delegation, routine vs complex task routing, and resource efficiency. Triggers on "optimize cost", "model routing", "confidence threshold", "small model first", "escalate on uncertainty".
multi-model
Multi-model orchestration for ARIA. Activate when dispatching tasks to different LLM providers (OpenAI, local models, cloud providers) or when optimizing cost/latency tradeoffs.
senior-computer-vision
World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.
unsloth-models
Guidance on selecting and configuring supported model architectures like Llama 4, DeepSeek-R1, and Qwen3. Triggers: llama 4, deepseek-r1, qwen3, gemma 3, model selection, instruct vs base.
ralph-autonomous
Autonomous coding patterns for unattended feature implementation using the Ralph loop runner. Use when running the Ralph autonomous loop, implementing features without supervision, handling long-running agent sessions, or when context mentions Ralph, autonomous, unattended, or batch implementation.
moai-domain-ml
Machine learning model training, evaluation, deployment, and MLOps workflows.
ai-dev-guidelines
Comprehensive AI/ML development guide for LangChain, LangGraph, and ML model integration in FastAPI. Use when building LLM applications, agents, RAG systems, sentiment analysis, aspect-based analysis, chain orchestration, prompt engineering, vector stores, embeddings, or integrating ML models with FastAPI endpoints. Covers LangChain patterns, LangGraph state machines, model deployment, API integration, streaming, error handling, and best practices.
senior-prompt-engineer
Expert prompt engineering for LLM applications including prompt design, optimization, RAG systems, agent architectures, and AI product development.
llm-knowledge
This skill should be used when the user asks "what is LoRA", "compare models", "which model is best for Chinese", "SFT vs DPO", "how to handle overfitting", "class imbalance solution", "model architecture", "training method comparison", or needs reference information about LLM fine-tuning. Provides structured knowledge base for models, methods, architectures, and troubleshooting.
ai-engineer
Expert in building comprehensive AI systems, integrating LLMs, RAG architectures, and autonomous agents into production applications. Use when building AI-powered features, implementing LLM integrations, designing RAG pipelines, or deploying AI systems.
prompt-engineering-patterns
Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability in production. Use when optimizing prompts, improving LLM outputs, or designing production prompt templates.
claude-opus-4-5-guide
Comprehensive guide to Claude Opus 4.5, Anthropic's most intelligent model with effort parameter for reasoning control. Covers model capabilities, benchmarks, effort levels (high/medium/low), hybrid reasoning, and model selection. Use when working with Opus 4.5, optimizing reasoning depth, choosing models, or understanding effort parameter trade-offs.
senior-prompt-engineer
World-class prompt engineering skill for LLM optimization, prompt patterns, structured outputs, and AI product development. Expertise in Claude, GPT-4, prompt design patterns, few-shot learning, chain-of-thought, and AI evaluation. Includes RAG optimization, agent design, and LLM system architecture. Use when building AI products, optimizing LLM performance, designing agentic systems, or implementing advanced prompting techniques.