animsequence
Preview, validate, bake, and manipulate animation sequences with constraint-aware bone editing
Export simulation results and/or the computational mesh to VTK files for ParaView visualization. Use this when the user wants to export, visualize, or create VTK files.
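Use when the user only wants to transcribe a single YouTube video through the Gemini Gem browser flow, rather than running the full BestBlogs video processing pipeline.
Use when the user wants to batch-process BestBlogs videos awaiting analysis, including transcription, analysis, scoring, and on-demand translation of high-scoring content.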
Guide for using ImageMagick command-line tools to perform advanced image processing tasks including format conversion, resizing, cropping, effects, transformations, and batch operations. Use when manipulating images programmatically via shell commands.
Guide for using FFmpeg - a comprehensive multimedia framework for video/audio encoding, conversion, streaming, and filtering. Use when processing media files, converting formats, extracting audio, creating streams, applying filters, or optimizing video/audio quality.
Guidance for picking photos/videos, capturing from camera, multi-select (.NET 10), MediaPickerOptions, platform permissions, and FileResult handling in .NET MAUI. USE FOR: "pick photo", "capture photo", "take picture", "pick video", "camera capture", "MediaPicker", "photo gallery", "image picker", "multi-select photos", "MediaPickerOptions". DO NOT USE FOR: general file picking (use maui-file-handling), image display or optimization (use maui-performance), or camera streaming (use maui-platform-invoke).
Use this skill when adding audio to programmatic videos - generating narration with ElevenLabs TTS, sourcing royalty-free background music, creating SFX with FFmpeg, implementing audio ducking, or mixing multiple audio layers in Remotion. Triggers on ElevenLabs, text-to-speech, voice generation, background music, sound effects, audio mixing, and volume ducking.
Use this skill when analyzing existing video files using FFmpeg and AI vision, extracting frames for design system generation, detecting scene boundaries, analyzing animation timing, extracting color palettes, or understanding audio-visual sync. Triggers on video analysis, frame extraction, scene detection, ffprobe, motion analysis, and AI vision analysis of video content.
Extracts timestamped transcripts from YouTube videos for translation, summarization, and content creation.
Create and edit videos using Google's Veo 2 and Veo 3 models. Supports Text-to-Video, Image-to-Video, Reference-to-Video, Inpainting, and Video Extension. Available parameters: prompt, image, mask, mode, duration, aspect-ratio. Always confirm parameters with the user or explicitly state defaults before running.
Image processing toolkit awareness. Use when: user uploads images for manipulation, requests format conversion, batch processing, compositing, resizing, optimization, analysis, effects, metadata inspection, montages, animated GIFs, color correction, or any image-related task. Also use when working with screenshots, photos, diagrams, icons, or visual assets. Triggers on 'resize', 'crop', 'convert', 'compress', 'optimize', 'thumbnail', 'watermark', 'montage', 'collage', 'gif', 'sprite sheet', 'color space', 'metadata', 'EXIF', 'compare images', 'diff', 'overlay', 'composite', 'batch process', 'image analysis', 'histogram', 'blur', 'sharpen', 'rotate', 'flip', 'border', 'shadow', 'round corners', 'favicon', 'icon set'.
Audio and video processing with ffmpeg. Use when: user asks to convert, trim, merge, compress, or transcode video or audio files; extract audio from video; create GIFs or animated WebP from video; add subtitles or watermarks to video; change video resolution, framerate, or codec; normalize audio loudness; extract frames from video; concatenate clips; create thumbnails from video; strip or add audio tracks; convert between audio formats (MP3, AAC, FLAC, Opus, WAV); adjust volume; apply video filters; stabilize shaky video; generate waveform or spectrum visualizations; probe media file metadata. Triggers on 'ffmpeg', 'video', 'audio', 'transcode', 'MP4', 'MKV', 'WebM', 'MP3', 'AAC', 'FLAC', 'Opus', 'WAV', 'GIF from video', 'extract audio', 'add subtitles', 'video to gif', 'compress video', 'trim video', 'merge videos', 'normalize audio', 'framerate', 'resolution', 'bitrate', 'codec', 'ffprobe', 'waveform', 'spectrogram'.
Augmented vision tools for analyzing images beyond native visual capabilities. Use when tasked with describing images in detail, reproducing images as SVGs, identifying subtle features, comparing image regions, reading degraded text, or any task requiring careful visual inspection. Also use when the image-to-svg skill needs ground truth about colors, shapes, or boundaries.
Agent-first media toolkit for image, video, and audio processing. Use when you need to resize, convert, generate images, remove backgrounds, extract audio, transcribe speech, or generate videos. All commands return deterministic JSON output.
Resizes an image to specified dimensions. Use when you need to change image size, create thumbnails, or prepare images for specific display requirements.
Extracts audio track from a video file. Use when you need to get audio from video, prepare audio for transcription, or separate audio from video content. Runs locally with no API key required.
Upscales an image using AI super-resolution to increase resolution with detail generation. Use when you need to enlarge images, improve low-resolution photos, or prepare images for large-format display.
Create 3D pan/swivel transition effects for videos using Remotion. Use when user asks to add 3D transitions, create swivel effects, or add video transitions.
Transcribe audio and video files using the Deepgram API. This skill should be used when the user requests transcription of audio files (mp3, wav, m4a, aac) or video files (mp4, mov, avi, etc.). Handles large video files by extracting audio first to reduce upload size and processing time.
Use when the user provides a topic and wants an automated video podcast created. Handles research, script writing, TTS audio synthesis, Remotion video creation, and final MP4 output with background music.
Complete video editing toolkit - silence removal, auto-captions, vertical crop, YouTube clipping, 3D transitions, and social media compression. Use when user asks to edit video, remove silences, add captions/subtitles, crop to vertical/shorts, download YouTube clips, compress video, or create video teasers.
Transcribes audio to text with timestamps and optional speaker identification. Use when you need to convert speech to text, create subtitles, transcribe meetings, or process voice recordings.