video-processing
Trim, transcode, extract frames, add subtitles, and manipulate audio in video files using FFmpeg.
Design video concepts, scripts, shotlists, transitions, and editing notes for VEO, Gemini, and Nano Banana-based pipelines. Use when turning a marketing idea into concrete video assets.
Generate logos using Replicate AI and make them transparent with background removal.
Process video files with audio extraction, format conversion (mp4, webm), and Whisper transcription. Use when the user mentions video conversion, audio extraction, transcription, mp4, webm, ffmpeg, or whisper transcription.
Download and process YouTube video transcripts using yt-dlp. Use this when extracting subtitles, creating summaries from videos, or processing video content.
Add image vision to ClaudeClaw agents. Resizes and processes WhatsApp image attachments, then sends them to Claude as multimodal content blocks.
FFmpeg automation for cutting, trimming, concatenating videos. Audio mixing, timeline editing, transitions, effects. Export optimization for YouTube, social media. Subtitle handling, color grading, batch processing. Use for videogen projects, content creation, automated video production. Activate on "video editing", "FFmpeg", "trim video", "concatenate", "transitions", "export optimization". NOT for real-time video editing UI, 3D compositing, or motion graphics.
Expert in voice synthesis, TTS, voice cloning, podcast production, speech processing, and voice UI design via ElevenLabs integration. Specializes in vocal clarity, loudness standards (LUFS), de-essing, dialogue mixing, and voice transformation. Activate on 'TTS', 'text-to-speech', 'voice clone', 'voice synthesis', 'ElevenLabs', 'podcast', 'voice recording', 'speech-to-speech', 'voice UI', 'audiobook', 'dialogue'. NOT for spatial audio (use sound-engineer), music production (use DAW tools), game audio middleware (use sound-engineer), sound effects generation (use sound-engineer with ElevenLabs SFX), or live concert audio.
Use when editing videos for Xiaohongshu, creating short video content, adding effects and transitions to videos, or adding subtitles and music to video clips.
Manipulate images locally using Python and PIL/Pillow. Use when the user asks to resize, crop, rotate, flip, filter, enhance, combine, overlay, watermark, add text to, convert, compress, create, or edit images locally. Also use for thumbnails, borders, color adjustments, transparency, animated GIFs, or extracting image metadata.
Implement, review, or improve photo picking, camera capture, and media handling in iOS apps. Use when working with PhotosPicker, PHPickerViewController, camera capture sessions (AVCaptureSession), photo library access, image loading and display, video recording, or media permissions. Also use when selecting photos from the library, taking pictures, recording video, processing images, or handling photo/camera privacy permissions in Swift apps.
Places album art files in the correct audio and content directory locations. Use when the user has generated or downloaded album artwork that needs to be saved.
Moves audio files to the correct album location with proper path structure. Use when the user has downloaded WAV files from Suno or other sources that need to be organized.
Polishes raw Suno audio by processing per-stem WAVs (vocals, backing_vocals, drums, bass, guitar, keyboard, strings, brass, woodwinds, percussion, synth, other) with targeted cleanup, EQ, and compression, then remixing into a polished stereo WAV ready for mastering. Use after audio import and before mastering.
Converts mastered audio to sheet music and creates printable songbooks. Use after mastering when the user wants sheet music or a songbook for their album.
Generates 15-second vertical promo videos for social media from mastered audio. Use after mastering is complete and before release, when the user wants social media content.
Moves track markdown files to the correct album location. Use when the user has track files in Downloads or other locations that need to be placed in an album.
Guides audio mastering for streaming platforms including loudness optimization and tonal balance. Use when the user has approved tracks and wants to master audio files.
Correct subtitle files (.srt) generated from speech recognition. Use when the user uploads subtitle files and asks to correct, fix, or proofread subtitles, especially for technical content like programming tutorials, AI/ML courses, or any content with domain-specific terminology. Supports Chinese and English subtitles with intelligent error detection and correction while preserving exact timeline information.
Use when the user needs to merge video clips, add audio, and generate the final video.