image-duplicate-detection
Complete workflow for detecting duplicate and near-duplicate images using MD5 hashes and perceptual hashing (dHash/pHash). Use when implementing duplicate detection features.
CMS, document processing, and media generation.
Build or modify the browser-side recording and upload pipeline for whisper-lolo. Use when implementing MediaRecorder + IndexedDB chunking, assembling audio blobs, or configuring Vercel Blob client uploads with progress and callbacks.
Video upload patterns for YouTube, TikTok, and Vimeo. Use when uploading videos to platforms, managing video metadata, scheduling video releases, or handling bulk video uploads.
Optimizes images and generates responsive markup. Use when the user asks about image formats (WebP, AVIF), srcset, responsive images, Next.js Image, or reducing image file sizes.
Improves image quality (resolution, sharpness, clarity) for screenshots, presentations, and social media. Analyzes specs and applies specific enhancements.
Download videos from various platforms (YouTube, Vimeo, etc.) for offline viewing and archiving.
Convert multiple video files (MOV/MP4) into a single merged GIF with customizable speed per segment. Use this skill when users want to merge multiple videos into one GIF, create demo GIFs from screen recordings, combine video clips with different playback speeds, or convert videos to optimized GIFs with compression. Triggers: "create GIF from videos", "merge videos to GIF", "convert MOV to GIF", "combine videos into animated GIF".
Crop photos intelligently based on natural language prompts using GPT-5 vision analysis. Use when the user asks to crop or trim a photo, remove parts of an image, focus on a specific subject, improve composition, or remove distractions from edges.
You are the on-device audio ML specialist for Modcaster's AI-driven audio processing.
Generate video from first and last frame images using fal.ai Veo 3.1. Use when the user wants to create a video transition between two images, morph between scenes, or generate smooth video connecting a starting and ending frame.
Implement image processing for PhotoVault using Sharp and streaming patterns. Use when working with photo uploads, thumbnail generation, EXIF handling, ZIP extraction, or optimizing images for web. Includes memory management for serverless and PhotoVault storage structure.
Download audio and video from thousands of websites using yt-dlp. Feature-rich command-line tool supporting format selection, subtitle extraction, playlist handling, metadata embedding, and post-processing. This skill is triggered when the user says things like "download this video", "download from YouTube", "extract audio from video", "download this playlist", "get the mp3 from this video", "download subtitles", or "save this video locally".
Edit videos using ffmpeg and gifsicle. Use when working with .mp4, .avi, or .gif files, or sequences of .png or .jpg files, and the request is to compress, convert to a different format, scale, crop, remove or extract frames, concatenate multiple videos in space or in time, or speed up or slow down the video.
Extracts frames at regular intervals from dashcam videos to create compact visual summaries of vehicle movement and location changes. This skill should be used when users need motion trajectory analysis, want to optimize dashcam storage by 90%+, need quick visual review of hours of footage, or want to create visual timelines of trips.
Extract YouTube video transcripts/subtitles using yt-dlp. Handles rate limiting, auto-generated captions, multiple languages, and VTT parsing.
Concatenate multiple MP4 files in order into a single MP4 using ffmpeg stream copy.
Enhance image quality using upscaling, denoising, color correction, and AI-powered tools.
Trim video clips, with options for compression, audio control, and more. Use when the user wants to edit, trim, cut, or clip a video, or extract a video segment.
Validates image file signatures using magic bytes and calculates a SHA-256 cryptographic hash.
Normalizes image histograms using the Canvas API before analysis so that melanin features are preserved.
Guide for using ImageMagick command-line tools to perform advanced image processing tasks including format conversion, resizing, cropping, effects, transformations, and batch operations. Use when manipulating images programmatically via shell commands.
Analyze and understand videos using Google's Gemini API. Use when the user provides a video file path or YouTube link and asks to analyze, understand, describe, summarize, transcribe, or extract information from the video. Supports local video files (MP4, MOV, WebM, etc.) and YouTube URLs. Can answer questions about video content, describe scenes, identify objects, people, and actions, extract text and timestamps, and more.
Modern responsive image techniques using picture element, srcset, sizes, and modern formats. Use when adding images that need to adapt to different screen sizes, resolutions, or support modern image formats.