Media

Audio, video, and image processing.

1476 skills

Sorted by stars, showing all entries.

audio-transcription

Transcribe uploaded audio files (MP3, WAV, M4A) to text using OpenAI Whisper. Generate clean text or SRT subtitles. Works offline after initial setup.

aryankumar06

content-media

Image optimization analysis for SEO and performance. Checks alt text, file sizes, formats, responsive images, lazy loading, and CLS prevention. Use when user says "image optimization", "alt text", "image SEO", "image size", or "image audit".

norahe0304-art

content-media

giil

Get Image [from] Internet Link - Zero-setup CLI for downloading full-resolution images from iCloud, Dropbox, Google Photos, and Google Drive share links. Four-tier capture strategy, browser automation, HEIC conversion, album support. Node.js/Playwright.

diegosouzapw

content-media

fixed-video-format-916

Fixed 1080x1920 pixel video format with percentage-based positioning. Use this when laying out video compositions, positioning elements on the canvas, or calculating dimensions. All videos render at exactly 9:16 aspect ratio for TikTok/Instagram Reels.

diegosouzapw

content-media

adaptive-bitrate

Adaptive Bitrate (ABR) streaming automatically adjusts video quality based on network conditions. This guide covers HLS, DASH, and player implementation for building video streaming solutions that pro

diegosouzapw

content-media

veo

Generate video using Google Veo (Veo 3.1 / Veo 3.0). Use when: creating video clips from text prompts, generating B-roll, making animated content. DON'T use when: editing existing videos (use ffmpeg/video-frames), extracting frames from video (use video-frames skill), or adding subtitles (use video-subtitles skill).

diegosouzapw

content-media

performing-steganography-detection

Detect and extract hidden data embedded in images, audio, and other media files using steganalysis tools to uncover covert communication channels.

diegosouzapw

content-media

image-optimizer

Optimize and compress images for web use. Reduces file sizes of JPEG, PNG, GIF images using lossy/lossless compression. Can resize images to maximum dimensions, convert to WebP format, and process entire directories recursively. Use when images are too large for web, need compression, or need format conversion.

diegosouzapw

content-media

ffmpeg-usage

ffmpeg recipes and best practices: convert, concatenate, merge, resize, compress, GIF creation, audio extraction, subtitles, optimize for social platforms.

diegosouzapw

content-media

cv181x-media

Expert guide for CV181X/CV182X/CV180X (SG200X) multimedia development using CVI MPI API. Use this skill when working with: VI (video input/camera/ISP), VPSS (video processing/scaling/crop), VENC (H.264/H.265/JPEG encoding), VDEC (decoding), VB (video buffer pools), SYS binding, or any CVI_* API calls. Covers camera pipeline setup, offline VPSS processing, VB pool planning, and error diagnosis (ERR_VPSS_NOBUF, ERR_VB_NOBUF). API details in references/.

diegosouzapw

content-media

qasai

Image compression CLI with lossless/lossy options, multiple engines, batch processing, and format conversion. Use when compressing, optimizing, or converting images.

diegosouzapw

content-media

youtube-transcripter

Extract a YouTube video transcript from a URL and summarize it into important content, learnings, and suggestions. Use when the user provides a YouTube link and wants a clean transcript and structured notes without timestamps, with fallback extraction methods if the primary tool is unavailable.

diegosouzapw

content-media

video-sdkweb

Zoom Video SDK for Web - JavaScript/TypeScript integration for browser-based video sessions, real-time communication, screen sharing, recording, and live transcription.

zoom

content-media

video-processing-editing

FFmpeg automation for cutting, trimming, concatenating videos. Audio mixing, timeline editing, transitions, effects. Export optimization for YouTube, social media. Subtitle handling, color grading, batch processing. Use for videogen projects, content creation, automated video production. Activate on "video editing", "FFmpeg", "trim video", "concatenate", "transitions", "export optimization". NOT for real-time video editing UI, 3D compositing, or motion graphics.

diegosouzapw

content-media

video

Generate videos using fal.ai (Wan, Kling) or Sora. Text-to-video and image-to-video.

diegosouzapw

content-media

video-toolkit

Intelligent video processor for downloading media and extracting transcripts from YouTube and 1000+ supported sites. Automatically handles format selection, subtitle extraction, and post-processing.

diegosouzapw

content-media

smart-short-video

Smart short-video generator - mixes AI images with original video clips.

diegosouzapw

content-media

doubao-watermark-remover

Remove the visible Doubao (豆包) AI watermark from images. Use when asked to remove Doubao watermarks, clean Doubao-generated images, or process images with the "豆包AI生成" watermark.

diegosouzapw

content-media

remotion-thumbnail

Generate professional YouTube thumbnails with AI-powered expression cutouts and Remotion rendering. Perfect for content creators who want consistent, high-quality thumbnails at scale.

diegosouzapw

content-media

youtube-summarize

Summarize YouTube videos by extracting and analyzing their transcripts. Use when a user shares a YouTube URL and asks to summarize, explain, analyze, or extract information from the video. Triggers on YouTube links (youtube.com, youtu.be) with requests like "summarize this video", "what's this video about", "give me the key points", or "explain this video".

diegosouzapw

content-media

video-processor

Process video files with audio extraction, format conversion (mp4, webm), and Whisper transcription. Use when user mentions video conversion, audio extraction, transcription, mp4, webm, ffmpeg, or whisper transcription.