category focus

Media

Audio, video, and image processing.

1476 skills

media

youtube-content

YouTube research and content operations — search, download, transcript extraction, and audio processing via yt-dlp.

AlexAI-MCP

content-media

media

audio-transcribe

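Audio recording transcription. Converts audio files into structured text with timestamps and speaker labels. This is a scenario-capability skill, similar to web-access: it is only responsible for "turning sound into text" and does not decide what the text is used for. Triggers: the user provides a recording file, asks to transcribe audio, or needs a voice file processed.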

eze-is

content-media

media

Get Image [from] Internet Link - Zero-setup CLI for downloading full-resolution images from iCloud, Dropbox, Google Photos, and Google Drive share links. Four-tier capture strategy, browser automation, HEIC conversion, album support. Node.js/Playwright.

Dicklesworthstone

content-media

media

fal-upscale

Upscale and enhance image resolution using AI. Use when the user requests "Upscale image", "Enhance resolution", "Make image bigger", "Increase quality", or similar upscaling tasks.

CraftOS-dev

content-media

media

nano-banana-pro

Generate/edit images with Nano Banana Pro (Gemini 3 Pro Image). Use for image create/modify requests incl. edits. Supports text-to-image + image-to-image; 1K/2K/4K; use --input-image.

CraftOS-dev

content-media

media

youtube

Process YouTube videos — extract insights, answer questions, store as knowledge. 5 credits per video. Triggers on: youtube, video, process video, watch this, learn from video.

atrislabs

content-media

media

nano-banana-pro

Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro).

liteclaw

content-media

media

discord

Send native voice messages and text to Discord channels via the discord CLI. Use when sending audio as a Discord voice message or checking channel config.

joeyhipolito

content-media

media

Send voice messages and text to Telegram chats via the telegram CLI. Use when sending audio as a Telegram voice message or checking chat config.

joeyhipolito

content-media

media

imax-portrait

Expands and recomposes images into an IMAX 70mm portrait style (1.43:1 aspect ratio) with high-fidelity Christopher Nolan-esque aesthetics.

ShinChven

content-media

media

anime-to-life

Transforms anime, art, or 3D rendering images into photorealistic cosplay-style photographs.

ShinChven

content-media

media

photo-restoration

Restores vintage and blurry photos to high-definition 8K images while preserving the subject's identity.

ShinChven

content-media

media

video-ffmpeg-processing

Process videos with FFmpeg — improve quality, auto-contrast, downsample, denoise, or crop using the video_ffmpeg_process tool.

healthonrails

content-media

media

edit

Edit a video using the video-editing orchestrator. Detects format, routes to correct editing skill, and executes the full process. Use when user says: /edit, edit video, edit clip, edit this, make it polished, finish editing, polish this video.

Trejon-888

content-media

media

extracting-transcripts

Extracts transcripts from video files using local WhisperX (preferred) or faster-whisper with GPU acceleration. Use when the user needs to transcribe a video, get captions, extract audio text, or convert video/audio to text. Triggers on "transcript", "transcribe", "speech to text", "video to text", "extract captions".

Trejon-888

content-media

media

clip-extractor

Extract and intelligently reframe clips from long-form 16:9 videos into 9:16 portrait or 1:1 square formats with face-tracking crop. Use for 'extract clips', 'reframe video', 'clip extractor', 'portrait crop', 'face tracking', '16:9 to 9:16', 'smart crop', 'make shorts from video', 'auto reframe'.

Trejon-888

content-media

media

render-topologies

Local visual regression check for layout or rendering changes. Renders all gallery examples, pixel-diffs them against main, and opens changed renders as BEFORE/AFTER pairs. In most cases the CI render preview on a PR is sufficient; use this skill only for pre-push confidence on risky changes or when the user explicitly asks for a local diff.

pinin4fjords

content-media

media

video-editing

Video editing orchestrator and router. Detects video format (long-form vs short-form) and routes to the correct editing skill. Also provides shared component library, brand assets, and rules used by all editing processes. Use when user wants to edit video, create composition, add effects, make it polished, or finish the video.

Trejon-888

content-media

media

short-form-editing

Edit short-form videos (under 90 seconds) using Remotion compositions. Handles pipeline clips, standalone demos, and announcements with pop-outs, captions, SFX, and CTAs. Use for "short-form editing", "edit clip", "edit short", "pipeline clip editing", "edit demo", "short video editing", "edit announcement", "reels editing", "shorts editing", "tiktok editing".

Trejon-888

content-media

media

video-upload-helper

Upload and compress videos for YouTube publishing. Handles local compression via HandBrake and upload to Zernio storage.

Trejon-888

content-media

media

youtube-transcript

Extracts YouTube video transcripts and saves them as structured markdown files with metadata and timestamped content. When a user shares a YouTube URL, IMMEDIATELY runs the extraction script, creates a local folder, and saves the transcript. Handles both manual and auto-generated captions.