Media

Audio, video, and image processing.

1476 skills
media
381

pdf-to-video

Use when the user wants to convert a PDF document into a showcase video, extract key points from a PDF, or create a video presentation from a PDF file.

DangJin
content-media
media
377

aliyun-wan-videoedit

Use when editing videos with DashScope Wan 2.7 video editing model (wan2.7-videoedit). Use when implementing video style transfer, instruction-based video editing with optional reference images, or video content modification via the video-synthesis async API.

cinience
content-media
media
377

aliyun-videoretalk

Use when replacing lip sync in existing videos with Alibaba Cloud Model Studio VideoRetalk (`videoretalk`). Use when creating dubbed videos, replacing narration, or synchronizing a talking-head video to a new speech track.

cinience
content-media
media
377

aliyun-wan-i2v

Use when generating videos from images with DashScope Wan 2.7 image-to-video model (wan2.7-i2v). Use when implementing first-frame video generation, first+last frame interpolation, video continuation, or audio-driven video synthesis via the video-synthesis async API.

cinience
content-media
media
377

aliyun-kling-video

Use when generating videos with Kling v3 models on DashScope (kling/kling-v3-video-generation, kling/kling-v3-omni-video-generation). Use when implementing text-to-video, image-to-video, reference-to-video, smart storyboard, or video editing via the video-synthesis async API.

cinience
content-media
media
377

aliyun-video-style-repaint

Use when transforming video style with DashScope video-style-transform model. Use when converting videos to artistic styles such as Japanese manga, American comics, 3D cartoon, Chinese ink painting, paper art, or simple illustration via the video-synthesis async API.

cinience
content-media
media
377

aliyun-qwen-asr

Use when transcribing non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.

cinience
content-media
media
365

videocut

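Performs video editing. Runs FFmpeg cuts based on the confirmed deletion tasks, loops until no verbal slips remain, and generates subtitles. Trigger phrases: "执行剪辑" (run the edit), "开始剪" (start cutting), "确认剪辑" (confirm the edit)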

Ceeon
content-media
media
359

media-downloader

Smart media downloader. Automatically searches for and downloads images and video clips based on the user's description, with auto-trimming support. Triggers: "下载图片" (download images), "找视频" (find video), "download media", "download images", "find video", "/media"

yizhiyanhua-ai
content-media
media
349

media

Play audio files, record from microphone, take photos. Use for media playback, voice recording, or camera capture.

mikeyobrien
content-media
media
336

transcribe-audio

Transcribes video audio using WhisperX, preserving original timestamps. Creates JSON transcript with word-level timing. Use when you need to generate audio transcripts for videos.

barefootford
content-media
media
336

roughcut

Creates a video rough cut YAML file for use with the Buttercut gem. Concatenates visual transcripts with file markers, creates a roughcut YAML with clip selections, then exports to XML format. Use this skill when users want a "roughcut", "sequence" or "scene" generated. These are all the same thing, just with different lengths.

barefootford
content-media
media
336

analyze-video

Adds visual descriptions to transcripts by extracting and analyzing video frames with ffmpeg. Creates a visual transcript with periodic visual descriptions of the video clip. Use when all files have audio transcripts present (transcript) but don't yet have visual transcripts created (visual_transcript).

barefootford
content-media
media
328

gsap-animation

GSAP + Remotion integration for professional motion graphics video production. Timeline orchestration, text splitting, SVG morphing, advanced easing, and reusable effect presets.

notedit
content-media
media
325

video-optimization

When the user wants to optimize videos for Google Search, video sitemap, VideoObject schema, or video SEO on websites. Also use when the user mentions "video SEO," "video sitemap," "VideoObject," "video thumbnail," "video indexing," "video preview," "key moments," "Clip schema," or "embedded video optimization." For page template, use article-page-generator.

kostja94
content-media
media
325

youtube-seo

When the user wants to optimize YouTube videos for search, create video descriptions, or improve channel visibility. Also use when the user mentions "YouTube SEO," "YouTube description," "YouTube tags," "YouTube thumbnail," "YouTube title," "YouTube channel," or "video SEO." For YouTube ads, use youtube-ads.

kostja94
content-media
media
325

image-optimization

When the user wants to optimize images for search engines and performance. Also use when the user mentions "image SEO," "alt text," "image captions," "figcaption," "image optimization," "WebP," "lazy loading," "LCP," "image sitemap," "responsive images," "srcset," "image format," or "hero image optimization." For CWV, use core-web-vitals.

kostja94
content-media
media
322

image-to-video

Still-to-video conversion guide: model selection, motion prompting, and camera movement. Covers Wan 2.5 i2v, Seedance, Fabric, and Grok Video, with guidance on when to use each. Use for: animating images, creating video from stills, adding motion, product animations. Triggers: image to video, i2v, animate image, still to video, add motion to image, image animation, photo to video, animate still, wan i2v, image2video, bring image to life, animate photo, motion from image

inference-sh
content-media
media
322

p-video

Generate videos with Pruna P-Video and WAN models via inference.sh CLI. Models: P-Video, WAN-T2V, WAN-I2V. Capabilities: text-to-video, image-to-video, audio support, 720p/1080p, fast inference. Pruna optimizes models for speed without quality loss. Triggers: pruna video, p-video, pruna ai video, fast video generation, optimized video, wan t2v, wan i2v, economic video generation, cheap video generation, pruna text to video, pruna image to video

inference-sh
content-media
media
319

video-editing

Video editing pipeline: cut footage, assemble clips via FFmpeg and Remotion.

notque
content-media
media
319

image-to-video

FFmpeg-based video creation from image and audio.

notque
content-media
media
304

baoyu-compress-image

Compresses images to WebP (default) or PNG with automatic tool selection. Use when user asks to "compress image", "optimize image", "convert to webp", or reduce image file size.

ECNU-ICALK
content-media
media
304

audio-transcription

Transcribe audio and video files into structured notes. Activate this skill when users want to transcribe recordings, meetings, podcasts, voice memos, or any audio/video content in their vault.

allenhutchison
content-media