category focus

Media

Audio, video, and image processing.

1476 skills, all categories
media
565

assemblyai-transcribe

Transcribe audio/video with AssemblyAI (local upload or URL), plus subtitles + paragraph/sentence exports.

sundial-org
sundial-org
content-media
open
media
565

video-transcript-downloader

Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”, “get subtitles”, “get transcript”, or to troubleshoot yt-dlp/ffmpeg and formats/playlists.

sundial-org
sundial-org
content-media
open
media
565

video-subtitles

Generate SRT subtitles from video/audio with translation support. Transcribes Hebrew (ivrit.ai) and English (whisper), translates between languages, burns subtitles into video. Use for creating captions, transcripts, or hardcoded subtitles for WhatsApp/social media.

sundial-org
sundial-org
content-media
open
media
565

venice-ai-media

Generate, edit, and upscale images; create videos from images or other videos via Venice AI. Supports text-to-image, image-to-video (Sora, WAN), video-to-video (Runway Gen4), upscaling, and AI editing.

sundial-org
sundial-org
content-media
open
media
565

clipit

The master tool for all advanced audio/video processing. Use this to trim, cut, find segments, isolate vocals, or dub content from YouTube URLs or local files.

sundial-org
sundial-org
content-media
open
media
553

dplug-framework

Create beautiful audio plug-ins using the D language. Process audio files using existing VST2 plugins. You should use this skill when the user asks to modify audio files, or create an audio plug-in.

AuburnSounds
AuburnSounds
content-media
open
media
544

nano-banana-pro

Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro).

smallnest
smallnest
content-media
open
media
544

video-frames

Extract frames or short clips from videos using ffmpeg.

smallnest
smallnest
content-media
open
media
538

nvenc-nvdec

NVIDIA hardware video encoding/decoding integration. Configure NVENC encoding parameters, set up NVDEC decoding pipelines, handle codec configurations, integrate with CUDA for pre/post processing, and manage video memory surfaces.

a5c-ai
a5c-ai
content-media
open
media
538

video-marketing

Video platform optimization and analytics for YouTube and other video channels.

a5c-ai
a5c-ai
content-media
open
media
517

node-minify

Compress JavaScript, CSS, HTML, JSON, and image files using node-minify library. Use when: minifying/compressing assets, bundling JS/CSS files, optimizing images (WebP/AVIF), concatenating files, or when user mentions "node-minify", "@node-minify", "minification". Triggers: "minify", "compress JS/CSS", "bundle", "optimize images", "reduce file size".

srod
srod
content-media
open
media
503

youtube-downloader

Download YouTube videos with customizable quality and format options. Use this skill when the user asks to download, save, or grab YouTube videos. Supports various quality settings (best, 1080p, 720p, 480p, 360p), multiple formats (mp4, webm, mkv), and audio-only downloads as MP3.

wlzh
wlzh
content-media
open
media
503

audiocut-keyword

Audio keyword filtering tool - automatically detects and removes specified content from audio based on a keyword configuration

wlzh
wlzh
content-media
open
media
503

voice-changer

Audio voice-changing tool - uses RVC AI models for realistic voice conversion, with support for direct video input

wlzh
wlzh
content-media
open
media
503

youtube-to-blog-post

Convert YouTube videos to SEO-optimized blog posts. Extract video title, description, and content, then generate a search-engine-friendly blog post with embedded video, cover images, and optimized metadata. Auto-generates English filenames and saves to the configured Hexo blog posts directory.

wlzh
wlzh
content-media
open
media
456

ifly-speed-transcription

Ultra-fast speech transcription using iFLYTEK Speed Transcription API. Transcribe audio files (WAV/PCM/MP3) up to 5 hours in ~20 seconds per hour. Supports Chinese, English, and 202+ Chinese dialects with automatic language detection. Use when user asks to transcribe audio files, convert speech to text, or mentions "speed transcription" or "极速转写".

iflytek
iflytek
content-media
open
media
454

video-translation

Translate and dub videos from one language to another, replacing the original audio with TTS while keeping the video intact.

NoizAI
NoizAI
content-media
open
media
422

youtube-apify-transcript

Fetch YouTube transcripts via APIFY API. Works from cloud IPs (Hetzner, AWS, etc.) by bypassing YouTube's bot detection. Free tier includes $5/month credits (~714 videos). No credit card required.

gooseworks-ai
gooseworks-ai
content-media
open
media
412

camsnap

Capture frames or clips from RTSP/ONVIF cameras.

understudy-ai
understudy-ai
content-media
open
media
412

nano-banana-pro

Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro).

understudy-ai
understudy-ai
content-media
open
media
412

video-frames

Extract frames or short clips from videos using ffmpeg.

understudy-ai
understudy-ai
content-media
open
media
405

video-podcast-maker

Use when user provides a topic and wants an automated video podcast created, OR when user wants to learn/analyze video design patterns from reference videos — handles research, script writing, TTS audio synthesis, Remotion video creation, and final MP4 output with background music. Also supports design learning from reference videos (learn command), style profile management, and design reference library. Supports Bilibili, YouTube, Xiaohongshu, Douyin, and WeChat Channels platforms with independent language configuration (zh-CN, en-US).

Agents365-ai
Agents365-ai
content-media
open
media
389

nextjs-optimization

Optimize images, fonts, scripts, and metadata for Next.js performance and Core Web Vitals. Use when configuring next/image for LCP, next/font for zero layout shift, next/script loading strategies, or generateMetadata for SEO. (triggers: **/layout.tsx, **/page.tsx, next/image, next/font, metadata, generateMetadata)

HoangNguyen0403
HoangNguyen0403
content-media
open