Media

Audio, video, and image processing.

1476 skills, sorted by stars

media
915

video-to-gif

Video-to-GIF conversion skill with FFmpeg two-pass optimization - Brought to you by microsoft/hve-core

microsoft
content-media
media
903

douyin-video

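Douyin watermark-free video download and transcript extraction tool. Gets watermark-free download links from Douyin share links, downloads the videos, extracts the spoken transcript from each video, and saves it to a file automatically. Use cases include retrieving Douyin video information, downloading watermark-free videos, and batch-extracting video transcripts. Triggers when the user needs to process Douyin video links or extract video content.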

yzfly
content-media
media
893

visual-audit

Perform adversarial visual audit of Quarto or Beamer slides checking for overflow, font consistency, box fatigue, and layout issues.

pedrohcgs
content-media
media
872

jianying-editor

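High-level wrapper API (JyWrapper) for AI-automated editing in JianYing (剪映). Provides a ready-to-use Python interface that supports screen recording, asset import, subtitle generation, web animation compositing, and project export.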

luoluoluo22
content-media
media
847

gemini-watermark-remover

Remove the visible Gemini AI watermark from images using reverse alpha blending. Use when asked to strip Gemini watermarks, batch-process Gemini images, or build/modify a CLI script that removes the bottom-right Gemini watermark without HTML or server-side components.

rookie-ricardo
content-media
media
847

f8-editor-playerprefseditor-workflow

Use when working with PlayerPrefsEditor tools — viewing, editing, and managing PlayerPrefs data in the Unity Editor in F8Framework.

TippingGame
content-media
media
847

f8-features-audio-workflow

Use when implementing or troubleshooting Audio feature workflows — BGM, voice, SFX, 3D audio, volume control, and AudioMixer in F8Framework.

TippingGame
content-media
media
841

inno-figure-gen

Generate/edit images with Gemini image models (default: gemini-3.1-flash-image-preview). Use for image create/modify requests incl. edits. Supports text-to-image + image-to-image; 1K/2K/4K; use --input-image. Use --model to select a different model.

OpenLAIR
content-media
media
826

device-file-search

Locate and verify files on Android storage before upload/share, especially latest edited photos or videos.

pockebot
content-media
media
826

capcut-edit

Edit videos and beautify photos in CapCut on mobile. Use when the user asks to open CapCut, create/edit/export videos, add text/effects/music/captions, use templates, or retouch and enhance photos.

pockebot
content-media
media
819

mmx-cli

Use mmx to generate text, images, video, speech, and music via the MiniMax AI platform. Use when the user wants to create media content, chat with MiniMax models, perform web search, or manage MiniMax API resources from the terminal.

MiniMax-AI
content-media
media
817

youtube-downloader

Download YouTube videos and HLS streams (m3u8) from platforms like Mux, Vimeo, etc. using yt-dlp and ffmpeg. Use this skill when users request downloading videos, extracting audio, handling protected streams with authentication headers, or troubleshooting download issues like nsig extraction failures, 403 errors, or cookie extraction problems.

daymade
content-media
media
817

video-comparer

This skill should be used when comparing two videos to analyze compression results or quality differences. Generates interactive HTML reports with quality metrics (PSNR, SSIM) and frame-by-frame visual comparisons. Triggers when users mention "compare videos", "video quality", "compression analysis", "before/after compression", or request quality assessment of compressed videos.

daymade
content-media
media
817

asr-transcribe-to-text

Transcribes audio and video files to text using Qwen3-ASR. Supports two modes — local MLX inference on macOS Apple Silicon (no API key, 15-27x realtime) and remote API via vLLM/OpenAI-compatible endpoints. Auto-detects platform and recommends the best path. Triggers when the user wants to transcribe recordings, convert audio/video to text, do speech-to-text, or mentions ASR, Qwen ASR, 转录, 语音转文字, 录音转文字. Also triggers for meeting recordings, lectures, interviews, podcasts, screen recordings, or any audio/video file the user wants converted to text.

daymade
content-media
media
813

image-crop-rotate

Image processing skill for cropping images to 50% from center and rotating them 90 degrees clockwise. This skill should be used when users request image cropping to center, image rotation, or both operations combined on image files.

instavm
content-media
media
811

webreel

Create and record scripted browser demo videos with webreel. Generates MP4, GIF, or WebM recordings with cursor animation, keystroke overlays, and sound effects from a JSON config. Use when the user wants to record a demo, create a browser video, edit a webreel config, generate a screen recording, preview a demo, or work with webreel in any way.

vercel-labs
content-media
media
807

axiom-camera-capture-diag

camera freezes, preview rotated wrong, capture slow, session interrupted, black preview, front camera mirrored, camera not starting, AVCaptureSession errors, startRunning blocks, phone call interrupts camera

CharlesWiltgen
content-media
media
807

speech

Use when implementing speech-to-text, live transcription, or audio transcription. Covers SpeechAnalyzer (iOS 26+), SpeechTranscriber, volatile/finalized results, AssetInventory model management, audio format handling.

CharlesWiltgen
content-media
media
807

axiom-camera-capture-ref

Reference — AVCaptureSession, AVCapturePhotoSettings, AVCapturePhotoOutput, RotationCoordinator, photoQualityPrioritization, deferred processing, AVCaptureMovieFileOutput, session presets, capture device APIs

CharlesWiltgen
content-media
media
807

axiom-avfoundation-ref

Reference — AVFoundation audio APIs, AVAudioSession categories/modes, AVAudioEngine pipelines, bit-perfect DAC output, iOS 26+ spatial audio capture, ASAF/APAC, Audio Mix with Cinematic framework

CharlesWiltgen
content-media
media
807

axiom-camera-capture

AVCaptureSession, camera preview, photo capture, video recording, RotationCoordinator, session interruptions, deferred processing, capture responsiveness, zero-shutter-lag, photoQualityPrioritization, front camera mirroring

CharlesWiltgen
content-media