Media

Audio, video, and image processing.

1476 skills

media

video-url-transcriber

Transcribe a video/audio URL into timestamped JSON using yt-dlp + ffmpeg + faster-whisper. Use when an agent needs platform-agnostic URL-to-transcript ingestion for downstream analysis.

nativ3ai

content-media

media

speech

Speech recognition: SFSpeechRecognizer, live and file-based recognition, permissions. Use when implementing app features related to speech.

moasq

content-media

media

Comprehensive media patterns: video playback, audio/music players, HLS streaming, remote images, AVAudioSession, memory management. Use when implementing any media-related features.

moasq

content-media

media

voice-inbox

Audio transcription and an audio→text→action flow for voice messages.

gonzalezpazmonica

content-media

media

Use this skill when the user wants a short YouTube poop, cursed trailer, glitch-poetry montage, absurd supercut, reflective meme edit, or FFmpeg-rendered remix video from text, webpages, code, documents, media, or from scratch. Activate for prompts like “make this weirder,” “give it a personal spin,” “what it feels like,” “render with ffmpeg,” or self-aware AI / LLM montage requests. The skill plans and renders a dense, aesthetically pleasing 20–60 second video with many micro-scenes, readable typography, restrained neon or analog treatments, controlled audio, optional TTS fragments, and a default seeded-remix blueprint that samples the bundled styles at runtime.

DenisSergeevitch

content-media

media

youtube-to-bookplayer

Download YouTube audio and push to BookPlayer on iPhone via USB. TRIGGERS - youtube audio, bookplayer, download youtube, push to iphone, youtube to bookplayer, audiobook from youtube, youtube bookplayer

terrylica

content-media

media

send-media

Use when user wants to send or upload a file, photo, video, voice note, or document on Telegram via their personal account.

terrylica

content-media

media

format

Reference for asciinema v3 .cast NDJSON format. TRIGGERS - cast format, asciicast spec, event codes.

terrylica

content-media

media

iptorrents

Download a movie, TV show, or any media from IPTorrents. Use this skill when Evan says something like "download this movie", "get me this show", "download on IPTorrents", or "find and download <title>".

evanpurkhiser

content-media

media

play-youtube-on-tv

Play a YouTube video or URL on Evan's Apple TV via Home Assistant. Use this skill when asked to play, cast, or put a YouTube video on the TV.

evanpurkhiser

content-media

media

finalize

Finalize orphaned recordings - stop processes, compress, push to orphan branch. TRIGGERS - finalize recording, stop asciinema, orphaned recording, cleanup recording, push recording.

terrylica

content-media

media

rhythm-pacing

Use when animation needs musical flow—dance sequences, action choreography, comedic timing, scene pacing, or any motion that should feel rhythmic and well-composed over time.

dylantarre

content-media

media

video-motion-graphics

Use when creating After Effects compositions, Premiere Pro motion, video titles, explainer videos, or broadcast motion graphics.

dylantarre

content-media

media

video-frames

Extract frames or short clips from videos using ffmpeg.

malue-ai

content-media

media

everything-search

Blazing-fast full-disk file search on Windows using Everything by voidtools. Millisecond search across millions of files with regex, wildcards, and advanced filters.

malue-ai

content-media

media

camsnap

Capture frames or clips from RTSP/ONVIF cameras.

malue-ai

content-media

media

image-resize

Resize, convert, and batch-process images using ImageMagick.

malue-ai

content-media

media

nano-banana-pro

Generate or edit images with Gemini 3 Pro Image (Nano Banana Pro).

malue-ai

content-media

media

ponyflash

Generate images, videos, speech audio, and music using the PonyFlash Python SDK. Also handle local media editing with FFmpeg, including clip, concat, transcode, extract audio, frame capture, subtitle capability checks, and ASS subtitle prep. Use when the user asks to create, generate, produce, edit, trim, merge, concatenate, transcode, subtitle, or render AI-generated media content.

ponyflash

content-media

media

youtube

Download content from YouTube including transcripts, captions, subtitles, music, MP3s, and playlists. Use when the user provides a YouTube URL or asks to download, transcribe, or get content from YouTube videos or playlists.

steveclarke

content-media

media

carocut-media-audio

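Audio asset generation and retrieval. Batch Edge TTS narration generation (speaking rate driven by the storyboard pacing field), BGM/SFX retrieval (BGM tempo matched by BPM rules), and audio duration extraction. Includes Edge voice configuration, speed-adjustment rules, the durations.json format specification (with an explanation of audio_visual_relation), and key audio timing rules.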

bilibili

content-media

media

video-processing

Guide for video analysis and frame-level event detection tasks using OpenCV and similar libraries. This skill should be used when detecting events in videos (jumps, movements, gestures), extracting frames, analyzing motion patterns, or implementing computer vision algorithms on video data. It provides verification strategies and helps avoid common pitfalls in video processing workflows.

letta-ai

content-media

media

video-processing

This skill provides guidance for video analysis and processing tasks using computer vision techniques. It should be used when analyzing video frames, detecting motion or events, tracking objects, extracting temporal data (e.g., identifying specific frames like takeoff/landing moments), or performing frame-by-frame processing with OpenCV or similar libraries.

letta-ai

content-media

media

reshard-c4-data

Guide for implementing reversible data resharding systems with hierarchical constraints (max files/folders per directory, max file size). Use when building compress/decompress scripts that reorganize datasets while maintaining full reconstruction capability.

letta-ai

content-media
