category focus

Media

Audio, video, and image processing.

1476 skills
media
33

video-url-transcriber

Transcribe a video/audio URL into timestamped JSON using yt-dlp + ffmpeg + faster-whisper. Use when an agent needs platform-agnostic URL-to-transcript ingestion for downstream analysis.

nativ3ai
content-media
media
33

speech

Speech recognition: SFSpeechRecognizer, live and file-based recognition, permissions. Use when implementing app features related to speech.

moasq
content-media
media
33

media

Comprehensive media patterns: video playback, audio/music players, HLS streaming, remote images, AVAudioSession, memory management. Use when implementing any media-related features.

moasq
content-media
media
33

voice-inbox

Audio transcription and an audio→text→action workflow for voice messages.

gonzalezpazmonica
content-media
media
33

youtube-poop-video-maker

Use this skill when the user wants a short YouTube poop, cursed trailer, glitch-poetry montage, absurd supercut, reflective meme edit, or FFmpeg-rendered remix video from text, webpages, code, documents, media, or from scratch. Activate for prompts like “make this weirder,” “give it a personal spin,” “what it feels like,” “render with ffmpeg,” or self-aware AI / LLM montage requests. The skill plans and renders a dense, aesthetically pleasing 20–60 second video with many micro-scenes, readable typography, restrained neon or analog treatments, controlled audio, optional TTS fragments, and a default seeded-remix blueprint that samples the bundled styles at runtime.

DenisSergeevitch
content-media
media
33

youtube-to-bookplayer

Download YouTube audio and push to BookPlayer on iPhone via USB. TRIGGERS - youtube audio, bookplayer, download youtube, push to iphone, youtube to bookplayer, audiobook from youtube, youtube bookplayer

terrylica
content-media
media
33

send-media

Use when user wants to send or upload a file, photo, video, voice note, or document on Telegram via their personal account.

terrylica
content-media
media
33

format

Reference for asciinema v3 .cast NDJSON format. TRIGGERS - cast format, asciicast spec, event codes.

terrylica
content-media
media
33

iptorrents

Download a movie, TV show, or any media from IPTorrents. Use this skill when Evan says something like "download this movie", "get me this show", "download on IPTorrents", or "find and download <title>".

evanpurkhiser
content-media
media
33

play-youtube-on-tv

Play a YouTube video or URL on Evan's Apple TV via Home Assistant. Use this skill when asked to play, cast, or put a YouTube video on the TV.

evanpurkhiser
content-media
media
33

finalize

Finalize orphaned recordings - stop processes, compress, push to orphan branch. TRIGGERS - finalize recording, stop asciinema, orphaned recording, cleanup recording, push recording.

terrylica
content-media
media
32

rhythm-pacing

Use when animation needs musical flow—dance sequences, action choreography, comedic timing, scene pacing, or any motion that should feel rhythmic and well-composed over time.

dylantarre
content-media
media
32

video-motion-graphics

Use when creating After Effects compositions, Premiere Pro motion, video titles, explainer videos, or broadcast motion graphics.

dylantarre
content-media
media
32

video-frames

Extract frames or short clips from videos using ffmpeg.

malue-ai
content-media
media
32

everything-search

Blazing-fast full-disk file search on Windows using Everything by voidtools. Millisecond search across millions of files with regex, wildcards, and advanced filters.

malue-ai
content-media
media
32

camsnap

Capture frames or clips from RTSP/ONVIF cameras.

malue-ai
content-media
media
32

image-resize

Resize, convert, and batch-process images using ImageMagick.

malue-ai
content-media
media
32

nano-banana-pro

Generate or edit images with Gemini 3 Pro Image (Nano Banana Pro).

malue-ai
content-media
media
32

ponyflash

Generate images, videos, speech audio, and music using the PonyFlash Python SDK. Also handle local media editing with FFmpeg, including clip, concat, transcode, extract audio, frame capture, subtitle capability checks, and ASS subtitle prep. Use when the user asks to create, generate, produce, edit, trim, merge, concatenate, transcode, subtitle, or render AI-generated media content.

ponyflash
content-media
media
32

youtube

Download content from YouTube including transcripts, captions, subtitles, music, MP3s, and playlists. Use when the user provides a YouTube URL or asks to download, transcribe, or get content from YouTube videos or playlists.

steveclarke
content-media
media
32

carocut-media-audio

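Audio asset generation and retrieval. Batch Edge TTS narration generation (speech rate driven by the storyboard pacing field), BGM/SFX retrieval (with BPM rules for matching BGM tempo), and audio duration extraction. Includes Edge voice configuration, speed adjustment rules, the durations.json format specification (with notes on audio_visual_relation), and key audio timing rules.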

bilibili
content-media
media
31

video-processing

Guide for video analysis and frame-level event detection tasks using OpenCV and similar libraries. This skill should be used when detecting events in videos (jumps, movements, gestures), extracting frames, analyzing motion patterns, or implementing computer vision algorithms on video data. It provides verification strategies and helps avoid common pitfalls in video processing workflows.

letta-ai
content-media
media
31

video-processing

This skill provides guidance for video analysis and processing tasks using computer vision techniques. It should be used when analyzing video frames, detecting motion or events, tracking objects, extracting temporal data (e.g., identifying specific frames like takeoff/landing moments), or performing frame-by-frame processing with OpenCV or similar libraries.

letta-ai
content-media
media
31

reshard-c4-data

Guide for implementing reversible data resharding systems with hierarchical constraints (max files/folders per directory, max file size). Use when building compress/decompress scripts that reorganize datasets while maintaining full reconstruction capability.

letta-ai
content-media