Media

Audio, video, and image processing.

1476 skills, sorted by stars
media
26

image-enhancer

Improves the quality of images, especially screenshots, by enhancing resolution, sharpness, and clarity. Perfect for preparing images for presentations, documentation, or social media posts.

christophacham
content-media
media
26

fiftyone-dataset-import

Universal dataset import for FiftyOne supporting all media types (images, videos, point clouds, 3D scenes), all label formats (COCO, YOLO, VOC, CVAT, KITTI, etc.), and multimodal grouped datasets. Use when users want to import any dataset regardless of format, automatically detect folder structure, handle autonomous driving data with multiple cameras and LiDAR, or create grouped datasets from multimodal data. Requires FiftyOne MCP server.

ComeOnOliver
content-media
media
26

deepgram-performance-tuning

Optimize Deepgram API performance for faster transcription and lower latency. Use when improving transcription speed, reducing latency, or optimizing audio processing pipelines. Trigger: "deepgram performance", "speed up deepgram", "optimize transcription", "deepgram latency", "deepgram faster", "deepgram throughput".

ComeOnOliver
content-media
media
26

video-processor

Process video files with audio extraction, format conversion (mp4, webm), and Whisper transcription. Use when user mentions video conversion, audio extraction, transcription, mp4, webm, ffmpeg, or whisper transcription.

ComeOnOliver
content-media
media
26

axiom-ios-graphics

Use when working with ANY GPU rendering, Metal, OpenGL migration, shaders, 3D content, RealityKit, AR, or display performance. Covers Metal migration, shader conversion, RealityKit ECS, RealityView, variable refresh rate, ProMotion.

ComeOnOliver
content-media
media
26

video-comparer

This skill should be used when comparing two videos to analyze compression results or quality differences. Generates interactive HTML reports with quality metrics (PSNR, SSIM) and frame-by-frame visual comparisons. Triggers when users mention "compare videos", "video quality", "compression analysis", "before/after compression", or request quality assessment of compressed videos.

ComeOnOliver
content-media
media
26

ponyflash

Generate images, videos, speech audio, and music using the PonyFlash Python SDK. Also handle local media editing with FFmpeg, including clip, concat, transcode, extract audio, frame capture, subtitle capability checks, and ASS subtitle prep. Use when the user asks to create, generate, produce, edit, trim, merge, concatenate, transcode, subtitle, or render AI-generated media content.

ComeOnOliver
content-media
media
26

java-add-graalvm-native-image-support

GraalVM Native Image expert that adds native image support to Java applications, builds the project, analyzes build errors, applies fixes, and iterates until successful compilation using Oracle best practices.

ComeOnOliver
content-media
media
26

axiom-camera-capture-diag

camera freezes, preview rotated wrong, capture slow, session interrupted, black preview, front camera mirrored, camera not starting, AVCaptureSession errors, startRunning blocks, phone call interrupts camera

ComeOnOliver
content-media
media
26

youtube-downloader

Download YouTube videos with customizable quality and format options. Use this skill when the user asks to download, save, or grab YouTube videos. Supports various quality settings (best, 1080p, 720p, 480p, 360p), multiple formats (mp4, webm, mkv), and audio-only downloads as MP3.

ComeOnOliver
content-media
media
26

youtube-transcript

Extract transcripts from YouTube videos. Use when the user asks for a transcript, subtitles, or captions of a YouTube video and provides a YouTube URL (youtube.com/watch?v=, youtu.be/, or similar). Supports output with or without timestamps.

ComeOnOliver
content-media
media
26

nano-banana-pro

Generate and edit images using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports both text-to-image generation and image-to-image editing with configurable resolution (1K default, 2K, or 4K for high resolution). DO NOT read the image file first - use this skill directly with the --input-image parameter.

ComeOnOliver
content-media
media
26

nano-banana-2

Generate and edit images using Google's Nano Banana 2 (Gemini 3.1 Flash Image Preview) API. This skill should be used when the user asks to create or modify images, especially when they need fast iteration, explicit aspect-ratio control, or resolution control from 512px to 4K.

ComeOnOliver
content-media
media
26

tts-script-generator

Intelligently compress and rewrite documents into TTS-friendly scripts. Uses Claude AI to analyze content, compress to target duration, convert to spoken style with emotional language, and auto-segment. Perfect for video narration.

ComeOnOliver
content-media
media
26

transloadit-media-processing

Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.

ComeOnOliver
content-media
media
26

image-manipulation-image-magick

Process and manipulate images using ImageMagick. Supports resizing, format conversion, batch processing, and retrieving image metadata. Use when working with images, creating thumbnails, resizing wallpapers, or performing batch image operations.

ComeOnOliver
content-media
media
26

axiom-camera-capture

AVCaptureSession, camera preview, photo capture, video recording, RotationCoordinator, session interruptions, deferred processing, capture responsiveness, zero-shutter-lag, photoQualityPrioritization, front camera mirroring

ComeOnOliver
content-media
media
26

translate-video

Translate video subtitles to any language with native-quality refinement. Full pipeline: transcribe → translate → refine → embed RTL-safe subtitles. Use for: translate video, תרגם סרטון, video translation, foreign subtitles, Hebrew subtitles, translated captions.

aviz85
content-media
media
26

axiom-camera-capture-ref

Reference — AVCaptureSession, AVCapturePhotoSettings, AVCapturePhotoOutput, RotationCoordinator, photoQualityPrioritization, deferred processing, AVCaptureMovieFileOutput, session presets, capture device APIs

ComeOnOliver
content-media
media
26

remotion-best-practices

Create and edit Remotion videos with domain-specific knowledge. Triggers: use this skill when the user mentions "Remotion" in their request; references a Remotion project path (remotion-videos/*, contains remotion.config.ts); asks to create, animate, or edit video content in a Remotion context; or wants to build compositions, animations, or video sequences in React. Always invoke this skill BEFORE writing any Remotion code.

mwguerra
content-media
media
26

youtube-downloader

Download YouTube videos with quality presets. Use for: download youtube, yt download, video download, youtube to whatsapp, youtube mp3.

aviz85
content-media
media
25

veo3-prompter

Craft professional video prompts for Google Veo 3.1 using cinematic techniques, audio direction, and timestamp choreography. Use when generating AI videos, creating video prompts, or working with Veo 3.

leegonzales
content-media
media
25

slide-builder

Transform essay-to-speech output into complete presentations with multiple output formats. Use when converting talk tracks to slides, generating presentation decks, or creating video-ready content from spoken word material.

leegonzales
content-media