Content & Media

CMS, document processing, and media generation.

7032 skills

media

image-tools

CLI image manipulation — convert PNG/JPG to SVG, remove watermarks, resize, crop, and edit raster images using ImageMagick and vtracer

tta-lab

content-media

media

speech-to-text

Transcribe audio files to text using OpenAI Whisper CLI — supports voice messages, audio recordings, and multiple languages.

WalterSumbon

content-media

media

AI-assisted video editing workflows for cutting, structuring, and augmenting real footage. Covers the full pipeline from raw capture through FFmpeg, Remotion, ElevenLabs, fal.ai, and final polish in Descript or CapCut. Use when the user wants to edit video, cut footage, create vlogs, or build video content.

Jamkris

content-media

media

video-chapter-nav

Video chapter navigation bar: adds chapter navigation to the top of a video and shows the current playback position in real time.

Leoyishou

content-media

media

api-asr

Volcano Engine speech recognition: converts audio/video to text, with segmented recognition for long audio.

Leoyishou

content-media

media

video-production

Orchestrate multi-clip AI video projects — style anchors, chaining patterns, frame-level QA, montage assembly. Not for video analysis, research, provider settings, or FFmpeg encoding.

Galbaz1

content-media

media

image-generation

Enhances image generation prompts with Subject-Context-Style structure, style anchors, character consistency, mcp-image workflows. Not for video generation, TTS, FFmpeg, audio, or design-to-code.

Galbaz1

content-media

media

ffmpeg-production

FFmpeg video/audio processing — conversion, scaling, compression, trimming, concatenation, AI post-processing. Not for audio ducking/voice mixing (tts-production) or Remotion rendering.

Galbaz1

content-media

media

video-downloader-skill

Downloads videos and audio from YouTube, Bilibili, Twitter, and other platforms using yt-dlp. Supports quality selection, format conversion, and audio extraction.

Leoyishou

content-media

media

image-rotator

This skill should be used when users need to rotate images by 90 degrees. It handles image rotation tasks for common formats (PNG, JPG, JPEG, GIF, BMP, TIFF) using a reliable Python script that preserves image quality and supports both clockwise and counter-clockwise rotation.

amkessler

content-media

media

youtube-downloader

Download YouTube videos with customizable quality and format options. Use this skill when the user asks to download, save, or grab YouTube videos. Supports various quality settings (best, 1080p, 720p, 480p, 360p), multiple formats (mp4, webm, mkv), and audio-only downloads as MP3.

senweaver

content-media

media

nano-banana-pro

Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro).

senweaver

content-media

media

video-frames

Extract frames or short clips from videos using ffmpeg.

senweaver

content-media

media

audio-transcriber

Transcribe audio and video files to text using OpenAI Whisper

scalyclaw

content-media

media

ffmpeg

Powerful multimedia processing tool for converting, recording, and streaming audio and video. Core Scenario: When the user needs to convert media formats, extract audio, or perform complex video editing via CLI.

x-cmd

content-media

media

digital-human

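Generates digital-human videos with Volcano Engine OmniHuman1.5: takes an IP character image plus audio as input and outputs a video of the digital human speaking.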

Leoyishou

content-media

media

show-gallery

Universal media gallery — browse images/videos from any local folder with copy-path, enlarge, and video playback. Reusable across all gen projects.

ThepExcel

content-media

media

fal-ai-media

Unified media generation via fal.ai MCP — image, video, and audio. Covers text-to-image (Nano Banana), text/image-to-video (Seedance, Kling, Veo 3), text-to-speech (CSM-1B), and video-to-audio (ThinkSound). Use when the user wants to generate images, videos, or audio with AI.

Jamkris

content-media

media

minimax-multimodal-toolkit

MiniMax multimodal model skill — use MiniMax Multi-Modal models for speech, music, video, and image. Create voice, music, video, and images with MiniMax AI: TTS (text-to-speech, voice cloning, voice design, multi-segment), music (songs, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and media processing (convert, concat, trim, extract). Use when the user mentions MiniMax, multimodal generation, or wants speech/music/video/image AI, MiniMax APIs, or FFmpeg workflows alongside MiniMax outputs.

x-cmd

content-media

media

camsnap

Capture frames or clips from RTSP/ONVIF cameras.

senweaver

content-media

media

visual-inspection

Capture and understand camera images using the robot's head camera and VLM.

syswonder

content-media

media

youtube-transcript

Get video transcripts

scalyclaw

content-media

media

image-processor

Resize, crop, rotate, convert, watermark images

scalyclaw

content-media

content-creation

audience-intelligence

Analyzes target audience demographics, psychographics, behaviors, and platform preferences to inform influencer selection and campaign strategy. Essential foundation for effective influencer marketing.

vuralserhat86

content-media
