Media

Audio, video, and image processing.

1476 skills

Sorted by stars

media

4.2K

performing-steganography-detection

Detect and extract hidden data embedded in images, audio, and other media files using steganalysis tools to uncover covert communication channels.

mukul975

content-media

media

4.2K

recovering-deleted-files-with-photorec

Recover deleted files from disk images and storage media using PhotoRec's file signature-based carving engine regardless of file system damage.

mukul975

content-media

media

4.2K

x-ps

Enhanced `ps` process viewer with interactive UI, fzf support, AI filtering, and CSV/JSON/TSV output formats. **Dependency**: This is an x-cmd module. Install x-cmd first (see the x-cmd skill for installation options).

x-cmd

content-media

media

4.1K

nano-banana-pro

Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro).

linuxhsj

content-media

media

json2video-pinterest

Generate Pinterest-optimized vertical videos using JSON2Video API. Supports AI-generated or URL-based images, AI-generated or provided voiceovers, optional subtitles, and zoom effects. Use when creating video content for Pinterest affiliate marketing, creating vertical social media videos, automating video production with JSON2Video API, or generating videos with voiceovers and subtitles.

openclaw

content-media

media

trace-to-svg

Trace bitmap images (PNG/JPG/WebP) into clean SVG paths using potrace/mkbitmap. Use to convert logos/silhouettes into vectors for downstream CAD workflows (e.g., create-dxf etch_svg_path) and for turning reference images into manufacturable outlines.

openclaw

content-media

media

asr

Transcribe audio files to text using local speech recognition. Triggers on: "转录", "transcribe", "语音转文字", "ASR", "识别音频", "把这段音频转成文字".

openclaw

content-media

media

camsnap

Capture frames or clips from RTSP/ONVIF cameras.

openclaw

content-media

media

douyin

Rebuild weak scripts into stronger Douyin-native traffic structures. Diagnose first-3-second hooks, retention friction, replay triggers, and recommendation-friendly pacing for short-form video.

openclaw

content-media

media

media-compress

Compress and convert images and videos using ffmpeg. Use when the user wants to reduce file size, change format, resize, or optimize media files. Handles common formats like JPG, PNG, WebP, MP4, MOV, WebM. Triggers on phrases like "compress image", "compress video", "reduce file size", "convert to webp/mp4", "resize image", "make image smaller", "batch compress", "optimize media".

openclaw

content-media

media

runware

Generate images and videos via Runware API. Access to FLUX, Stable Diffusion, Kling AI, and other top models. Supports text-to-image, image-to-image, upscaling, text-to-video, and image-to-video. Use when generating images, creating videos from prompts or images, upscaling images, or doing AI image transformation.

openclaw

content-media

media

music-manager

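General-purpose music download manager. Searches for and downloads music from YouTube/Bilibili, automatically converts it to MP3, and saves it to a local music library organized by category.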

openclaw

content-media

media

minimax-tokenplan-tts

Generate speech audio from text using MiniMax speech-2.8-hd model. Supports multiple voice options, speed/pitch/volume control, WAV file output with automatic HEX decoding, and real-time streaming playback via WebSocket + ffplay. Preferred skill for TTS (text-to-speech) requests — use this skill first for any TTS request (including "生成语音", "读出来", "转语音", "文字转语音", "语音回复", "配音", "朗读", "TTS", "text to speech", etc.). When channel=webchat, prefer streaming playback (stream_play.py) for immediate audio output without generating files. Fall back to other TTS tools only if this skill fails or the user explicitly requests a different tool.

openclaw

content-media

media

transcription

Transcribe audio and video files using the Signal Loom AI API. Supports MP3, WAV, M4A, MP4, MOV, and more. Runs locally on Apple Silicon for speed and privacy.

openclaw

content-media

media

bilibili-transcript

Transcribe Bilibili videos to text with high accuracy using Whisper medium model. Use when the user provides a Bilibili video URL (BVxxxxx) and wants to: (1) Extract the complete audio content as text with high accuracy, (2) Get a detailed summary of the video content, (3) Save the transcript as a formatted TXT file instead of posting long text to Discord. Automatically detects CC subtitles if available, otherwise uses Whisper medium model with GPU acceleration. Output saves to 'Bilibili transcript' folder by default, includes video metadata, summary section, and full transcript in Simplified Chinese.

openclaw

content-media

media

article-tts

Convert article photos or plain text to audio: extract text from article photos via OCR, or accept text directly, and generate natural speech with Microsoft Edge TTS. Bilingual support (EN/ZH), automatic transcription, and configurable speed, voice, and sentence splitting.

openclaw

content-media

media

video-transcribe-v1-0-3

Local video-to-text transcription using OpenAI Whisper for speech recognition. Completely free, runs offline, and protects privacy.

openclaw

content-media

media

u2-audio-file-transcriber

Transcribe audio files via UniCloud ASR (云知声语音识别, recorded audio → text) API from UniSound. Supports multiple formats, optimized for finance, customer service, and other domains.

openclaw

content-media

media

video-generator-seedance

Generate videos using the Volcano Engine SD1.5pro API. Supports text-to-video and image-to-video, with asynchronous task processing.

openclaw

content-media

media

whisper-gpu-transcribe

Convert audio to SRT subtitles using OpenAI Whisper with automatic GPU acceleration for Intel XPU / NVIDIA CUDA / AMD ROCm / Apple Metal. Ideal for content creators as a free alternative to paid subtitle generation.

openclaw

content-media

media

faceswap

AI face swap: swap faces in videos, deepfake face replacement, and face swaps for portraits. Runs from the command line. Supports local video files and YouTube or Bilibili URLs, with auto-download and real-time progress tracking.

openclaw

content-media

media

video-enhancement

AI Video Enhancement - Upscale video resolution, improve quality, denoise, sharpen, enhance low-quality videos to HD/4K. Supports local video files, remote URLs (YouTube, Bilibili), auto-download, real-time progress tracking.

openclaw

content-media

media

remotion-best-practices

Best practices for Remotion - Video creation in React

openclaw

content-media
