youtube-playlist-to-mp3-downloader-with-metadata
Generates a Python script to download YouTube playlists as MP3 files, including video thumbnails and artist metadata tags.
Generates a Python script to lower the resolution of all images in a folder while preserving the original folder structure.
Cleans video or audio transcripts by removing timestamp markers while preserving all spoken content.
Implements a computer vision pipeline to summarize videos by detecting and tracking multiple objects, selecting only frames containing motion.
Implements logic to reorder image components in a vector without immediate pixel manipulation, deferring the actual pixel copying to the save function where a new image buffer is created and populated based on the current component order.
Presentation creation, editing, and analysis. Use when Claude needs to work with presentations (.pptx files) for: (1) creating new presentations, (2) modifying or editing content, (3) working with layouts, (4) adding comments or speaker notes, or any other presentation tasks.
Batch content grabber — bulk fetch bookmarks, user tweets, search results, author notes, wiki pages, and more from X/Twitter, Xiaohongshu, WeChat, YouTube, Feishu. Use when user wants to batch/bulk fetch, search keywords, or grab all posts from an account.
Video & Podcast Digest — send a video/podcast link, get full transcript + structured summary. Supports YouTube, Bilibili, X/Twitter video, Xiaoyuzhou, Apple Podcasts, and direct audio/video links. Uses yt-dlp for subtitles and Groq Whisper for transcription.
Toolkit for creating animated GIFs optimized for Slack, with validators for size constraints and composable animation primitives. This skill applies when users request animated GIFs or emoji animations for Slack from descriptions like "make me a GIF for Slack of X doing Y".
Generate web assets including favicons, app icons (PWA), and social media meta images (Open Graph) for Facebook, Twitter, WhatsApp, and LinkedIn. Use when users need icons, favicons, social sharing images, or Open Graph images from logos or text slogans. Handles image resizing, text-to-image generation, and provides proper HTML meta tags.
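Generates videos from text descriptions; a workflow skill for generating images and videos. Depends on skills: byted-web-search, image-generate, video-generate. Note: this workflow has no executable script; it is only a descriptive document.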
Generate videos using Seedance models. Invoke when user wants to create videos from text prompts, images, or reference materials.
Generate high-quality images from text prompts using Volcano Engine Seedream models. Supports multiple artistic styles and aspect ratios. Use this skill when users want to create images from text descriptions, generate artwork in various styles, create visual content for creative projects, or need AI-powered image generation capabilities.
Generate music using Volcengine Imagination API. Supports vocal songs, instrumental BGM, and lyrics generation. Use when the user wants to create songs, background music, soundtracks, write lyrics, or mentions "music generation", "BGM", or "songwriting".
Provides image processing capabilities for objects in Bytedance TOS using the official SDK. Supports getting image info, format conversion, resizing, and watermarking. Use when you need to analyze or transform images stored in TOS.
Generates videos using the video_generate.py script; requires a file name and a prompt, with an optional first-frame image (URL or local path).
Uses Volcengine TOS SDK object processing (e.g., `video/info`, `video/snapshot`) to fetch video metadata and extract single or multiple frame snapshots from videos stored in Bytedance TOS. Use when the user needs video info/metadata, thumbnail or frame capture, snapshot extraction, or mentions TOS video processing.
Video content understanding operator (las_vlm_video) via Doubao models. Use this skill when the user needs to: (1) analyze or describe video content with natural-language prompts, (2) ask questions about what happens in a video (objects, actions, scenes), or (3) summarize a video, extract key events, or generate captions. Supports public/intranet-accessible video URLs and returns model responses plus compression metadata. Requires LAS_API_KEY for authentication.
Volcengine AI MediaKit audio and video processing skill. It is triggered when users need to process or edit audio/video content. After processing, it automatically checks task status and returns playback links for the generated outputs. Core capabilities are grouped into five categories: 1) Video processing: multi-clip stitching, clip trimming, frame flipping, video speed adjustment, audio speed adjustment, image-to-video generation, audio-video composition, audio track extraction, and audio mixing; 2) Audio processing: vocal/accompaniment separation and audio noise reduction; 3) Video enhancement: comprehensive quality restoration, AI super-resolution, and intelligent frame interpolation; 4) AI content analysis: ASR speech-to-text, OCR text extraction, subtitle removal, subtitle embedding, intelligent scene slicing, portrait matting, green screen matting, media info query, and highlight extraction; 5) AI content generation: comic style transfer, AI video translation, AI drama recap narration, and AI drama sc
Video resolution resize operator (las_video_resize). Use this skill when the user needs to: (1) resize video resolution into a target range (min/max width/height), (2) preserve aspect ratio with increase/decrease/disable strategies, or (3) control encoding quality options for GPU NVENC (cq/rc). Supports input from public URL, intranet URL, or TOS and outputs to TOS. If the user provides local video files or requires local outputs, use byted-tosfile-access to upload/download as a TOS bridge. Requires LAS_API_KEY for authentication.
Audio extract and split operator. Use this skill when the user needs to: (1) extract audio from video files (mp4, wmv, etc.), (2) split audio into segments of a specific duration, or (3) convert audio format (wav, mp3, flac). Supports input from TOS and output to TOS. Requires LAS_API_KEY for authentication.
Image resampling operator for downsampling images. Use this skill when the user needs to: (1) resize/downsample images to a target size, (2) change image DPI settings, or (3) convert between JPG and PNG formats. Supports 4 interpolation methods: nearest, bilinear, bicubic, lanczos. Supports input from URL, TOS, base64, or binary. Requires LAS_API_KEY for authentication.
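
Of the entries above, the one on reordering image components describes a mechanism concrete enough to sketch: reordering touches only the component vector, and pixels are copied once, when saving. The following is a minimal illustration of that idea, not the skill's actual implementation. The names `Component`, `LazyImage`, `move`, and `save` are hypothetical, and pixel data is modeled as plain bytes concatenated in component order, which is an assumption about the layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """Hypothetical image component: a name plus its raw pixel bytes."""
    name: str
    pixels: bytes


@dataclass
class LazyImage:
    """Hypothetical image whose component order can change without copying pixels."""
    components: List[Component] = field(default_factory=list)

    def move(self, src: int, dst: int) -> None:
        # Reordering only rearranges references in the vector; no pixel data is touched.
        self.components.insert(dst, self.components.pop(src))

    def save(self) -> bytes:
        # The pixel copy is deferred to here: build a new buffer and
        # populate it according to the current component order.
        buffer = bytearray()
        for component in self.components:
            buffer += component.pixels
        return bytes(buffer)


image = LazyImage([
    Component("a", b"\x01\x01"),
    Component("b", b"\x02\x02"),
    Component("c", b"\x03\x03"),
])
image.move(2, 0)      # order is now c, a, b; still no pixel copying
saved = image.save()  # pixels are copied once, in the final order
```

Because `move` never touches pixel data, any number of reorders costs only list operations, and the single buffer copy happens in `save`.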