multimodal-extraction
Given a local video or video URL, downloads the media if needed, extracts slide frames and key moments, transcribes the audio, and writes a Markdown timeline that interleaves screenshots with the transcript at the associated timestamps. Use when asked to turn a video into a multimodal notes file, slide-synced transcript, screenshot-enhanced transcript, or talk recap with images.
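The interleaving step described above can be illustrated with a minimal sketch. It is not the skill's actual implementation: the `Segment` and `Screenshot` types, the `build_timeline` function, and the output layout are hypothetical names chosen for illustration. The sketch assumes transcription and frame extraction have already produced timestamped items.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """A transcript segment (hypothetical type for this sketch)."""
    start: float  # start time in seconds
    text: str


@dataclass
class Screenshot:
    """An extracted frame (hypothetical type for this sketch)."""
    time: float  # capture time in seconds
    path: str    # image path relative to the Markdown file


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS, truncating fractional seconds."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def build_timeline(segments: List[Segment], screenshots: List[Screenshot]) -> str:
    """Merge screenshots and transcript segments into one Markdown timeline.

    Items are ordered by timestamp; when a screenshot and a segment share
    a timestamp, the screenshot is placed first so the image precedes the
    text spoken over it.
    """
    events = [(shot.time, 0, shot) for shot in screenshots]
    events += [(seg.start, 1, seg) for seg in segments]
    events.sort(key=lambda event: (event[0], event[1]))

    lines = []
    for time, kind, item in events:
        stamp = format_timestamp(time)
        if kind == 0:
            lines.append(f"![Frame at {stamp}]({item.path})")
        else:
            lines.append(f"**[{stamp}]** {item.text}")
        lines.append("")  # blank line between Markdown blocks
    return "\n".join(lines).rstrip() + "\n"
```

Calling `build_timeline` with a list of segments and a list of screenshots returns a single Markdown string that can be written to a notes file. Sorting on a `(time, kind)` key keeps the ordering deterministic when an image and a transcript line share the same timestamp.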
Installation and usage
After installation, you can use this skill by running the following command in your terminal:
skills use multimodal-extraction