Media

Audio, video, and image processing.

1476 skills

media

304

c

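Provides C# code for reading a video file's duration and formatting it as a fixed-width string (for example, a 7-digit string). Supports several libraries (FFmpeg, DirectShow.NET, WMPLib, FFmpeg.AutoGen) or command-line invocation.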

ECNU-ICALK

content-media

media

304

python-ffmpeg

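Uses Python's subprocess module to call ffmpeg to merge an image sequence into 3x3 grids and to split the grids back into individual images, with user-defined input and output paths.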

ECNU-ICALK

content-media

media

304

xy-dng-ng-dng-x-l-nh-opencv

Generates source code for an image-processing application (GUI or web) using OpenCV with basic functions: reading an image, converting to grayscale, cropping, rotating/flipping, smoothing, and edge detection.

ECNU-ICALK

content-media

media

304

configuracin-y-edicin-de-video-vertical-en-adobe-premiere-cs6

Specialized guide to configuring vertical sequences, batch-correcting clip orientation, applying dynamic effects, and exporting videos optimized for Instagram using only Adobe Premiere CS6 features.

ECNU-ICALK

content-media

media

304

video-anomaly-detection-with-videomae

A Python program to detect anomalies in videos using the VideoMAEForPreTraining model. It processes videos by dividing them into 16-frame clips, extracts embeddings using an unmasked boolean mask, and compares them against a normal behavior profile using Mean Squared Error (MSE).

ECNU-ICALK

content-media

media

304

video-segment-extraction-audio-loudness

Python script for extracting video segments based on audio amplitude peaks, letting the user choose whether the peak falls at the start (1/3), the middle (1/2), the end (2/3), or a random position within the extracted segment.

ECNU-ICALK

content-media
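
The peak-placement idea in the entry above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the skill's actual code: the function name `segment_window`, its parameters, and the clamping behaviour at the video boundaries are all hypothetical. It treats 1/3, 1/2, and 2/3 as the fraction of the segment that elapses before the peak.

```python
import random

# Fraction of the segment that should elapse before the peak, following the
# entry's "start (1/3), middle (1/2), end (2/3)" wording.
PEAK_FRACTIONS = {"start": 1 / 3, "middle": 1 / 2, "end": 2 / 3}


def segment_window(peak_time, segment_length, placement, video_duration, rng=random):
    """Return (start, end) in seconds for a clip that contains the peak.

    Hypothetical helper: names and clamping rules are assumptions.
    """
    # A clip cannot be longer than the video itself.
    segment_length = min(segment_length, video_duration)
    if placement == "random":
        fraction = rng.random()
    else:
        fraction = PEAK_FRACTIONS[placement]
    start = peak_time - fraction * segment_length
    # Shift the window back inside [0, video_duration] if it overruns.
    start = max(0.0, min(start, video_duration - segment_length))
    return start, start + segment_length


print(segment_window(30.0, 12.0, "middle", 100.0))
```

The resulting start time and length could then be handed to whatever extraction tool is in use.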

media

304

musyq-fvc-hdf5

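Uses Python to process MuSyQ FVC data in HDF5 format: inspecting the data structure, extracting FVC and the red, blue, and near-infrared bands, converting to TIFF, and visualizing the images.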

ECNU-ICALK

content-media

media

304

extraction-video-par-intensite-audio

Generates a complete Python script that analyzes the audio of videos (RMS), identifies the loudest moments, sorts the results, and extracts the corresponding segments via ffmpeg, with precise positioning of the audio peak (1/3, 1/2, 2/3).

ECNU-ICALK

content-media

media

304

python-moviepy-subtitle-overlay-with-extended-duration

Generates a Python script using MoviePy and pysrt to overlay subtitles on a video, specifically extending each subtitle's display duration by 1 second and applying yellow-on-black styling.

ECNU-ICALK

content-media

media

304

opencv-video-nine-grid-with-style

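Uses OpenCV to extract 9 evenly spaced frames from a video, merge them into a 3x3 grid image, and add a border and a text label of a specified size to each frame.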

ECNU-ICALK

content-media

media

304

load-inverted-grayscale-image-with-pil

Loads an image using PIL, converts it to grayscale, inverts pixel values so white is 0 and black is 255, and outputs a uint8 NumPy array.

ECNU-ICALK

content-media

media

304

youtube-audio-download-and-split-script-generator

Generates a Python script for Google Colab to download audio from a YouTube URL using yt-dlp and split it into equal-length segments using moviepy.

ECNU-ICALK

content-media

media

304

vad

Generates Python code for visualizing an audio signal with dynamic VAD boundaries or static energy thresholds, prioritizing the time axis and titles that include the file name.

ECNU-ICALK

content-media

media

304

python-ffmpeg-video-creation-from-image-sequence

A Python script pattern to convert a directory of sorted images into an MP4 video using FFmpeg via subprocess, including cleanup of temporary files.

ECNU-ICALK

content-media

media

304

twitter-media-extraction-and-telegram-sender

Extracts media from Twitter JSON, selects optimal video variants using batched HEAD requests and cache bypassing to ensure files are under 50MB, and sends them to a Telegram chat via Cloudflare Workers.

ECNU-ICALK

content-media

media

304

native-audio-dsp-filter-implementation

Implement native Node.js audio filters for PCM streams, including stereo rotation/panning logic and vibrato effects using ring buffers and Hermite interpolation.

ECNU-ICALK

content-media

media

304

audio-mel-spectrogram-preprocessing-with-min-width-trimming

Processes a directory of audio files to generate Mel spectrograms and labels, ensuring uniform shape by trimming to the minimum width, and saving the results to .npy files.

ECNU-ICALK

content-media

media

304

opencv-image-processing-with-library-constraints

Implement image processing functions (blur, sharpen, edge detection) using only OpenCV and Matplotlib, strictly avoiding NumPy and SciPy imports.

ECNU-ICALK

content-media

media

304

python-image-dataset-loader-and-caption-filter

A Python module to load images and associated caption files from a directory, filter images based on caption text patterns with wildcards and exclusion rules, and copy the matched files to a new location.

ECNU-ICALK

content-media

media

304

paired-image-text-dataset-loader

Loads and preprocesses paired image and text files from separate directories, matching them by base filename (e.g., screen_13.png with html_13.html) for machine learning training.

ECNU-ICALK

content-media
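
A minimal sketch of the filename matching described in the entry above, assuming files are named `<prefix>_<number>.<extension>` as in the `screen_13.png` / `html_13.html` example. The helper names `file_index` and `pair_by_index` are hypothetical, and the sketch works on lists of filenames only; loading and preprocessing the files is left out.

```python
import re

# Assumed naming scheme: "<prefix>_<number>.<extension>", as in the entry's example.
INDEX_PATTERN = re.compile(r"_(\d+)\.[^.]+$")


def file_index(name):
    """Return the numeric index embedded in a filename, or None if absent."""
    match = INDEX_PATTERN.search(name)
    return int(match.group(1)) if match else None


def pair_by_index(image_names, text_names):
    """Pair image and text filenames that share the same numeric index."""
    texts = {}
    for name in text_names:
        index = file_index(name)
        if index is not None:
            texts[index] = name
    pairs = []
    for name in image_names:
        index = file_index(name)
        if index in texts:
            pairs.append((index, name, texts[index]))
    # Sort numerically so that index 2 comes before index 13.
    pairs.sort()
    return [(image, text) for _, image, text in pairs]


print(pair_by_index(["screen_13.png", "screen_2.png"], ["html_2.html", "html_13.html"]))
```

Files without a counterpart in the other list are silently skipped in this sketch; a real loader might prefer to report them.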

media

304

audio-filename-parsing-for-zero-prefixed-numbers

Parses text strings to map to audio files. If text starts with '0' followed by a digit (e.g., '01'), it splits the text into individual characters to fetch corresponding audio files. Otherwise, it uses the whole text as the filename.

ECNU-ICALK

content-media
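
The rule in the entry above might look like this in Python. It is a sketch only: the function name `text_to_audio_files` and the `.wav` extension are assumptions for illustration, since the entry does not state the audio format.

```python
def text_to_audio_files(text, extension=".wav"):
    """Map a text string to the audio filenames that should be played.

    Hypothetical helper; the ".wav" default is an assumption.
    """
    # "0" followed by a digit (e.g. "01"): one audio file per character.
    if len(text) >= 2 and text[0] == "0" and text[1].isdigit():
        return [char + extension for char in text]
    # Otherwise the whole text is used as a single filename.
    return [text + extension]


print(text_to_audio_files("01"))
print(text_to_audio_files("12"))
```

Under these assumptions, `"01"` maps to one file per character, while `"12"` maps to a single file named after the whole string.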

media

304

audio-dataset-loading-and-stft-feature-extraction

Load audio files from a directory, parse labels from filenames, generate random VAD segments, extract STFT features (mean along axis 1, converted to dB), and split the dataset into train/test sets.

ECNU-ICALK

content-media

media

304

fix-native-audio-filter-stream-output

Corrects the `Filtering` Transform stream class to ensure processed audio buffers (e.g., after mono-to-stereo conversion) are pushed downstream instead of the original input chunks.

ECNU-ICALK

content-media

media

304

automated-audio-recognition-and-tagging-workflow

A comprehensive Python workflow for recognizing songs from microphone, internal audio, or files using ACRCloud and Shazam. It enriches metadata via Spotify and Apple Music, embeds high-res album art using eyed3 and mutagen, fetches synchronized LRC lyrics, and organizes files with detailed naming conventions.

ECNU-ICALK

content-media
