Data Engineering

ETL pipelines and big data infrastructure.

1541 skills
data-engineering
307

dotfile-brainstorm

Use when you want to build a project but don't have a spec yet and need to brainstorm the idea into a design doc structured for pipeline DOT generation. Use when starting from a vague idea, project concept, or feature request that needs to become a headless autonomous build pipeline.

harperreed
data-ai
data-engineering
305

hipdnn-codegen

Generate hipDNN operation boilerplate from a FlatBuffer schema. Use when the user wants to add a new operation type to hipDNN, or generate descriptor/packer/unpacker code.

ROCm
data-ai
data-engineering
304

eda-iterative-place-and-route-flow

Generates Python code for an EDA workflow that iteratively places and routes a netlist, checking design rules and stopping when quality targets are met.

ECNU-ICALK
data-ai
data-engineering
304

kohlcpcapca

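Dimensionality-reduction analysis for financial candlestick (OHLC) data: Python implementation steps covering data preprocessing (rate-of-change calculation, standardization), standard PCA, and pseudo-PCA based on randomized SVD.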

ECNU-ICALK
data-ai
data-engineering
304

pandas

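When performing grouped computations on a Pandas DataFrame with a MultiIndex, ensures the result retains the original full index structure rather than only the grouping keys.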

ECNU-ICALK
data-ai
data-engineering
304

python-pandas

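Converts column data from a source DataFrame (df_sub) into a long-format DataFrame with fields such as sub_id, trial, start_time, and end_time, according to specified row-slicing rules and column mappings.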

ECNU-ICALK
data-ai
data-engineering
304

flaskmatplotlib

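Builds a real-time data visualization application with Flask, Matplotlib, APScheduler, and Flask-Caching. Images are refreshed periodically in the background and held in an in-memory cache, never saved to local disk, and served to the front end over an HTTP endpoint.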

ECNU-ICALK
data-ai
data-engineering
304

dataframemysql

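A skill for syncing a Pandas DataFrame to a MySQL database, including normalized comparison logic for JSON fields, numeric types (int/float), and string ordering (e.g., effect_desc), plus conditional exclusion rules based on the type field.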

ECNU-ICALK
data-ai
data-engineering
304

tf-agents-lstm-multi-stock-training

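Configures a TF-Agents DQN agent to use a custom LSTM network for time-series data from multiple stocks, covering environment batching, dimension adaptation, network-initialization pitfalls, and a complete training and evaluation loop. Compatible with TensorFlow 2.10.1.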

ECNU-ICALK
data-ai
data-engineering
304

vit

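Modifies a multimodal Vision Transformer model into a dual-branch architecture that processes RGB and Event data separately (concatenating the template and search region within each branch) and outputs independent features to support dual-head processing.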

ECNU-ICALK
data-ai
data-engineering
304

tensorflow-java-savedmodel

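Loads a model in SavedModel format with TensorFlow Java API 0.4.0, processes three-dimensional input data, and runs prediction. Covers the complete flow: data type conversion, Tensor initialization, resource management, and matching input/output node names.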

ECNU-ICALK
data-ai
data-engineering
304

calculate-aligned-partition-boundaries

Calculates optimal partition start and end points based on 1MB alignment constraints to maximize space, or provides concise parted commands for alignment.

ECNU-ICALK
data-ai
data-engineering
304

clean-database-output-for-excel

Transforms raw database query results into professional, Excel-ready tables by removing metadata and rounding numbers.

ECNU-ICALK
data-ai
data-engineering
304

generate-aspnet-mvc-entity-framework-models-from-schema

Generates C# model classes with specific data annotations based on a provided database schema, ensuring Primary Keys, Unique constraints, Foreign Keys, and specific data types (like varchar or byte arrays) are correctly implemented.

ECNU-ICALK
data-ai
data-engineering
304

generate-spring-boot-jdbc-multiple-update-stack

Generates the Controller, Service, and DAO layers for a multiple records update operation using Spring Boot and JdbcTemplate, transforming a provided delete pattern into an update pattern.

ECNU-ICALK
data-ai
data-engineering
304

dax-date-intersection-count-with-rankx

Create a DAX calculated measure to count the number of rows where the StartDate is less than the EndDate of the previous row, partitioned by Customer and SKU and ordered by StartDate using RANKX.

ECNU-ICALK
data-ai
data-engineering
304

excel-vba-double-click-task-duplication-macro

Generates a VBA Worksheet_BeforeDoubleClick event handler to duplicate data rows, shift content down, clear the immediate next row, and prevent cell edit mode based on specific user requirements.

ECNU-ICALK
data-ai
data-engineering
304

item-based-collaborative-filtering-movie-recommender

Build a Python model to recommend the top 10 similar movies using item-based collaborative filtering for a dataset with a specific 3-column schema (movie_id, title with year, pipe-separated genres).

ECNU-ICALK
data-ai
data-engineering
304

nlp-text-analysis-and-tf-idf-calculation

Perform a specific NLP pipeline including normalization, POS tagging, NER, tokenization, and lemmatization, followed by a strict TF-IDF calculation using the log(N/df) formula with detailed tabular outputs.

ECNU-ICALK
data-ai
data-engineering
304

generate-large-csv-on-ec2-and-upload-to-s3

Provides a workflow and Python scripts to generate massive CSV files (e.g., billions of rows) on an AWS EC2 instance and upload them to an S3 bucket.

ECNU-ICALK
data-ai
data-engineering
304

generate-llm-golden-queries-dict

Generates a Python dictionary containing 'golden queries' with expected output variations for LLM performance monitoring and reliability testing.

ECNU-ICALK
data-ai
data-engineering
304

matrioshka-brain-terminal-simulator

Simulates a command-line interface for a Matrioshka Brain integrating Boltzmann Brains via quantum entanglement. Responds to network commands and database queries with immersive, sci-fi appropriate outputs without breaking the fourth wall.

ECNU-ICALK
data-ai
data-engineering
304

redshift-osl-multi-color-pulsing-shader

Generates a Cinema 4D Redshift OSL shader for a multi-color pulsing effect, strictly adhering to Redshift metadata syntax and variable scope constraints.

ECNU-ICALK
data-ai