domain cluster

Data & AI

Machine learning, LLMs, and data processing.

9743 skills
data-analysis
94

generate-report

Save investigation findings to a markdown report file. Use after completing triage, enrichment, or investigation to create a permanent record. Generates timestamped files in the ./reports/ directory.

dandye
data-ai
llm-ai
93

mcp-builder

Build Model Context Protocol (MCP) servers and tools that extend Claude's capabilities with custom functions, data sources, and integrations. Use when creating custom MCP servers, implementing tools for Claude, building integrations with external services, creating data source connectors, implementing custom functions, or extending Claude's capabilities with domain-specific tools.

korallis
data-ai
data-analysis
93

statistical-significance-annotation

Guide for annotating statistical significance (p-value asterisk notation) on comparison plots. Covers standard notation conventions (ns, *, **, ***, ****), when to annotate, matplotlib bracket+asterisk implementation, and integration with seaborn box/violin/bar plots. Use when generating publication-ready comparison figures that need significance markers to support statistical claims made in the analysis.

jaechang-hits
data-ai
data-analysis
93

dashboard

Comprehensive usage analytics and epistemic coverage dashboard across all sessions.

jongwony
data-ai
data-analysis
93

shap-model-explainability

Model interpretability using SHAP (SHapley Additive exPlanations) based on Shapley values from game theory. Covers explainer selection (Tree, Deep, Linear, Kernel, Gradient, Permutation), computing feature attributions, and visualization (waterfall, beeswarm, bar, scatter, force, heatmap). Use when explaining ML model predictions, computing feature importance, debugging model behavior, analyzing fairness/bias, or comparing models. Works with tree-based, deep learning, linear, and black-box models.

jaechang-hits
data-ai
data-analysis
93

statistical-analysis

Guided statistical analysis: test selection, assumption checking, effect sizes, power analysis, and APA reporting. Use when choosing appropriate tests for your data, verifying assumptions, calculating effect sizes, or formatting results for publication. Covers frequentist (t-test, ANOVA, chi-square, regression, correlation, survival, count models, agreement/reliability) and Bayesian alternatives. For implementing specific models use statsmodels or pymc-bayesian-modeling.

jaechang-hits
data-ai
data-analysis
93

matplotlib-scientific-plotting

Low-level Python plotting library for full customization of scientific figures. Use for publication-quality plots (line, scatter, bar, heatmap, contour, 3D), multi-panel subplot layouts, and fine-grained control over every visual element. Export to PNG/PDF/SVG. For quick statistical plots use seaborn; for interactive plots use plotly.

jaechang-hits
data-ai
data-analysis
93

nan-safe-correlation

Per-feature NaN-safe Spearman/Pearson correlation computation. Use when computing correlations across many features (genes, proteins, variants) with missing values. Covers why bulk matrix shortcuts fail with missing data, correct pairwise deletion, degenerate input filtering, and performance optimization for large datasets. For general statistical test selection use statistical-analysis; for model explainability use shap-model-explainability.

jaechang-hits
data-ai
data-analysis
93

plotly-interactive-plots

Interactive scientific visualization with Plotly. Two-layer API: plotly.express (px) for one-liner DataFrame plots and plotly.graph_objects (go) for full trace-level control. 40+ chart types with hover, zoom, pan, and animation. Exports to interactive HTML or static PNG/SVG/PDF via kaleido. Use for interactive web figures, volcano plots with gene hover info, dose-response dashboards, gene expression heatmaps, and 3D molecular visualizations. Use seaborn for statistical summaries with automatic aggregation; use matplotlib for fine-grained publication figures; use plotly for interactive or web-embedded output.

jaechang-hits
data-ai
data-analysis
93

plotly-interactive-visualization

Interactive visualization with Plotly. 40+ chart types (scatter, line, bar, heatmap, 3D, statistical, geographic) with hover, zoom, and pan. Use for exploratory analysis, dashboards, and presentations. Two APIs: Plotly Express (quick, DataFrame-oriented) and Graph Objects (fine-grained control). For static publication figures use matplotlib; for statistical grammar use seaborn.

jaechang-hits
data-ai
data-analysis
93

degenerate-input-filtering

Mandatory filtering of degenerate and uninformative data points before statistical tests. Covers single-sequence alignments, empty files, constant-value features, zero-variance inputs, and all-NaN columns. For NaN-aware correlation computation, see the nan-safe-correlation skill. For broader statistical testing guidance, see the statistical-analysis skill.

jaechang-hits
data-ai
data-analysis
93

hypothesis-generation

Structured hypothesis formulation from observations. Use when you have experimental observations or data and need to formulate testable hypotheses with predictions, propose mechanisms, and design experiments to test them. Follows scientific method framework. For open-ended ideation use scientific-brainstorming; for automated LLM-driven hypothesis testing on datasets use hypogenic.

jaechang-hits
data-ai
data-analysis
93

seaborn-statistical-plots

Statistical visualization library built on matplotlib with native pandas DataFrame support. Automatic aggregation, confidence intervals, and grouping for distribution plots (histplot, kdeplot), categorical comparisons (boxplot, violinplot, stripplot), relational plots (scatterplot, lineplot), regression plots (regplot, lmplot), matrix plots (heatmap, clustermap), and multi-variable grids (pairplot, jointplot, FacetGrid). Use seaborn for statistical summaries with minimal code; use matplotlib for fine-grained figure control; use plotly for interactive HTML output.

jaechang-hits
data-ai
data-analysis
93

seaborn-statistical-visualization

Statistical visualization built on matplotlib with pandas integration. Distribution plots (histplot, kdeplot, violinplot, boxplot), relational plots (scatterplot, lineplot), categorical comparisons, regression, correlation heatmaps. Automatic aggregation and CI. For interactive plots use plotly; for low-level control use matplotlib.

jaechang-hits
data-ai
data-analysis
93

networkx-graph-analysis

Graph and network analysis toolkit: create, manipulate, and analyze complex networks. Four graph types (Graph, DiGraph, MultiGraph, MultiDiGraph), centrality measures, shortest paths, community detection, graph generators, I/O (GraphML, GML, edge list, pandas, NumPy), visualization with matplotlib. For large-scale graphs (100K+ nodes) use igraph or graph-tool; for graph neural networks use PyG.

jaechang-hits
data-ai
data-analysis
93

gwas-database

NHGRI-EBI GWAS Catalog REST API for SNP-trait associations from published genome-wide association studies. Query studies, associations, variants, traits, genes, and summary statistics. Build polygenic risk score candidates, analyze variant pleiotropy, download summary statistics for Manhattan plots. No authentication required.

jaechang-hits
data-ai
data-analysis
93

multiqc-qc-reports

Aggregates QC outputs from 150+ bioinformatics tools into a single interactive HTML report. Scans directories for FastQC, samtools, STAR, HISAT2, Trim Galore, featureCounts, Kallisto, Salmon, Picard, and GATK logs; merges statistics across samples with interactive plots. Essential for NGS pipeline QC review. Use FastQC directly instead for single-sample initial assessment; MultiQC is for multi-sample pipeline-wide reporting.

jaechang-hits
data-ai
data-analysis
93

matlab-scientific-computing

MATLAB/GNU Octave numerical computing for matrix operations, linear algebra, differential equations, signal processing, optimization, statistics, and scientific visualization. Code examples are written in MATLAB syntax and run on both MATLAB and Octave. For Python-based scientific computing use numpy/scipy; for statistical modeling use statsmodels.

jaechang-hits
data-ai
data-analysis
93

sympy-symbolic-math

Symbolic mathematics in Python: exact algebra, calculus (derivatives, integrals, limits), equation solving, symbolic matrices, differential equations, code generation (lambdify, C/Fortran). Use when exact symbolic results are needed, not numerical approximations. For numerical computing use numpy/scipy; for statistical modeling use statsmodels.

jaechang-hits
data-ai
data-analysis
92

data-analysis

Conduct exploratory data analysis and statistical testing with test selection guidance. Use when exploring datasets, selecting statistical tests, performing power analysis, or preparing results for publication.

ChicagoHAI
data-ai
data-analysis
92

memory

Structured daily and weekly learning memory with dual graph snapshots.

MathClaw-ruc
data-ai
data-analysis
91

calculator

A simple calculator that can add, subtract, multiply, and divide numbers. Use when the user needs to perform basic arithmetic operations.

EXboys
data-ai
data-analysis
91

data-analysis

Analyze CSV/JSON data with statistics, filtering, and aggregation. Powered by pandas and numpy.

EXboys
data-ai
llm-ai
90

data-storytelling

Transform data into compelling narratives using visualization, context, and persuasive structure. Use when presenting analytics to stakeholders, creating data reports, or building executive presentations.

aiskillstore
data-ai