gwas-database
Query the NHGRI-EBI GWAS Catalog to retrieve SNP–trait associations, study metadata, and (where available) summary statistics; use when you need evidence for a variant, trait/disease, gene, or genomic region.
gene-database
Query the NCBI Gene database via E-utilities and the NCBI Datasets API; use it when you need to search genes by symbol/ID and retrieve annotations (RefSeq, GO, location, phenotype) for single or batch gene lists.
ensembl-database
Access the Ensembl REST API for vertebrate genomic data; use when you need gene/ID lookups, sequence retrieval, variant effect prediction (VEP), or homology/assembly coordinate mapping.
encori-api
Access the ENCORI (StarBase) database for miRNA-target, RNA-RNA, and other regulatory interaction data. Invoke when the user asks to search ENCORI or retrieve regulatory interactions.
ena-database
Access the European Nucleotide Archive (ENA) via REST APIs and FTP/Aspera to search and retrieve sequences, raw reads (FASTQ), assemblies, and metadata when you have accession IDs or need metadata-driven discovery for genomics pipelines.
bio-ontology-mapper
Map unstructured biomedical text to standardized ontologies (SNOMED CT.
neuropixels-analysis
End-to-end Neuropixels extracellular electrophysiology analysis (SpikeGLX/Open Ephys/NWB) including preprocessing, motion correction, Kilosort4 spike sorting, QC metrics, and Allen/IBL-style curation; use when processing Neuropixels recordings or when users mention Neuropixels, SpikeGLX, Open Ephys, Kilosort, quality metrics, drift/motion correction, or unit curation.
neoantigen-predictor
Predict neoantigens that may be recognized by the immune system based.
motif-logo-generator
Generate publication-quality sequence logos for DNA or protein motifs.
spatial-transcriptomics-mapper
Map spatial transcriptomics data from 10x Genomics Visium/Xenium onto.
sequence-alignment
Perform sequence alignment using the NCBI BLAST API. Supports nucleotide and protein sequence comparison against major biological databases.
scvi-tools
Deep generative models for single-cell omics; use when you need probabilistic batch correction (scVI), transfer learning, uncertainty-aware differential expression, or multimodal integration (totalVI/MultiVI).
scrna-cell-type-annotator
Auto-annotate cell clusters from single-cell RNA data using marker genes.
scikit-bio
A Python bioinformatics toolkit for sequence, phylogeny, and microbiome/community-ecology analysis; use it when you need to compute diversity/ordination/statistics from biological data and standard formats (FASTA/FASTQ/Newick/BIOM).
scanpy
Standard single-cell RNA-seq analysis pipeline covering quality control (QC), normalization, dimensionality reduction (PCA/UMAP/t-SNE), clustering, differential expression analysis, and visualization. Best suited for exploratory single-cell transcriptomics analysis using established workflows. For deep learning models, use scvi-tools; for data format issues, use anndata.
phylogenetic-tree-styler
Analyze data with `phylogenetic-tree-styler` using a reproducible workflow, explicit validation, and structured outputs for review-ready interpretation.
pathology-roi-selector
Use `pathology-roi-selector` for data analysis workflows that need structured execution, explicit assumptions, and clear output boundaries.
pathml
A full-featured computational pathology toolkit for advanced WSI analysis, including multiplexed immunofluorescence (CODEX, Vectra), nuclei segmentation, tissue graph construction, and machine learning model training on pathology data. Supports over 160 slide formats. For simple tile extraction from H&E slides, histolab may be simpler.
fiftyone-find-duplicates
Find duplicate or near-duplicate images in FiftyOne datasets using FiftyOne Brain similarity computation. Use when users want to deduplicate datasets, find similar images, cluster visually similar content, or remove redundant samples. Requires the FiftyOne MCP server with the @voxel51/brain plugin installed.
bn-fit-modify
Guide for Bayesian Network tasks involving structure learning, parameter fitting, intervention, and sampling. This skill should be used when working with pgmpy or similar libraries to recover DAG structures from data, fit conditional probability distributions, perform causal interventions (do-calculus), or sample from modified networks.