geo-database
Access NCBI GEO for gene expression/genomics data. Search/download microarray and RNA-seq datasets (GSE, GSM, GPL), retrieve SOFT/Matrix files, for transcriptomics and expression analysis.
This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.
Query Ensembl genome database REST API for 250+ species. Gene lookups, sequence retrieval, variant analysis, comparative genomics, orthologs, VEP predictions, for genomic research.
Modal analysis of a membrane STL using Kirchhoff plate FEM (scipy eigensolver). Takes a binary STL + material properties JSON, constructs a 2D rectangular FEM mesh, assembles stiffness and mass matrices, extracts the first N eigenfrequencies, and reports whether any mode falls in a target frequency range. Returns artifact JSON with eigenfrequencies_hz, mode_shapes_png, and target_range_pass.
3D tetrahedral FEM modal analysis of a membrane STL. Takes a binary STL (mm units) + material properties JSON, repairs surface mesh, generates tetrahedral volume mesh via TetGen, assembles 3D stiffness/mass matrices with jax-fem, solves the generalised eigenvalue problem, and reports eigenfrequencies + mode shapes. Returns artifact JSON with eigenfrequencies_hz, eigenfrequencies_khz, modes_in_range, target_range_pass, and paths to summary PNG and CSV.
Multimodal reasoning LLM for protein function prediction. Integrates protein embeddings with biological context to generate structured reasoning traces and functional annotations.
ToolUniverse workflow — Structural Variant Analysis
Data structure for annotated matrices in single-cell analysis. Use when working with .h5ad files or integrating with the scverse ecosystem. This is the data format skill—for analysis workflows use scanpy; for probabilistic models use scvi-tools; for population-scale queries use cellxgene-census.
ToolUniverse workflow — Spatial Transcriptomics
Query the CELLxGENE Census (61M+ cells) programmatically. Use when you need expression data across tissues, diseases, or cell types from the largest curated single-cell atlas. Best for population-scale queries and reference atlas comparisons. For analyzing your own data use scanpy or scvi-tools.
PyTorch-native graph neural networks for molecules and proteins. Use when building custom GNN architectures for drug discovery, protein modeling, or knowledge graph reasoning. Best for custom model development, protein property prediction, retrosynthesis. For pre-trained models and diverse featurizers use deepchem; for benchmark datasets use pytdc.
Query STRING API for protein-protein interactions (59M proteins, 20B interactions). Network analysis, GO/KEGG enrichment, interaction discovery, 5000+ species, for systems biology.
Protein sequence, function, and annotation lookup. Query MUST be a bare gene symbol or protein name — 1 to 3 words maximum. Valid examples: 'KRAS', 'EGFR', 'BTK', 'TP53', 'Bruton tyrosine kinase', 'P01116'. If the topic is 'sotorasib KRAS G12C', the correct query is 'KRAS'. If the topic is 'imatinib BCR-ABL resistance', the correct query is 'BCR-ABL'. Strip the drug name, mutation label, and all mechanism words — pass only the protein or gene name.
Create publication-quality scientific diagrams using Nano Banana Pro AI with smart iterative refinement. Uses Gemini 3 Pro for quality review. Only regenerates if quality is below threshold for your document type. Specialized in neural network architectures, system diagrams, flowcharts, biological pathways, and complex scientific visualizations.
Biological data toolkit. Sequence analysis, alignments, phylogenetic trees, diversity metrics (alpha/beta, UniFrac), ordination (PCoA), PERMANOVA, FASTA/Newick I/O, for microbiome analysis.
Standard single-cell RNA-seq analysis pipeline. Use for QC, normalization, dimensionality reduction (PCA/UMAP/t-SNE), clustering, differential expression, and visualization. Best for exploratory scRNA-seq analysis with established workflows. For deep learning models use scvi-tools; for data format questions use anndata.
ToolUniverse workflow — Rare Disease Diagnosis
ToolUniverse workflow — Protein Structure Retrieval
ToolUniverse workflow — Multiomic Disease Characterization