home/categories/research
domain cluster

Research

Scientific computing and academic tools.

8969 مهارةall categories
sorting
stars
current ordering strategy
query
all entries
refine the visible subset
scientific-computing
1

gene-database

Query NCBI Gene via E-utilities/Datasets API. Search by symbol/ID, retrieve gene info (RefSeqs, GO, locations, phenotypes), batch lookups, for gene annotation and functional analysis.

hxk622
hxk622
research
open
bioinformatics
1

flowio

Parse FCS (Flow Cytometry Standard) files v2.0-3.1. Extract events as NumPy arrays, read metadata/channels, convert to CSV/DataFrame, for flow cytometry data preprocessing.

hxk622
hxk622
research
open
scientific-computing
1

deepchem

Molecular ML with diverse featurizers and pre-built datasets. Use for property prediction (ADMET, toxicity) with traditional ML or GNNs when you want extensive featurization options and MoleculeNet benchmarks. Best for quick experiments with pre-trained models, diverse molecular representations. For graph-first PyTorch workflows use torchdrug; for benchmark datasets use pytdc.

hxk622
hxk622
research
open
scientific-computing
1

clinpgx-database

Access ClinPGx pharmacogenomics data (successor to PharmGKB). Query gene-drug interactions, CPIC guidelines, allele functions, for precision medicine and genotype-guided dosing decisions.

hxk622
hxk622
research
open
scientific-computing
1

pyopenms

Complete mass spectrometry analysis platform. Use for proteomics workflows feature detection, peptide identification, protein quantification, and complex LC-MS/MS pipelines. Supports extensive file formats and algorithms. Best for proteomics, comprehensive MS data processing. For simple spectral comparison and metabolite ID use matchms.

hxk622
hxk622
research
open
bioinformatics
1

neuropixels-analysis

Neuropixels neural recording analysis. Load SpikeGLX/OpenEphys data, preprocess, motion correction, Kilosort4 spike sorting, quality metrics, Allen/IBL curation, AI-assisted visual analysis, for Neuropixels 1.0/2.0 extracellular electrophysiology. Use when working with neural recordings, spike sorting, extracellular electrophysiology, or when the user mentions Neuropixels, SpikeGLX, Open Ephys, Kilosort, quality metrics, or unit curation.

hxk622
hxk622
research
open
scientific-computing
1

pysam

Genomic file toolkit. Read/write SAM/BAM/CRAM alignments, VCF/BCF variants, FASTA/FASTQ sequences, extract regions, calculate coverage, for NGS data processing pipelines.

hxk622
hxk622
research
open
scientific-computing
1

gtars

High-performance toolkit for genomic interval analysis in Rust with Python bindings. Use when working with genomic regions, BED files, coverage tracks, overlap detection, tokenization for ML models, or fragment analysis in computational genomics and machine learning applications.

hxk622
hxk622
research
open
bioinformatics
1

mechinterp-investigator

Orchestrate a systematic research program to investigate and meaningfully label SAE features

cesaregarza
cesaregarza
research
open
scientific-computing
1

alphafold-database

Access AlphaFold 200M+ AI-predicted protein structures. Retrieve structures by UniProt ID, download PDB/mmCIF files, analyze confidence metrics (pLDDT, PAE), for drug discovery and structural biology.

hxk622
hxk622
research
open
bioinformatics
1

lamindb

This skill should be used when working with LaminDB, an open-source data framework for biology that makes data queryable, traceable, reproducible, and FAIR. Use when managing biological datasets (scRNA-seq, spatial, flow cytometry, etc.), tracking computational workflows, curating and validating data with biological ontologies, building data lakehouses, or ensuring data lineage and reproducibility in biological research. Covers data management, annotation, ontologies (genes, cell types, diseases, tissues), schema validation, integrations with workflow managers (Nextflow, Snakemake) and MLOps platforms (W&B, MLflow), and deployment strategies.

hxk622
hxk622
research
open
scientific-computing
1

uniprot-database

Direct REST API access to UniProt. Protein searches, FASTA retrieval, ID mapping, Swiss-Prot/TrEMBL. For Python workflows with multiple databases, prefer bioservices (unified interface to 40+ services). Use this for direct HTTP/REST work or UniProt-specific control.

hxk622
hxk622
research
open
bioinformatics
1

fabric-ai

Guides pattern selection through decomposition, exploration, and diverse approach generation

rafaelcalleja
rafaelcalleja
research
open
scientific-computing
1

esm

Comprehensive toolkit for protein language models including ESM3 (generative multimodal protein design across sequence, structure, and function) and ESM C (efficient protein embeddings and representations). Use this skill when working with protein sequences, structures, or function prediction; designing novel proteins; generating protein embeddings; performing inverse folding; or conducting protein engineering tasks. Supports both local model usage and cloud-based Forge API for scalable inference.

hxk622
hxk622
research
open
scientific-computing
1

pdb-database

Access RCSB PDB for 3D protein/nucleic acid structures. Search by text/sequence/structure, download coordinates (PDB/mmCIF), retrieve metadata, for structural biology and drug discovery.

hxk622
hxk622
research
open
scientific-computing
1

molfeat

Molecular featurization for ML (100+ featurizers). ECFP, MACCS, descriptors, pretrained models (ChemBERTa), convert SMILES to features, for QSAR and molecular ML.

hxk622
hxk622
research
open
bioinformatics
1

scvi-tools

Deep generative models for single-cell omics. Use when you need probabilistic batch correction (scVI), transfer learning, differential expression with uncertainty, or multi-modal integration (TOTALVI, MultiVI). Best for advanced modeling, batch effects, multimodal data. For standard analysis pipelines use scanpy.

hxk622
hxk622
research
open
scientific-computing
1

matchms

Spectral similarity and compound identification for metabolomics. Use for comparing mass spectra, computing similarity scores (cosine, modified cosine), and identifying unknown compounds from spectral libraries. Best for metabolite identification, spectral matching, library searching. For full LC-MS/MS proteomics pipelines use pyopenms.

hxk622
hxk622
research
open
scientific-computing
1

rdkit

Modern RDKit workflows for cheminformatics, including molecular fingerprints, drawing, and property calculations. Use when working with molecules, SMILES, molecular fingerprints (Morgan, ECFP, RDKit, atom pairs, topological torsions), molecule visualization/drawing, substructure search, or chemical property calculations. This skill provides up-to-date syntax patterns as RDKit's API evolves.

mcox3406
mcox3406
research
open
bioinformatics
1

arboreto

Infer gene regulatory networks (GRNs) from gene expression data using scalable algorithms (GRNBoost2, GENIE3). Use when analyzing transcriptomics data (bulk RNA-seq, single-cell RNA-seq) to identify transcription factor-target gene relationships and regulatory interactions. Supports distributed computation for large-scale datasets.

hxk622
hxk622
research
open
scientific-computing
1

kegg-database

Direct REST API access to KEGG (academic use only). Pathway analysis, gene-pathway mapping, metabolic pathways, drug interactions, ID conversion. For Python workflows with multiple databases, prefer bioservices. Use this for direct HTTP/REST work or KEGG-specific control.

hxk622
hxk622
research
open
scientific-computing
1

cosmic-database

Access COSMIC cancer mutation database. Query somatic mutations, Cancer Gene Census, mutational signatures, gene fusions, for cancer research and precision oncology. Requires authentication.

hxk622
hxk622
research
open
Previous
Page 321 / 374
Next