llm-inference-batching-scheduler

Name: llm-inference-batching-scheduler
Author: letta-ai

Guidance for solving LLM inference request batching and scheduling problems. This skill applies when designing batch schedulers that minimize cost while meeting latency and padding constraints, which involves trade-offs between batch count, shape selection, and padding ratios. Use it when the task involves grouping requests by sequence length, managing shape compilation costs, or optimizing multi-objective scheduling under hard constraints.
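To make the trade-off between batch count, shape selection, and padding concrete, here is a minimal sketch. It is not taken from the skill itself: the function names, the `granularity` and `max_batch_size` parameters, and the definition of padding ratio as wasted tokens over padded tokens are all assumptions chosen for illustration.

```python
from collections import defaultdict


def round_up(n: int, multiple: int) -> int:
    """Round n up to the nearest multiple (hypothetical shape granularity)."""
    return ((n + multiple - 1) // multiple) * multiple


def bucket_requests(seq_lens, granularity=64, max_batch_size=8):
    """Group request indices by padded shape, then split into batches.

    Returns a list of (shape, [request indices]) tuples. Fewer distinct
    shapes means fewer compilations; coarser granularity means more padding.
    """
    by_shape = defaultdict(list)
    for idx, length in enumerate(seq_lens):
        by_shape[round_up(length, granularity)].append(idx)

    batches = []
    for shape in sorted(by_shape):
        ids = by_shape[shape]
        for start in range(0, len(ids), max_batch_size):
            batches.append((shape, ids[start:start + max_batch_size]))
    return batches


def padding_ratio(seq_lens, batches):
    """Fraction of padded tokens that are padding (assumed definition)."""
    padded = sum(shape * len(ids) for shape, ids in batches)
    real = sum(seq_lens[i] for _, ids in batches for i in ids)
    return (padded - real) / padded


if __name__ == "__main__":
    lens = [10, 60, 64, 65, 120, 128, 300]
    plan = bucket_requests(lens, granularity=64, max_batch_size=2)
    print(len(plan), "batches,", len({s for s, _ in plan}), "shapes")
    print("padding ratio:", round(padding_ratio(lens, plan), 4))
```

Raising `granularity` reduces the number of distinct shapes (and so any per-shape compilation cost) at the price of a higher padding ratio; a real scheduler would search over these settings subject to whatever hard constraints the task specifies.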

Updated 1/19/2026

quick start

Installation and usage

Installation

$ install --globalskills.sh

Usage

After installation, you can use this skill by running the following command in your terminal:

skills use llm-inference-batching-scheduler