
llm-ai, data-ai

llm-inference-batching-scheduler

Name: llm-inference-batching-scheduler
Author: letta-ai

Guidance for optimizing LLM inference request batching and scheduling problems. This skill applies when designing batch schedulers that minimize cost while meeting latency and padding constraints, involving trade-offs between batch count, shape selection, and padding ratios. Use when the task involves grouping requests by sequence lengths, managing shape compilation costs, or optimizing multi-objective scheduling with hard constraints.
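As a rough illustration of the trade-off the description names (fewer distinct shapes versus more padding), here is a minimal Python sketch. It is not taken from the skill itself: the function names `bucket_requests` and `padding_ratio`, the `granularity` and `max_batch_size` parameters, and the definition of padding ratio as padded tokens over real tokens are all assumptions made for this example.

```python
from collections import defaultdict


def bucket_requests(seq_lens, granularity=64, max_batch_size=8):
    """Group requests into batches that share a padded shape.

    Hypothetical helper: each sequence length is rounded up to the next
    multiple of `granularity`, and requests with the same rounded shape
    are batched together, at most `max_batch_size` per batch.
    """
    buckets = defaultdict(list)
    for n in seq_lens:
        shape = -(-n // granularity) * granularity  # ceiling to a multiple
        buckets[shape].append(n)

    batches = []
    for shape in sorted(buckets):
        lens = buckets[shape]
        for i in range(0, len(lens), max_batch_size):
            batches.append((shape, lens[i:i + max_batch_size]))
    return batches


def padding_ratio(batches):
    """Padding tokens divided by real tokens (assumed definition)."""
    real = sum(sum(lens) for _, lens in batches)
    padded = sum(shape * len(lens) for shape, lens in batches)
    return (padded - real) / real if real else 0.0


batches = bucket_requests([10, 60, 65, 128, 130], granularity=64, max_batch_size=2)
distinct_shapes = {shape for shape, _ in batches}
ratio = padding_ratio(batches)
```

A coarser `granularity` yields fewer distinct shapes (so fewer compilations, if each shape must be compiled) but a higher padding ratio; a finer one does the opposite. A real scheduler would also need to check any hard latency or padding limits before accepting a plan.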



Updated 1/19/2026


quick start

Installation and usage

Installation

$ install --globalskills.sh

Usage

After installing, you can use this skill by running the following command in the terminal:

skills use llm-inference-batching-scheduler