home/categories/llm-ai/letta-ai-skills-letta-benchmarks-trajectory-feedback-llm-inference-batching-scheduler-skill-md
llm-aidata-ai

llm-inference-batching-scheduler

Guidance for optimizing LLM inference request batching and scheduling problems. This skill applies when designing batch schedulers that minimize cost while meeting latency and padding constraints, involving trade-offs between batch count, shape selection, and padding ratios. Use when the task involves grouping requests by sequence lengths, managing shape compilation costs, or optimizing multi-objective scheduling with hard constraints.

letta-ai
maintainer
letta-ai
Обновлено 1/19/2026
Звёзды
31
Форки
5
quick start

Installation and usage

Guidance for optimizing LLM inference request batching and scheduling problems. This skill applies when designing batch schedulers that minimize cost while meeting latency and padding constraints, involving trade-offs between batch count, shape selection, and padding ratios. Use when the task involves grouping requests by sequence lengths, managing shape compilation costs, or optimizing multi-objective scheduling with hard constraints.

Установка
$ install --globalskills.sh
Использование

После установки вы можете использовать этот skill, выполнив следующую команду в терминале:

skills use llm-inference-batching-scheduler