llm-inference-batching-scheduler

Guidance for solving LLM inference request batching and scheduling problems. This skill applies when designing batch schedulers that minimize cost while meeting latency and padding constraints, which involves trade-offs between batch count, shape selection, and padding ratios. Use it when the task involves grouping requests by sequence length, managing shape compilation costs, or optimizing multi-objective schedules under hard constraints.
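To make the trade-off between batch count, shape selection, and padding ratio concrete, here is a minimal sketch in Python. It is not the skill's actual method: the names `Batch` and `schedule`, the candidate `shapes`, and the `max_batch_size` limit are all hypothetical, and the greedy "smallest shape that fits" rule is only one possible heuristic.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    shape: int          # padded sequence length shared by every request in the batch
    lengths: list[int]  # true sequence lengths of the requests

    @property
    def padding_ratio(self) -> float:
        # Fraction of padded token slots that carry no real tokens.
        padded = self.shape * len(self.lengths)
        return (padded - sum(self.lengths)) / padded


def schedule(lengths, shapes, max_batch_size):
    """Greedy sketch: put each request in the smallest shape that fits,
    then split each shape's requests into chunks of at most max_batch_size."""
    shapes = sorted(shapes)
    buckets = {s: [] for s in shapes}
    for n in sorted(lengths):
        shape = next((s for s in shapes if s >= n), None)
        if shape is None:
            raise ValueError(f"no candidate shape can hold a request of length {n}")
        buckets[shape].append(n)

    batches = []
    for s in shapes:
        reqs = buckets[s]
        for i in range(0, len(reqs), max_batch_size):
            batches.append(Batch(s, reqs[i:i + max_batch_size]))
    return batches


# Hypothetical workload: five requests, two compiled shapes, at most two requests per batch.
batches = schedule([10, 100, 120, 500, 30], shapes=[128, 512], max_batch_size=2)
for b in batches:
    print(b.shape, b.lengths, round(b.padding_ratio, 3))
```

Adding more candidate shapes would lower the padding ratio but increase the number of distinct shapes to compile, which is the tension the description refers to. A real scheduler would also need to check the latency constraint, which this sketch ignores.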

Maintainer: letta-ai
Updated: 1/19/2026
Stars: 31
Forks: 5
quick start

Installation and usage

Installation
$ install --globalskills.sh
Usage

After installation, run the following command in your terminal to use this skill:

skills use llm-inference-batching-scheduler