llm-inference-batching-scheduler

Name: llm-inference-batching-scheduler
Author: letta-ai

Guidance for optimizing LLM inference request batching and scheduling. This skill applies when designing batch schedulers that minimize cost while meeting latency and padding constraints, which means trading off batch count, shape selection, and padding ratio against one another. Use it when the task involves grouping requests by sequence length, managing shape compilation costs, or optimizing a multi-objective schedule under hard constraints.
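
To make the trade-off concrete, here is a minimal sketch of grouping requests by sequence length into a fixed set of padded shapes and measuring the resulting padding. It is an illustration only and not the skill's own method: the names `Request`, `assign_shapes`, `make_batches`, and `padding_ratio`, the candidate shape list, and the definition of padding ratio as padded tokens divided by real tokens are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request record: an id and its sequence length in tokens."""
    req_id: str
    seq_len: int


def assign_shapes(requests, shapes):
    """Map each request to the smallest candidate shape that fits it."""
    ordered = sorted(shapes)
    buckets = defaultdict(list)
    for req in requests:
        fit = next((s for s in ordered if s >= req.seq_len), None)
        if fit is None:
            raise ValueError(f"no shape fits request {req.req_id}")
        buckets[fit].append(req)
    return dict(buckets)


def make_batches(buckets, max_batch_size):
    """Split each shape bucket into batches of at most max_batch_size."""
    batches = []
    for shape in sorted(buckets):
        reqs = buckets[shape]
        for start in range(0, len(reqs), max_batch_size):
            batches.append((shape, reqs[start:start + max_batch_size]))
    return batches


def padding_ratio(batches):
    """Padding tokens divided by real tokens (an assumed definition)."""
    real = sum(r.seq_len for _, reqs in batches for r in reqs)
    padded = sum(shape * len(reqs) for shape, reqs in batches)
    return (padded - real) / real


if __name__ == "__main__":
    lengths = [100, 120, 500, 510, 1000]
    requests = [Request(f"r{i}", n) for i, n in enumerate(lengths)]
    batches = make_batches(assign_shapes(requests, [128, 512, 1024]), 2)
    print(len(batches), round(padding_ratio(batches), 4))
```

Fewer candidate shapes reduce compilation cost but raise padding, while more shapes do the opposite; a real scheduler would search over the shape set and batch size against whatever cost and latency limits the task specifies.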

Maintainer: letta-ai

Updated: 1/19/2026

Quick start

Installation and usage

Install

$ install --globalskills.sh

Usage

After installation, you can use this skill by running the following command in your terminal:

skills use llm-inference-batching-scheduler