LLM & AI

Large Language Models and AI agents.

4725 skills, sorted by stars
llm-ai
18.2K

harness-writing

The agent writes and improves fuzzing harnesses — the entrypoint functions that receive random data from fuzzers and route it to the system under test (SUT). It implements LLVMFuzzerTestOneInput for C/C++ with libFuzzer and AFL++ persistent mode, fuzz_target! macros for Rust with cargo-fuzz and the arbitrary crate, and go-fuzz Fuzz functions for Go. The agent structures inputs using FuzzedDataProvider, applies interleaved fuzzing patterns for multi-operation targets, handles input size validation, resets global state for determinism, and mocks blocking I/O. It applies this technique when creating new fuzz targets, improving code coverage of existing harnesses, fixing non-reproducible crashes, or building structure-aware harnesses with Protocol Buffers.

elizaOS
data-ai
llm-ai
18.2K

libafl

The agent uses LibAFL, a modular Rust fuzzing library, to build custom fuzzers with fine-grained control over observers, feedback mechanisms, mutators, schedulers, and executors. It supports drop-in libFuzzer replacement mode via libFuzzer.a, fully custom fuzzer construction with InProcessExecutor and coverage-guided feedback, multi-core fuzzing with Launcher, crash deduplication via BacktraceObserver, and dictionary-based token mutations. The agent applies LibAFL when standard fuzzers like libFuzzer or AFL++ lack needed customization — such as custom mutation strategies, novel feedback mechanisms, non-standard target architectures, or fuzzing research requiring component-level control over the fuzzing loop, corpus management, and sanitizer integration.

elizaOS
data-ai
llm-ai
18.2K

voice-call

Initiates, manages, and inspects voice calls through the Otto voice-call plugin using Twilio, Telnyx, Plivo, or mock providers. Supports starting outbound calls, continuing conversations, speaking messages, ending calls, and checking call status. Use when the user wants to make a phone call, dial a number, place a voice call, check call status, send a voice message, or speak to someone over the phone.

elizaOS
data-ai
llm-ai
17.6K

outlines

Guarantee valid JSON/XML/code structure during generation, use Pydantic models for type-safe outputs, support local models (Transformers, vLLM), and maximize inference speed with Outlines - dottxt.ai's structured generation library

davila7
data-ai
llm-ai
17.6K

nemo-curator

GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora.

davila7
data-ai
llm-ai
17.6K

pufferlib

This skill should be used when working with reinforcement learning tasks including high-performance RL training, custom environment development, vectorized parallel simulation, multi-agent systems, or integration with existing RL environments (Gymnasium, PettingZoo, Atari, Procgen, etc.). Use this skill for implementing PPO training, creating PufferEnv environments, optimizing RL performance, or developing policies with CNNs/LSTMs.

davila7
data-ai
llm-ai
17.6K

segment-anything-model

Foundation model for image segmentation with zero-shot transfer. Use when you need to segment any object in images using points, boxes, or masks as prompts, or automatically generate all object masks in an image.

davila7
data-ai
llm-ai
17.6K

instructor

Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, parse complex JSON with type safety, and stream partial results with Instructor - battle-tested structured output library

davila7
data-ai
llm-ai
17.6K

speculative-decoding

Accelerate LLM inference using speculative decoding, Medusa multiple heads, and lookahead decoding techniques. Use when optimizing inference speed (1.5-3.6× speedup), reducing latency for real-time applications, or deploying models with limited compute. Covers draft models, tree-based attention, Jacobi iteration, parallel token generation, and production deployment strategies.

davila7
data-ai
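
The draft-then-verify idea named in the speculative-decoding entry can be illustrated with a minimal sketch of the greedy variant. This is a toy under stated assumptions, not the skill's or any library's implementation: `toy_target` and `toy_draft` are hypothetical stand-ins for a large and a small model, and the per-token verification loop stands in for what a real system would do in a single batched forward pass of the target model.

```python
from typing import Callable, List

# A "model" here is any function mapping a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def speculative_step(target: Model, draft: Model, prefix: List[int], k: int) -> List[int]:
    """Run one draft-then-verify round and return the accepted tokens."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft(prefix + proposal))

    # 2. The target model checks each proposed token in order.
    accepted: List[int] = []
    for tok in proposal:
        expected = target(prefix + accepted)
        if tok != expected:
            # First mismatch: keep the target's token and stop this round.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every proposal matched, so the target contributes one extra token.
    accepted.append(target(prefix + accepted))
    return accepted


def generate(target: Model, draft: Model, prompt: List[int], n_tokens: int, k: int = 4) -> List[int]:
    """Generate n_tokens after the prompt using repeated speculative rounds."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(target, draft, out, k))
    return out[: len(prompt) + n_tokens]


def greedy_decode(model: Model, prompt: List[int], n_tokens: int) -> List[int]:
    """Plain one-token-at-a-time greedy decoding, used as a reference."""
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(model(out))
    return out


# Hypothetical toy models (assumptions for illustration only).
def toy_target(prefix: List[int]) -> int:
    return (sum(prefix) + len(prefix)) % 7


def toy_draft(prefix: List[int]) -> int:
    guess = toy_target(prefix)
    # The draft disagrees with the target whenever the prefix length is a multiple of 3.
    return guess if len(prefix) % 3 else (guess + 1) % 7
```

In this greedy setting the output matches plain greedy decoding with the target model; any speedup comes only from how many drafted tokens are accepted per round.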
llm-ai
17.6K

sglang

Fast structured generation and serving for LLMs with RadixAttention prefix caching. Use for JSON/regex outputs, constrained decoding, agentic workflows with tool calls, or when you need 5× faster inference than vLLM with prefix sharing. Powers 300,000+ GPUs at xAI, AMD, NVIDIA, and LinkedIn.

davila7
data-ai
llm-ai
17.6K

stable-baselines3

Use this skill for reinforcement learning tasks including training RL agents (PPO, SAC, DQN, TD3, DDPG, A2C, etc.), creating custom Gym environments, implementing callbacks for monitoring and control, using vectorized environments for parallel training, and integrating with deep RL workflows. This skill should be used when users request RL algorithm implementation, agent training, environment design, or RL experimentation.

davila7
data-ai
llm-ai
17.6K

unsloth

Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization

davila7
data-ai
llm-ai
17.6K

long-context

Extend context windows of transformer models using RoPE, YaRN, ALiBi, and position interpolation techniques. Use when processing long documents (32k-128k+ tokens), extending pre-trained models beyond original context limits, or implementing efficient positional encodings. Covers rotary embeddings, attention biases, interpolation methods, and extrapolation strategies for LLMs.

davila7
data-ai
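
To make the rotary-embedding and position-interpolation terms in the long-context entry concrete, here is a small pure-Python sketch. It assumes the adjacent-pair rotation convention and plain linear interpolation; the function names are illustrative and do not come from the skill or from any particular library.

```python
import math
from typing import List


def rope_angles(position: float, dim: int, base: float = 10000.0) -> List[float]:
    """Rotation angle for each of the dim // 2 coordinate pairs at a position."""
    return [position * base ** (-2 * i / dim) for i in range(dim // 2)]


def apply_rope(vec: List[float], position: float, base: float = 10000.0) -> List[float]:
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by position-dependent angles."""
    out: List[float] = []
    for i, theta in enumerate(rope_angles(position, len(vec), base)):
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def interpolated_position(position: int, train_len: int, target_len: int) -> float:
    """Linear position interpolation: squeeze [0, target_len) into [0, train_len)."""
    return position * train_len / target_len
```

For example, with a hypothetical model trained on 4096 positions and run at 8192, `apply_rope(q, interpolated_position(6000, 4096, 8192))` rotates `q` as if it sat at position 3000.0, which stays inside the range seen during training.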
llm-ai
17.6K

awq-quantization

Activation-aware weight quantization for 4-bit LLM compression with 3x speedup and minimal accuracy loss. Use when deploying large models (7B-70B) on limited GPU memory, when you need faster inference than GPTQ with better accuracy preservation, or for instruction-tuned and multimodal models. MLSys 2024 Best Paper Award winner.

davila7
data-ai
llm-ai
17.6K

implementing-llms-litgpt

Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, an educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.

davila7
data-ai
llm-ai
17.6K

llama-cpp

Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization (1.5-8 bit) for reduced memory and 4-10× speedup vs PyTorch on CPU.

davila7
data-ai
llm-ai
17.6K

constitutional-ai

Anthropic's method for training harmless AI through self-improvement. Two-phase approach - supervised learning with self-critique/revision, then RLAIF (RL from AI Feedback). Use for safety alignment, reducing harmful outputs without human labels. Powers Claude's safety system.

davila7
data-ai
llm-ai
17.6K

whisper

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

davila7
data-ai
llm-ai
17.6K

faiss

Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). Use for fast k-NN search, large-scale vector retrieval, or when you need pure similarity search without metadata. Best for high-performance applications.

davila7
data-ai
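
As a rough picture of what a Flat index does conceptually, the sketch below runs an exhaustive squared-L2 k-NN search in pure Python. It does not use the FAISS API; `flat_l2_search` is a hypothetical helper written for illustration, and FAISS performs the same kind of exhaustive comparison with heavily optimized native code.

```python
import heapq
from typing import List, Sequence, Tuple


def flat_l2_search(
    database: Sequence[Sequence[float]],
    query: Sequence[float],
    k: int,
) -> List[Tuple[float, int]]:
    """Return the k nearest vectors as (squared L2 distance, index), closest first."""
    # Compare the query against every stored vector; no index structure is involved.
    scored = (
        (sum((a - b) ** 2 for a, b in zip(vec, query)), idx)
        for idx, vec in enumerate(database)
    )
    return heapq.nsmallest(k, scored)
```

Calling `flat_l2_search(db, query, k=2)` returns the two closest vectors. Approximate indexes such as IVF or HNSW trade some recall for speed by avoiding this full scan.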
llm-ai
17.6K

nemo-guardrails

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on T4 GPU.

davila7
data-ai
llm-ai
17.6K

quantizing-models-bitsandbytes

Quantizes LLMs to 8-bit or 4-bit for 50-75% memory reduction with minimal accuracy loss. Use when GPU memory is limited, when you need to fit larger models, or when you want faster inference. Supports INT8, NF4, FP4 formats, QLoRA training, and 8-bit optimizers. Works with HuggingFace Transformers.

davila7
data-ai
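
The basic idea behind 8-bit weight quantization can be sketched with simple symmetric absmax scaling. This is a simplified illustration under stated assumptions, not the bitsandbytes algorithm or API: the real library works on tensors with per-block or per-vector scales and extra handling for outliers, and `quantize_absmax` and `dequantize` are hypothetical names.

```python
from typing import List, Tuple


def quantize_absmax(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    # Fall back to a scale of 1.0 when every weight is zero, to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]
```

Each weight is stored in one byte plus a shared scale, and the round trip `dequantize(*quantize_absmax(w))` reproduces each value to within about half a quantization step.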
llm-ai
17.6K

blip-2-vision-language

Vision-language pre-training framework bridging frozen image encoders and LLMs. Use when you need image captioning, visual question answering, image-text retrieval, or multimodal chat with state-of-the-art zero-shot performance.

davila7
data-ai
llm-ai
17.6K

nanogpt

Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for learning transformers. By Andrej Karpathy. Perfect for understanding GPT architecture from scratch. Train on Shakespeare (CPU) or OpenWebText (multi-GPU).

davila7
data-ai
llm-ai
17.6K

llamaindex

Data framework for building LLM applications with RAG. Specializes in document ingestion (300+ connectors), indexing, and querying. Features vector indices, query engines, agents, and multi-modal support. Use for document Q&A, chatbots, knowledge retrieval, or building RAG pipelines. Best for data-centric LLM applications.

davila7
data-ai