CUDA | 16GB VRAM

arnav080/sglang-memory-stress

Stress-test recipe for the SGLang Docker engine that configures detailed memory, batching, and routing constraints.

Configuration Specifications

Base Model: huggingface:Qwen/Qwen2.5-7B-Instruct
Engine: llama.cpp
Quantization: Q4_K_M
Platform: cuda
VRAM Required: 16GB minimum

Telemetry Benchmarks

No runs recorded yet

Be the first to benchmark this recipe! Run the CLI command from your terminal:

bloc run arnav080/sglang-memory-stress
Recipe file: sglang-memory-stress.yaml