CUDA, 16GB VRAM
arnav080/sglang-memory-stress
Stress-test recipe for the SGLang Docker engine that configures detailed memory, batching, and routing constraints.
Configuration Specifications
| Base Model | huggingface:Qwen/Qwen2.5-7B-Instruct |
| Engine | llama.cpp |
| Quantization | Q4_K_M |
| Platform | cuda |
| VRAM Required | 16GB minimum |
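As a rough sanity check on the 16GB figure, the weight and KV-cache footprint can be estimated from the model shape. The sketch below is a back-of-the-envelope estimate, not the recipe's own calculation. The function name `estimate_vram_gib` is hypothetical, and the inputs are assumptions taken from outside this page: roughly 7.6 billion parameters, about 4.85 bits per weight for Q4_K_M, 28 layers, 4 KV heads of dimension 128, a 16-bit KV cache, and a 32,768-token context. It ignores activation buffers, CUDA context overhead, and any engine-specific memory pools, so real usage will be higher.

```python
GIB = 1024 ** 3


def estimate_vram_gib(n_params, bits_per_weight, n_layers, n_kv_heads,
                      head_dim, context_tokens, kv_bytes_per_value=2):
    """Return (weights_gib, kv_cache_gib) for a decoder-only transformer.

    All arguments are caller-supplied assumptions; nothing is read from
    the recipe file itself.
    """
    # Quantized weights: parameters times average bits per weight.
    weight_bytes = n_params * bits_per_weight / 8
    # KV cache: one key and one value vector per layer, per KV head, per token.
    kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * kv_bytes_per_value
    kv_bytes = kv_bytes_per_token * context_tokens
    return weight_bytes / GIB, kv_bytes / GIB


# Assumed shape for Qwen2.5-7B-Instruct at Q4_K_M (approximate values).
weights_gib, kv_gib = estimate_vram_gib(
    n_params=7.6e9,
    bits_per_weight=4.85,
    n_layers=28,
    n_kv_heads=4,
    head_dim=128,
    context_tokens=32768,
)
print(f"weights ~ {weights_gib:.2f} GiB, KV cache ~ {kv_gib:.2f} GiB")
```

Under these assumptions a single full-length sequence fits well within 16GB, which suggests the stated minimum leaves headroom for batching several concurrent sequences; the KV-cache term scales linearly with the total number of cached tokens across the batch.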
Telemetry Benchmarks
No runs recorded yet
Be the first to benchmark this recipe! Run the CLI command from your terminal:
bloc run arnav080/sglang-memory-stress
sglang-memory-stress.yaml