CUDA | 16GB VRAM

arnav080/sglang-memory-stress

Name: arnav080/sglang-memory-stress
Author: arnav080

Stress-test recipe for the SGLang Docker engine that configures detailed memory, batching, and routing constraints.

Configuration Specifications

Base Model	huggingface:Qwen/Qwen2.5-7B-Instruct
Engine	llama.cpp
Quantization	Q4_K_M
Platform	cuda
VRAM Required	16GB minimum
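
To put the 16GB figure in context, the sketch below gives a rough back-of-the-envelope estimate of weight and KV-cache memory for a quantized model. It is not part of the recipe. The parameter count (about 7.6 billion), the average bits per weight for Q4_K_M (about 4.85), and the attention shape (28 layers, 4 KV heads, head dimension 128, fp16 cache) are all assumptions used for illustration. The helper names are hypothetical. Real usage also includes activations, runtime buffers, and framework overhead, so treat the result as a lower bound.

```python
# Rough VRAM estimate for a quantized model. All numeric inputs below are
# assumptions for illustration, not values taken from the recipe file.

def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_tokens: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Approximate KV-cache size in bytes: one key and one value vector
    per layer, per KV head, per cached token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Assumed: ~7.6e9 parameters at ~4.85 bits per weight (Q4_K_M average).
    weights = weight_bytes(7.6e9, 4.85)
    # Assumed: 28 layers, 4 KV heads, head_dim 128, fp16 cache,
    # 32 concurrent sequences of 4096 tokens each.
    cache = kv_cache_bytes(32 * 4096, n_layers=28, n_kv_heads=4, head_dim=128)
    print(f"weights  ~ {weights / GIB:.2f} GiB")
    print(f"kv cache ~ {cache / GIB:.2f} GiB")
    print(f"total    ~ {(weights + cache) / GIB:.2f} GiB (excludes overhead)")
```

Changing the batch size or context length in the last section shows how quickly the KV cache, rather than the weights, comes to dominate memory under a stress workload.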

No runs recorded yet

Be the first to benchmark this recipe! Run the CLI command from your terminal:

bloc run arnav080/sglang-memory-stress

sglang-memory-stress.yaml