CUDA · 16GB VRAM
arnav080/qwen3.6-35b-moe-rtx5080-128k-LeftCDev
Fully GPU-accelerated Qwen 3.6 35B MoE recipe optimized for 16GB VRAM GPUs, with a claimed 150 tok/s and a 128k-token context window.
Configuration Specifications
| Setting | Value |
| --- | --- |
| Base Model | huggingface:Qwen/Qwen3.6-35B-A3B-Instruct |
| Engine | llama.cpp |
| Quantization | Q3_K_S |
| Platform | cuda |
| VRAM Required | 16GB minimum |
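The recipe's YAML is not shown on this page, so the following is only a rough sketch of how the settings above might map onto a llama.cpp `llama-server` invocation. The model path and the layer count are hypothetical placeholders, and the actual recipe may use different or additional flags. The script does a dry run: it prints the command rather than executing it.

```shell
#!/bin/sh
# Hypothetical local path to a Q3_K_S GGUF file; not taken from the recipe.
MODEL_PATH="./models/qwen3.6-35b-a3b-instruct-Q3_K_S.gguf"

# 128k context expressed in tokens (128 * 1024).
CTX_SIZE=$((128 * 1024))

# Assumed value: request more layers than the model has so that
# every layer is offloaded to the GPU.
GPU_LAYERS=99

# Assemble the llama-server command from the settings above.
CMD="llama-server --model $MODEL_PATH --ctx-size $CTX_SIZE --n-gpu-layers $GPU_LAYERS"

# Dry run: print the command instead of executing it.
echo "$CMD"
```

To launch the server for real, run the printed command once a matching GGUF file exists at the chosen path.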
Telemetry Benchmarks
No runs recorded yet
Be the first to benchmark this recipe! Run the CLI command from your terminal:
bloc run arnav080/qwen3.6-35b-moe-rtx5080-128k-LeftCDev
qwen3.6-35b-moe-rtx5080-128k-LeftCDev.yaml