CUDA, 16GB VRAM

arnav080/qwen3.6-35b-moe-rtx5080-128k-LeftCDev

Name: arnav080/qwen3.6-35b-moe-rtx5080-128k-LeftCDev
Author: arnav080

Fully GPU-accelerated Qwen 3.6 35B MoE recipe optimized for GPUs with 16GB of VRAM, with a claimed 150 tok/s and a 128k-token context window.

Configuration Specifications

Base Model	huggingface:Qwen/Qwen3.6-35B-A3B-Instruct
Engine	llama.cpp
Quantization	Q3_K_S
Platform	cuda
VRAM Required	16GB minimum
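
As a rough sanity check on the 16GB requirement, the weight footprint of a quantized model can be estimated as parameters × bits per weight ÷ 8. The sketch below is only a back-of-the-envelope estimate: the 3.5 bits-per-weight figure for Q3_K_S is an assumed average rather than a value taken from this recipe, the parameter count is read off the "35B" in the recipe name, the 16GB figure is treated as 16 GiB, and the helper name `weight_footprint_gib` is hypothetical.

```python
def weight_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = n_params * bits_per_weight / 8  # bits -> bytes
    return total_bytes / 2**30                    # bytes -> GiB


N_PARAMS = 35e9     # nominal count, taken from "35B" in the recipe name
ASSUMED_BPW = 3.5   # ASSUMPTION: rough average bits per weight for Q3_K_S
VRAM_GIB = 16       # ASSUMPTION: the listed 16GB is treated as 16 GiB

weights = weight_footprint_gib(N_PARAMS, ASSUMED_BPW)
headroom = VRAM_GIB - weights  # what remains for KV cache and runtime buffers

print(f"weights  ~ {weights:.1f} GiB")
print(f"headroom ~ {headroom:.1f} GiB")
```

Under these assumptions the weights take up most of the card, so whatever is left has to hold the KV cache and runtime buffers. Actual file sizes vary with the tensor mix of a given GGUF, so the real model file is the better reference.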

No runs recorded yet

Be the first to benchmark this recipe! Run the CLI command from your terminal:

bloc run arnav080/qwen3.6-35b-moe-rtx5080-128k-LeftCDev

qwen3.6-35b-moe-rtx5080-128k-LeftCDev.yaml
