CUDA | 8GB VRAM

arnav080/qwen3.6-35b-moe-tq3-rtx4060ti-no-checkpoint-witch

Name: arnav080/qwen3.6-35b-moe-tq3-rtx4060ti-no-checkpoint-witch
Author: arnav080

Highly optimized recipe for running Qwen 3.6 35B MoE on an 8GB GPU. It disables context checkpoints to double deep-context throughput (~30 tok/s at 26k tokens of context).
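
As a rough illustration of what "disabling context checkpoints" could look like at the engine level, the sketch below assembles and prints (but does not run) a llama-server command line. It assumes the engine build in use accepts upstream llama.cpp's --ctx-checkpoints option, where 0 means no checkpoints are created per slot. The model path and the 26624-token context size are placeholders chosen for illustration, and none of this is taken from the recipe's YAML, which may set these options differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: prints a llama-server invocation, does not execute it.
# MODEL_PATH and CTX_SIZE are illustrative placeholders, not values from the recipe.
MODEL_PATH="${MODEL_PATH:-model.gguf}"
CTX_SIZE="${CTX_SIZE:-26624}"   # assumed reading of "26k" tokens of context

build_cmd() {
  # -m: model file, -c: context size in tokens,
  # --ctx-checkpoints 0: create no context checkpoints per slot
  printf '%s ' llama-server -m "$MODEL_PATH" -c "$CTX_SIZE" --ctx-checkpoints 0
}

CMD="$(build_cmd)"
echo "$CMD"
```

To try it against a real model file, set MODEL_PATH before running the script and execute the printed command yourself, adding whatever GPU offload options suit your card.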

Configuration Specifications

Base Model	huggingface:Qwen/Qwen3.6-35B-A3B-Instruct
Engine	llama.cpp
Quantization	TQ3_4S
Platform	cuda
VRAM Required	8GB minimum

No runs recorded yet

Be the first to benchmark this recipe! Run the CLI command from your terminal:

bloc run arnav080/qwen3.6-35b-moe-tq3-rtx4060ti-no-checkpoint-witch

qwen3.6-35b-moe-tq3-rtx4060ti-no-checkpoint-witch.yaml
