CUDA, 8GB VRAM
arnav080/qwen3.6-35b-moe-tq3-rtx4060ti-no-checkpoint-witch
Highly optimized 8GB GPU recipe for Qwen 3.6 35B MoE. It disables context checkpoints to double deep-context throughput (~30 tok/s at 26k tokens of context).
Configuration Specifications
| Field | Value |
| --- | --- |
| Base Model | huggingface:Qwen/Qwen3.6-35B-A3B-Instruct |
| Engine | llama.cpp |
| Quantization | TQ3_4S |
| Platform | cuda |
| VRAM Required | 8GB minimum |
Telemetry Benchmarks
No runs recorded yet
Be the first to benchmark this recipe! Run the CLI command from your terminal:
bloc run arnav080/qwen3.6-35b-moe-tq3-rtx4060ti-no-checkpoint-witch
qwen3.6-35b-moe-tq3-rtx4060ti-no-checkpoint-witch.yaml