Back to registry
CUDA48GB VRAM
arnav080/ornstein3.6-27b-dual-3090-mtp
Dual RTX 3090 setup running Ornstein3.6-27B dense model with Tensor Parallelism, 65k context, FP16 KV cache, and multi-speculative decoding (MTP + ngram-mod + ngram-map).
Configuration Specifications
| Base Model | huggingface:saber-labs/Ornstein3.6-27B-MTP-NSC-ACE-SABER |
| Engine | llama.cpp |
| Quantization | Q4_K_M |
| Platform | cuda |
| VRAM Required | 48GB minimum |
Telemetry Benchmarks
No runs recorded yet
Be the first to benchmark this recipe! Run the CLI command from your terminal:
bloc run arnav080/ornstein3.6-27b-dual-3090-mtp
ornstein3.6-27b-dual-3090-mtp.yaml
Loading workspace editor...