CUDA48GB VRAM

arnav080/ornstein3.6-27b-dual-3090-mtp

Name: arnav080/ornstein3.6-27b-dual-3090-mtp
Author: arnav080

Dual RTX 3090 setup running Ornstein3.6-27B dense model with Tensor Parallelism, 65k context, FP16 KV cache, and multi-speculative decoding (MTP + ngram-mod + ngram-map).

Configuration Specifications

Base Model	huggingface:saber-labs/Ornstein3.6-27B-MTP-NSC-ACE-SABER
Engine	llama.cpp
Quantization	Q4_K_M
Platform	cuda
VRAM Required	48GB minimum

Telemetry Benchmarks

No runs recorded yet

Be the first to benchmark this recipe! Run the CLI command from your terminal:

bloc run arnav080/ornstein3.6-27b-dual-3090-mtp

ornstein3.6-27b-dual-3090-mtp.yaml

Loading workspace editor...