CUDA, 320GB VRAM
arnav080/step-3-7-flash-bf16-multi-gpu
Unquantized BF16 recipe for the multimodal Step-3.7-Flash model, intended for multi-GPU servers (e.g., 4x A100 or H100 80GB, 320GB of VRAM in total).
Configuration Specifications
| Setting | Value |
| --- | --- |
| Base Model | huggingface:stepfun-ai/Step-3.7-Flash |
| Engine | llama.cpp |
| Quantization | BF16 |
| Platform | cuda |
| VRAM Required | 320GB minimum |
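The 320GB minimum matches the example hardware in the description: four 80GB cards add up to exactly 320GB. As a rough illustration of that arithmetic, here is a minimal Python sketch that checks whether a set of GPUs reaches the stated minimum. The function name `meets_vram_requirement` and the constant `REQUIRED_VRAM_GB` are hypothetical helpers, not part of the `bloc` CLI or llama.cpp, and the check uses nominal card capacities only.

```python
# Hypothetical helper: not part of the bloc CLI or llama.cpp.
# It only sums nominal per-GPU capacities and compares against the
# "VRAM Required" value listed in the configuration table.

REQUIRED_VRAM_GB = 320  # minimum stated by this recipe


def meets_vram_requirement(gpu_vram_gb, required_gb=REQUIRED_VRAM_GB):
    """Return True if the summed per-GPU VRAM (in GB) reaches the minimum."""
    return sum(gpu_vram_gb) >= required_gb


if __name__ == "__main__":
    # Four 80GB cards, as in the "4x A100/H100 80GB" example.
    print(meets_vram_requirement([80, 80, 80, 80]))
    # Three 80GB cards total 240GB, which is below the minimum.
    print(meets_vram_requirement([80, 80, 80]))
```

Because four 80GB cards meet the minimum with no headroom, treat a passing check as a necessary condition rather than a guarantee that the model will load.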
Telemetry Benchmarks
No runs recorded yet
Be the first to benchmark this recipe! Run the CLI command from your terminal:
bloc run arnav080/step-3-7-flash-bf16-multi-gpu
step-3-7-flash-bf16-multi-gpu.yaml