CUDA | 320GB VRAM

arnav080/step-3-7-flash-bf16-multi-gpu

Unquantized BF16 Step-3.7-Flash multimodal recipe optimized for multi-GPU server clusters (e.g., 4x A100/H100 80GB).

Configuration Specifications

Base Model: huggingface:stepfun-ai/Step-3.7-Flash
Engine: llama.cpp
Quantization: BF16
Platform: cuda
VRAM Required: 320GB minimum
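
The 320GB minimum equals the nominal combined memory of the 4x 80GB example given in the description. As a rough sketch of that arithmetic, the check below sums per-GPU memory and compares it with the stated minimum. The function name and the list-of-capacities input are hypothetical, not part of the recipe or the bloc CLI, and nominal capacity overstates what is actually usable at runtime.

```python
# Hypothetical pre-flight check; not part of the recipe or any CLI.
REQUIRED_VRAM_GB = 320  # taken from the recipe's "VRAM Required" field


def meets_vram_requirement(gpu_vram_gb, required_gb=REQUIRED_VRAM_GB):
    """Return True if the summed nominal per-GPU memory reaches the minimum."""
    return sum(gpu_vram_gb) >= required_gb


if __name__ == "__main__":
    # Four 80GB cards, as in the description's example cluster.
    print(meets_vram_requirement([80, 80, 80, 80]))
    # Two 80GB cards fall short of the stated minimum.
    print(meets_vram_requirement([80, 80]))
```

A cluster that only just reaches 320GB nominal leaves no margin for runtime overhead, so treat a passing result as necessary rather than sufficient.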

Telemetry Benchmarks

No runs recorded yet

Be the first to benchmark this recipe! Run the CLI command from your terminal:

bloc run arnav080/step-3-7-flash-bf16-multi-gpu