CUDA, 192GB VRAM

arnav080/step-3-7-flash-nvfp4

Name: arnav080/step-3-7-flash-nvfp4
Author: arnav080

Step 3.7 Flash MoE (198B), quantized to 4-bit (NVFP4) and optimized for dual RTX 6000 GPUs

Configuration Specifications

Base Model: huggingface:stepfun-ai/Step-3.7-Flash-NVFP4
Engine: llama.cpp
Quantization: modelopt
Platform: cuda
VRAM Required: 192GB minimum
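
As a rough sanity check on the 192GB figure, the weight footprint of a 198B-parameter model in a 4-bit block-scaled format can be estimated with simple arithmetic. The sketch below is illustrative only and is not part of the recipe: it assumes every parameter is quantized and that each block of 16 4-bit values shares one 8-bit scale factor, which gives about 4.5 bits per weight. Real checkpoints usually keep some layers in higher precision, and the KV cache, activations, and runtime buffers need memory on top of the weights. The function name and its defaults are hypothetical.

```python
# Rough weight-memory estimate for a 4-bit block-scaled format.
# Assumptions (not from the recipe): every parameter is quantized, and each
# block of 16 4-bit values shares one 8-bit scale factor.

def weight_gb(params: float, bits_per_value: float = 4.0,
              block_size: int = 16, scale_bits: float = 8.0) -> float:
    """Return estimated weight storage in GB (1 GB = 1e9 bytes)."""
    # Amortize the per-block scale over the values in the block.
    effective_bits = bits_per_value + scale_bits / block_size
    return params * effective_bits / 8 / 1e9

PARAMS = 198e9   # parameter count from the recipe description
VRAM_GB = 192    # stated minimum VRAM from the recipe

weights = weight_gb(PARAMS)
headroom = VRAM_GB - weights   # left for KV cache, activations, buffers
print(f"weights ~ {weights:.1f} GB, headroom ~ {headroom:.1f} GB")
```

Under these assumptions the weights come to roughly 111 GB, leaving about 80 GB of the stated 192GB for everything else. Split evenly across two GPUs, 192GB corresponds to 96GB per card.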

No runs recorded yet

Be the first to benchmark this recipe! Run the CLI command from your terminal:

bloc run arnav080/step-3-7-flash-nvfp4

step-3-7-flash-nvfp4.yaml
