1-command deployments·100% local and open-source·Community-powered AI infrastructure

The Docker For
Local AI Deployments.

Discover and deploy production-ready local AI recipes in minutes. Run identical llama.cpp environments shared by the community with one command. Any model, any hardware. Fully local and reproducible.

bloc run

What is Bloc?

Bloc Hub is an open platform for sharing and deploying reproducible local AI environments. Developers can publish complete llama.cpp recipes including models, quantization settings, runtimes, GPU optimizations, and configurations, while others can instantly run the exact same setup using a single CLI command. Instead of manually recreating local AI stacks from scattered guides and configs, Bloc Hub makes deployments portable, reproducible, and community-driven.

Build, Share, Run – instantly

Bloc Hub helps teams share reproducible local AI deployments from prototype to production. Discover community-built llama.cpp recipes, deploy identical environments instantly, and continuously improve performance across models, hardware, and runtimes.

Build

Create a reproducible recipe with your models, configs, runtimes, and optimizations.

Publish to Bloc Hub and share with the community.

Run

Anyone can run the exact same environment with one command.

Everything needed to run local AI.

Deploy community-built llama.cpp recipes, reproduce exact environments, and optimize performance from prototype to production.

Auto-Hardware Prober

Automatically checks CPU instructions, VRAM thresholds, and GPU capabilities to guide configuration options.

Zero-Dependency Setup

Runtimes, libraries, and optimized model runners are pre-packaged and managed dynamically without manual build config.

OpenAI-Compatible Endpoint

Instantly serves endpoints at localhost:8080/v1. Drop in existing chat UI SDKs, LangChain, or Autogen scripts.

Granular Layer Offloading

Control layers in VRAM vs RAM to balance speed and memory usage.

Telemetry Benchmarks

Opt-in speed profiles verify tokens/sec across specific host chips.

Multi-Format Support

Compatible with GGUF, EXL2, AWQ, and custom quantized manifests.

Offline Sovereignty

Runs 100% offline, guaranteeing no text data leaves the host environment.

Ready-to-deploy local AI setups

Explore curated local AI recipes optimized for different models, GPUs, runtimes, and workloads deployable instantly with a single command.

New recipes are added constantly by developers, researchers, and local AI enthusiasts. Browse the full collection and find the setup that fits your workflow.

Built by the local AI community.

Share your own recipes, improve existing deployments, and collaborate on open infrastructure powering the next generation of local AI.

Local AI Recipes

Browse, customize, and share optimized llama.cpp recipes configured for different host hardware. Fully version-controlled and reproducible.

Next.js Web Hub

A collaborative registry and web interface to explore community-built setups, track model downloads, and manage remote endpoints.

Developer CLI Tool

A lightweight terminal client to run, benchmark, and serve local AI environments instantly with a single offline command.

Your next local AI stack is
one command away.

Frequently Asked Questions

What is a local AI recipe?+

A recipe is a pre-configured, reproducible environment descriptor that defines the model weight quantization, llama.cpp startup parameters, system prompts, and required hardware profile (CPU, Metal, or specific CUDA setups) to run an LLM optimally.

How do I run a recipe?+

Install the Bloc CLI using npm or brew, then run 'bloc run <recipe-name>'. The CLI pulls the optimized weights and handles dependencies, starting a local API server in seconds.

Is Bloc Hub fully open-source and local?+

Yes. Bloc Hub is built entirely on open infrastructure. All models run completely on your own machine without sending data to external APIs, ensuring absolute privacy.

What hardware is supported?+

We support Apple Silicon (Metal acceleration), Nvidia GPUs (CUDA), AMD GPUs (ROCm), and standard CPU-only configurations. The registry automatically matches recipes to your hardware.

How can I share my own recipe?+

Simply define your hardware targets and run parameters in a 'bloc.yaml' file and use the 'bloc publish' command to share it with the registry.

The local infrastructure for collaborative team intelligence. Plug-and-play AI that stays in your office.

The Docker For
Local AI Deployments.

What is Bloc?

Build, Share, Run – instantly

Build

Share

Run

Everything needed to run local AI.

Auto-Hardware Prober

Zero-Dependency Setup

OpenAI-Compatible Endpoint

Granular Layer Offloading

Telemetry Benchmarks

Multi-Format Support

Offline Sovereignty

Ready-to-deploy local AI setups

Built by the local AI community.

Local AI Recipes

Next.js Web Hub

Developer CLI Tool

Your next local AI stack is
one command away.

Frequently Asked Questions

Product

Developers

Repositories

The Docker For Local AI Deployments.

What is Bloc?

Build, Share, Run – instantly

Build

Share

Run

Everything needed to run local AI.

Auto-Hardware Prober

Zero-Dependency Setup

OpenAI-Compatible Endpoint

Granular Layer Offloading

Telemetry Benchmarks

Multi-Format Support

Offline Sovereignty

Ready-to-deploy local AI setups

Built by the local AI community.

Local AI Recipes

Next.js Web Hub

Developer CLI Tool

Your next local AI stack is one command away.

Frequently Asked Questions

Product

Developers

Repositories

The Docker For
Local AI Deployments.

Your next local AI stack is
one command away.