Enable GPU acceleration for AI services that support NVIDIA GPUs. This significantly improves inference speed for local models.

Prerequisites

NVIDIA GPU

Verify you have a compatible NVIDIA GPU:
# Check GPU
lspci | grep -i nvidia

# Verify NVIDIA driver
nvidia-smi
Expected output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03   Driver Version: 535.129.03   CUDA Version: 12.2   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVIDIA RTX 4090     Off  | 00000000:01:00.0 Off |                  N/A |
| 30%   45C    P8    25W / 450W |      1MiB / 24564MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
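The manual check above can also be scripted. The following is a minimal sketch and not part of better-openclaw: the `driver_at_least` helper and the `MIN_DRIVER` threshold of 525 are assumptions chosen for illustration, so check the CUDA release notes for the minimum driver your images actually need.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of better-openclaw): succeeds when the
# installed driver version is at least the required one.
driver_at_least() {
  local installed="$1" required="$2"
  # sort -V orders version strings; if the required version sorts first
  # (or both are equal), the installed driver is new enough.
  [ "$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n1)" = "$required" ]
}

# Assumed threshold, for illustration only.
MIN_DRIVER="${MIN_DRIVER:-525}"

if command -v nvidia-smi >/dev/null 2>&1; then
  installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
  if driver_at_least "$installed" "$MIN_DRIVER"; then
    echo "Driver $installed is at least $MIN_DRIVER"
  else
    echo "Driver $installed is older than $MIN_DRIVER"
  fi
else
  echo "nvidia-smi not found: install the NVIDIA driver first"
fi
```

Override the threshold with an environment variable, for example `MIN_DRIVER=535 ./check-driver.sh` (the script name is hypothetical).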

NVIDIA Container Toolkit

Install the NVIDIA Container Toolkit to enable GPU access in Docker containers.
# Add NVIDIA package repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Configure Docker runtime
sudo nvidia-ctk runtime configure --runtime=docker

# Restart Docker
sudo systemctl restart docker

Verify installation

Test GPU access from Docker:
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
If successful, you’ll see the nvidia-smi output inside the container.

Enabling GPU passthrough

Generate a stack with GPU support:
npx create-better-openclaw
# Select "Enable GPU passthrough" when prompted
Selecting this option, or passing the --gpu flag in non-interactive mode, automatically adds GPU device reservations to services that support it.

GPU-enabled services

Required GPU

These services require a GPU to function:
| Service | Description | Memory Required |
| --- | --- | --- |
| Stable Diffusion | AI image generation | ~4 GB VRAM |

Optional GPU

These services work without GPU but benefit from acceleration:
| Service | Description | GPU Benefit |
| --- | --- | --- |
| Ollama | Local LLM inference | 5-10x faster inference |
| Whisper | Speech-to-text | 3-5x faster transcription |
| ComfyUI | Node-based AI workflows | Faster image generation |

Docker Compose configuration

When you enable --gpu, better-openclaw adds GPU device reservations to docker-compose.yml:
docker-compose.yml
services:
  ollama:
    image: ollama/ollama:0.17.0
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    # ... rest of service config

  stable-diffusion:
    image: ghcr.io/stable-diffusion-webui/stable-diffusion-webui:latest-cuda
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    # ... rest of service config

Limiting GPU access

To restrict GPU access to specific devices:
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          device_ids: ['0']  # Only GPU 0
          capabilities: [gpu]
Or limit by count:
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1  # Only 1 GPU
          capabilities: [gpu]
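To find the values to put in device_ids, list the GPUs on the host first. The sketch below assumes the usual `nvidia-smi -L` line format (`GPU 0: <name> (UUID: GPU-...)`); the `list_gpu_ids` helper is hypothetical and only reformats that output into "index uuid" pairs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn `nvidia-smi -L` lines into "index uuid" pairs.
# Expects lines such as: GPU 0: NVIDIA RTX 4090 (UUID: GPU-...)
list_gpu_ids() {
  sed -n 's/^GPU \([0-9][0-9]*\): .*(UUID: \([^)]*\))$/\1 \2/p'
}

if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -L | list_gpu_ids
else
  echo "nvidia-smi not found" >&2
fi
```

The index in the first column is what `device_ids: ['0']` refers to. The UUID is printed as well because it stays stable if enumeration order changes; whether your Compose version accepts UUIDs in device_ids is worth verifying before relying on it.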

Verifying GPU usage

Check container GPU access

# Enter Ollama container
docker compose exec ollama bash

# Check GPU visibility
nvidia-smi

Monitor GPU usage

Watch real-time GPU metrics:
# Continuous monitoring
watch -n 1 nvidia-smi

# Or with better formatting
nvidia-smi dmon -s pucvmet

Ollama GPU usage

Start a model, then check where Ollama loaded it:
docker compose exec ollama ollama run llama3.2

# In a second terminal, list the loaded models.
# The PROCESSOR column reports the GPU/CPU split (for example, 100% GPU).
docker compose exec ollama ollama ps

Performance optimization

VRAM allocation

Allocate appropriate VRAM based on model size:
| Model Size | Minimum VRAM | Recommended VRAM |
| --- | --- | --- |
| 7B params | 6 GB | 8 GB |
| 13B params | 12 GB | 16 GB |
| 34B params | 24 GB | 32 GB |
| 70B params | 48 GB | 64 GB |
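For sizes not listed in the table, a back-of-the-envelope estimate is weights (parameters × bits per weight ÷ 8) plus some headroom for the KV cache and runtime buffers. The sketch below assumes a flat 1.2 overhead factor, which is an illustrative guess rather than a measured value; the `estimate_vram_gb` helper is hypothetical.

```shell
#!/usr/bin/env bash
# Rough estimate: params (in billions) * bits / 8 gives the weight size in GB,
# multiplied by an overhead factor for KV cache and runtime buffers.
# The default 1.2 factor is an assumption, not a measured value.
estimate_vram_gb() {
  local params_b="$1" bits="$2" overhead="${3:-1.2}"
  awk -v p="$params_b" -v b="$bits" -v o="$overhead" \
    'BEGIN { printf "%.1f\n", p * b / 8 * o }'
}

estimate_vram_gb 7 4     # 7B model, 4-bit weights
estimate_vram_gb 7 16    # 7B model, FP16 weights
estimate_vram_gb 70 4    # 70B model, 4-bit weights
```

By this estimate a 7B model needs roughly 4 GB at 4-bit but almost 17 GB at FP16, so the figures in the table appear to assume quantized weights. Longer context windows increase the KV cache and push real usage above the estimate.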

Multi-GPU configuration

For systems with multiple GPUs, better-openclaw enables all GPUs by default (count: all). To distribute services across GPUs:
  1. Manually edit docker-compose.yml to assign specific device IDs
  2. Use Docker Compose profiles to start services independently
# GPU 0 for Ollama
ollama:
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            device_ids: ['0']
            capabilities: [gpu]

# GPU 1 for Stable Diffusion
stable-diffusion:
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            device_ids: ['1']
            capabilities: [gpu]
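When deciding which device ID to assign to a new service, it can help to see which GPU currently has the most free memory. This is a sketch under the assumption that `nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits` prints one `index, MiB` pair per line; the `pick_gpu` helper is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "index, free_mib" lines on stdin and print the
# index of the GPU with the most free memory.
pick_gpu() {
  awk -F', *' '
    NR == 1 || $2 + 0 > max { max = $2 + 0; idx = $1 }
    END { if (NR > 0) print idx }
  '
}

if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits | pick_gpu
else
  echo "nvidia-smi not found" >&2
fi
```

The printed index can then be used as the device_ids value for the service you are adding. It reflects free memory at the moment of the query only, so run it while the other services are under typical load.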

Compute mode

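Set the GPU compute mode to control how many processes can share a GPU. Exclusive process mode is recommended for production hosts where each GPU is dedicated to a single service; keep the default mode when several services share one GPU: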
# Set exclusive process mode (one context per GPU)
sudo nvidia-smi -c EXCLUSIVE_PROCESS

# Or default shared mode (multiple contexts)
sudo nvidia-smi -c DEFAULT

Troubleshooting

GPU not detected

Check NVIDIA driver:
nvidia-smi
# If this fails, reinstall NVIDIA drivers
Reconfigure the Container Toolkit runtime:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Check Docker daemon config:
cat /etc/docker/daemon.json
The file should contain a runtimes entry for nvidia:
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
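The daemon.json check can be automated with a short script. This sketch uses a plain `grep` for an `"nvidia":` key rather than a real JSON parser, so treat it as a quick heuristic; the `has_nvidia_runtime` helper is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given Docker daemon config has an
# "nvidia" key (heuristic text match, not a full JSON parse).
has_nvidia_runtime() {
  local config="${1:-/etc/docker/daemon.json}"
  [ -r "$config" ] && grep -Eq '"nvidia"[[:space:]]*:' "$config"
}

if has_nvidia_runtime; then
  echo "nvidia runtime is registered in /etc/docker/daemon.json"
else
  echo "nvidia runtime not found: run 'sudo nvidia-ctk runtime configure --runtime=docker'"
fi
```

Pass a different path as the first argument to check a config file stored elsewhere.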

Out of memory errors

RuntimeError: CUDA out of memory
Solutions:
  1. Use a smaller model:
    # Instead of llama3:70b, use llama3.2:3b
    ollama pull llama3.2:3b
    
  2. Use a quantized model:
    # Pull a 4-bit quantized variant
    ollama pull llama3.2:3b-instruct-q4_0
    
  3. Reduce batch size or context length in service config

Driver/CUDA version mismatch

Error: CUDA driver version is insufficient for CUDA runtime version
Fix:
# Update NVIDIA driver
sudo ubuntu-drivers autoinstall
sudo reboot

# Or install specific version
sudo apt install nvidia-driver-535

Container can’t access GPU

failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory
Fix:
# Reinstall NVIDIA Container Toolkit
sudo apt-get purge nvidia-container-toolkit
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Restart stack
docker compose down
docker compose up -d

Monitoring with Grafana

Add GPU metrics to your monitoring stack:
npx create-better-openclaw \
  --services ollama,prometheus,grafana \
  --gpu \
  --monitoring \
  --yes
Install NVIDIA DCGM Exporter for GPU metrics:
docker-compose.yml
services:
  dcgm-exporter:
    image: nvcr.io/nvidia/k8s/dcgm-exporter:3.3.0-3.2.0-ubuntu22.04
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    environment:
      DCGM_EXPORTER_LISTEN: ":9400"
    ports:
      - "9400:9400"
    networks:
      - openclaw-network
    restart: unless-stopped
Add to prometheus.yml:
scrape_configs:
  - job_name: "gpu-metrics"
    static_configs:
      - targets: ["dcgm-exporter:9400"]

Cloud GPU providers

For cloud deployments with GPU support:
| Provider | GPU Options | Notes |
| --- | --- | --- |
| AWS EC2 | P3, P4, G4, G5 instances | Use Ubuntu Deep Learning AMI |
| Google Cloud | A2, N1 with Tesla T4/V100 | Pre-installed NVIDIA drivers |
| Azure | NC, ND, NV series | Container-optimized VM images |
| Vast.ai | Various consumer/datacenter GPUs | Pre-configured with Docker + NVIDIA toolkit |
| RunPod | RTX 3090, A40, A100 | Docker and GPU support included |
When deploying to cloud VMs, the NVIDIA drivers and Container Toolkit are often pre-installed. Verify with nvidia-smi before installing.