Add -DGGML_HIP_NO_VMM=OFF to fix ROCm OOM/segfault on APUs and RDNA 3.5+ by ElSnacko · Pull Request #94 · lemonade-sdk/llamacpp-rocm

ElSnacko · 2026-05-03T07:33:31Z

Problem

All current builds report VMM: no because llama.cpp defaults GGML_HIP_NO_VMM to ON in ggml/CMakeLists.txt:

option(GGML_HIP_NO_VMM "ggml: do not try to use HIP VMM" ON)

Without VMM, memory obtained through hipMalloc is never released back to the GPU. On unified-memory APUs (gfx1103/780M) and RDNA 3.5/4 dGPUs, the allocator permanently holds flash-attn scratch and KV cache memory, so the process eventually runs out of memory and segfaults (exit 139).

This is the root cause of #87, #79, #86, #52.
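To check which state a given build is in, the VMM status can be pulled out of the startup log. This is a minimal sketch that assumes the device line contains the text "VMM: yes" or "VMM: no", as in the "VMM: no" report described above; vmm_status is a hypothetical helper name and the sample input line is illustrative, not real output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a llama.cpp log on stdin and print every
# "VMM: yes" / "VMM: no" token it contains, one per line.
# Returns non-zero if the log has no VMM status at all.
vmm_status() {
  grep -oE 'VMM: (yes|no)'
}

# Illustrative input only; the real device line varies by build and GPU.
printf '  Device 0: example GPU, VMM: no\n' | vmm_status
```

In practice the function would be fed the captured output of whichever llama.cpp binary is being tested, for example a saved log file redirected to stdin.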

Fix

Add -DGGML_HIP_NO_VMM=OFF to both the Windows and Ubuntu cmake configure steps. This enables the VMM-backed pool allocator (ggml_cuda_pool_vmm), which uses hipMemCreate/hipMemMap for on-demand allocation and hipMemRelease to return pages.

Note: The correct cmake flag is GGML_HIP_NO_VMM, not GGML_USE_VMM (which doesn't exist) or GGML_CUDA_NO_VMM (which only applies to CUDA builds). HIP VMM is controlled separately in ggml/src/ggml-hip/CMakeLists.txt and ggml/src/ggml-cuda/common.cuh.
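For a local build, the same flag can be passed at configure time and then confirmed in the CMake cache. The sketch below is a reduced example, not the workflow's actual configure step: it assumes -DGGML_HIP=ON is the only other flag needed and omits everything else the CI passes. configure_with_vmm and cached_no_vmm are hypothetical helper names, and the directory arguments are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: configure a HIP build with VMM enabled.
# $1 = llama.cpp source directory, $2 = build directory (both placeholders).
configure_with_vmm() {
  local src_dir="$1" build_dir="$2"
  cmake -S "$src_dir" -B "$build_dir" \
    -DGGML_HIP=ON \
    -DGGML_HIP_NO_VMM=OFF
}

# Hypothetical helper: print the cached GGML_HIP_NO_VMM value for a
# build directory. Returns non-zero if the option is absent from the cache.
cached_no_vmm() {
  local cache="$1/CMakeCache.txt"
  sed -n 's/^GGML_HIP_NO_VMM:[A-Z]*=//p' "$cache" | grep .
}
```

After running configure_with_vmm, calling cached_no_vmm on the build directory should print OFF; a result of ON would mean the default from ggml/CMakeLists.txt is still in effect.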

Safety

Devices that don't support VMM fall back to the legacy hipMalloc pool automatically. Support is checked at runtime via the hipDeviceAttributeVirtualMemoryManagementSupported device attribute, so behavior on unsupported hardware is unchanged.

Change

Two one-line additions to .github/workflows/build-llamacpp-rocm.yml:

Windows cmake block: -DGGML_HIP_NO_VMM=OFF ^
Ubuntu cmake block: -DGGML_HIP_NO_VMM=OFF \
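Since the change is the same flag in two places, a quick sanity check is to count the lines in the workflow file that carry it. This is a small sketch; count_vmm_flag is a hypothetical helper name, and the expected count of 2 assumes the flag appears nowhere else in the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the lines in a file that contain the literal
# flag -DGGML_HIP_NO_VMM=OFF (fixed-string match, not a regex).
count_vmm_flag() {
  grep -c -F -e '-DGGML_HIP_NO_VMM=OFF' "$1"
}
```

Run against .github/workflows/build-llamacpp-rocm.yml, the helper should print 2 once both the Windows and Ubuntu cmake blocks have been updated.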

Without VMM, hipMalloc never releases pages back to the GPU memory pool. On APUs with unified memory (e.g. gfx1103/780M) and RDNA 3.5/4 dGPUs with large VRAM, the allocator permanently holds onto scratch memory from flash-attn and KV cache, eventually OOMing and segfaulting (exit 139).

With VMM enabled, llama.cpp uses cuMemCreate/cuMemMap for on-demand allocation and cuMemRelease to return pages. Devices that don't support VMM fall back to the legacy pool automatically at runtime, so this is safe for all GPU targets.

Refs: lemonade-sdk#87, lemonade-sdk#79, lemonade-sdk#86, lemonade-sdk#52

danielholanda · 2026-05-14T23:37:41Z

@slojosic-amd Any thoughts here?

ElSnacko added 2 commits May 3, 2026 00:33

Fix VMM flag: use -DGGML_HIP_NO_VMM=OFF instead of -DGGML_USE_VMM=ON

50f64de

HIP VMM is controlled by GGML_HIP_NO_VMM (default ON = disabled). The previous GGML_USE_VMM flag does not exist in llama.cpp's cmake.

ElSnacko changed the title ~~Add -DGGML_USE_VMM=ON to fix ROCm OOM/segfault on APUs and RDNA 3.5+~~ Add -DGGML_HIP_NO_VMM=OFF to fix ROCm OOM/segfault on APUs and RDNA 3.5+ May 3, 2026

danielholanda requested a review from slojosic-amd May 14, 2026 23:37

ElSnacko wants to merge 2 commits into lemonade-sdk:main from ElSnacko:fix/add-vmm-flag
