Reject VMM allocations in hipIpcGetMemHandle #266

Closed
zyzshishui wants to merge 1 commit into ROCm:develop from zyzshishui:patch-1

zyzshishui commented Feb 20, 2026

Associated JIRA ticket number/Github issue number

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update
  • Continuous Integration

What were the changes?

Add a VMM allocation-type check to hipIpcGetMemHandle so that VMM-backed pointers are rejected.

Why are these changes needed?

hipIpcGetMemHandle currently does not validate whether the input pointer was allocated via the VMM path (hipMemAddressReserve + hipMemMap). When a VMM-allocated pointer is passed:

  1. MemObjMap::FindMemObj finds the VMM sub-buffer (added by hipMemMap) since it shares MemObjMap_ with hipMalloc allocations
  2. Buffer::ExportHandle calls hsa_amd_ipc_memory_create on the VMM pointer, which expects hsa_amd_memory_pool_allocate-backed memory
  3. This produces an invalid hipIpcMemHandle_t
  4. The receiving process's hipIpcOpenMemHandle → hsa_amd_ipc_memory_attach fails with hipErrorInvalidDevicePointer
  5. On process cleanup, the inconsistent BO pin state triggers an amdttm_bo_unpin WARNING followed by a kernel BUG (slab corruption), leaving GPU VRAM leaked until a hard reboot
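
The failure sequence above comes down to one missing guard: the lookup succeeds for a VMM mapping, and nothing stops the export. Below is a minimal, self-contained sketch of that guard. It is a toy model and not the actual runtime code. `Status`, `AllocKind`, `MemObj`, `MemObjRegistry`, and `ipc_get_mem_handle` are hypothetical names invented for illustration, and the error code returned for a VMM pointer is an assumption, since the description does not say which code the patch uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-ins for HIP error codes (assumption: not the real enum).
enum class Status { Success, InvalidValue, InvalidDevicePointer };

// How a tracked allocation was created.
enum class AllocKind {
  PoolAllocated,  // hipMalloc-style, backed by a memory-pool allocation
  VmmMapped       // hipMemAddressReserve + hipMemMap
};

struct MemObj {
  std::size_t size;
  AllocKind kind;
};

// Toy pointer -> allocation map shared by both allocation kinds,
// mirroring the shared map described in step 1.
class MemObjRegistry {
 public:
  void add(std::uintptr_t base, std::size_t size, AllocKind kind) {
    assert(size > 0);
    objs_[base] = MemObj{size, kind};
  }

  // Returns the allocation whose range contains ptr, or nullptr.
  const MemObj* find(std::uintptr_t ptr) const {
    auto it = objs_.upper_bound(ptr);
    if (it == objs_.begin()) return nullptr;
    --it;  // last allocation whose base is <= ptr
    return (ptr - it->first < it->second.size) ? &it->second : nullptr;
  }

 private:
  std::map<std::uintptr_t, MemObj> objs_;
};

// Sketch of the guarded entry point: reject VMM mappings before any export.
Status ipc_get_mem_handle(const MemObjRegistry& reg, std::uintptr_t ptr) {
  const MemObj* obj = reg.find(ptr);
  if (obj == nullptr) return Status::InvalidDevicePointer;
  if (obj->kind == AllocKind::VmmMapped) {
    return Status::InvalidValue;  // assumed error code for the rejection
  }
  return Status::Success;  // a real implementation would export the handle here
}
```

In this model, registering one `PoolAllocated` range and one `VmmMapped` range and calling `ipc_get_mem_handle` on an address inside each gives `Status::Success` for the former and `Status::InvalidValue` for the latter, so no invalid handle ever reaches the receiving process.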

Updated CHANGELOG?

  • Yes
  • No, Does not apply to this PR.

Added/Updated documentation?

  • Yes
  • No, Does not apply to this PR.

Additional Checks

  • I have added tests relevant to the introduced functionality, and the unit tests are passing locally.
  • Any dependent changes have been merged.
