Unify cuDF operators with a common base class architecture #16885

@coreylammie

Description

I would like to open a PR that unifies the cuDF operators under a common base class.

Motivation

Currently, cuDF operators (CudfTopN, CudfLimit, CudfOrderBy, etc.) do not share a common base class. Each operator extends exec::Operator and NvtxHelper directly, which leads to:

  • Code duplication (across operators)
  • Inconsistent debug logging patterns (some operators log, others don't)
  • No common type for passing operators around, so code cannot reach NVTX/cuDF features without casting to a concrete operator type
  • Difficult to enforce consistent behavior across operators

Proposed Solution

Introduce a unified base class architecture with two complementary classes:

  1. CudfOperator - Lightweight base for user-defined GPU operators:
  2. CudfOperatorBase - Comprehensive base for built-in operators:
  • Extends both CudfOperator and exec::Operator.
  • Implements the template method pattern: overrides addInput(), getOutput(), noMoreInput(), and close() so that each calls the corresponding do* method (doAddInput(), doGetOutput(), doNoMoreInput(), doClose()).
  • Each wrapper method adds VELOX_NVTX_OPERATOR_FUNC_RANGE() profiling and VLOG(2) debug logging (guarded by CudfConfig::getInstance().debugEnabled).
  • Derived operators override only the do* methods they need (all have default implementations).
  • Enforces consistent NVTX profiling and debug logging across all built-in cuDF operators.
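
The wrapper/do* split described above could look roughly like the sketch below. It is not the proposed Velox code. Operator is a cut-down stand-in for exec::Operator, Batch is a placeholder for the real input/output vector type, the NVTX range and VLOG(2) call are replaced by an in-memory trace, and the debug flag is passed to the constructor instead of being read from CudfConfig. DemoLimit and every member name other than the four wrapper methods and their do* counterparts are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Placeholder for the real batch type; an assumption for this sketch only.
using Batch = std::vector<int>;

// Cut-down stand-in for exec::Operator: just the four methods the proposal
// names. The real interface is larger and has different signatures.
class Operator {
 public:
  virtual ~Operator() = default;
  virtual void addInput(Batch input) = 0;
  virtual Batch getOutput() = 0;
  virtual void noMoreInput() = 0;
  virtual void close() = 0;
};

// Stand-in for the lightweight CudfOperator base (contents are assumed).
class CudfOperator {
 public:
  explicit CudfOperator(std::string name) : name_(std::move(name)) {}
  virtual ~CudfOperator() = default;
  const std::string& name() const { return name_; }

 private:
  std::string name_;
};

// Template method pattern: each public wrapper records a trace entry
// (standing in for the NVTX range and VLOG(2) line) and then delegates
// to the matching do* hook.
class CudfOperatorBase : public CudfOperator, public Operator {
 public:
  CudfOperatorBase(std::string name, bool debugEnabled)
      : CudfOperator(std::move(name)), debugEnabled_(debugEnabled) {}

  void addInput(Batch input) override {
    assert(!closed_ && "addInput called after close");
    trace("addInput");
    doAddInput(std::move(input));
  }
  Batch getOutput() override {
    trace("getOutput");
    return doGetOutput();
  }
  void noMoreInput() override {
    trace("noMoreInput");
    doNoMoreInput();
  }
  void close() override {
    trace("close");
    closed_ = true;
    doClose();
  }

  const std::vector<std::string>& log() const { return log_; }

 protected:
  // Default hooks do nothing, so derived operators override only what they need.
  virtual void doAddInput(Batch /*input*/) {}
  virtual Batch doGetOutput() { return {}; }
  virtual void doNoMoreInput() {}
  virtual void doClose() {}

 private:
  void trace(const char* fn) {
    if (debugEnabled_) log_.push_back(name() + "::" + fn);
  }

  bool debugEnabled_;
  bool closed_ = false;
  std::vector<std::string> log_;
};

// Hypothetical operator that keeps at most `limit` values in total.
class DemoLimit final : public CudfOperatorBase {
 public:
  DemoLimit(std::size_t limit, bool debugEnabled)
      : CudfOperatorBase("DemoLimit", debugEnabled), remaining_(limit) {}

 protected:
  void doAddInput(Batch input) override {
    if (input.size() > remaining_) input.resize(remaining_);
    remaining_ -= input.size();
    pending_ = std::move(input);
  }
  Batch doGetOutput() override { return std::exchange(pending_, {}); }

 private:
  std::size_t remaining_;
  Batch pending_;
};
```

In this sketch DemoLimit overrides only doAddInput() and doGetOutput(), yet all four lifecycle calls are traced because the tracing lives in the base-class wrappers. Whether the real wrappers should also be marked final, so derived operators cannot bypass them, is a design choice this issue leaves open.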

Labels: enhancement (New feature or request)