Skip to content

AbstractGlobalCombineFn hierarchy is inconsistent #19311

@kennknowles

Description

@kennknowles

Subclasses of AbstractGlobalCombineFn seem to be arranged in a way that prevents them from being used with the DataflowRunner. 

Subclasses of AbstractGlobalCombineFn are under either CombineFn or CombineFnWithContext, which seems to be in itself a CombineFn which has access to PipelineOptions and Side Inputs.

However, the DataflowRunner casts all combiners passed from user code to CombineFn (see [1]) which prevents combiners that extend CombineFnWithContext from being used there. 

 

For example: 
public class CustomCombinerFn extends CombineWithContext.CombineFnWithContext<...> {...}
 
final PCollectionView<SomeObject<String>> newCollection = oldCollection
  .apply("Custom Combiner", Combine.globally(new CustomCombinerFn(filter)) 
  .withSideInputs(filter)
  .withoutDefaults()
  .asSingletonView());
 
IMHO either CombineFnWithContext should be a subclass of CombineFn or DataflowRunner should cast the combiner to AbstractGlobalCombineFn. 
 
[1] 

Imported from Jira BEAM-6511. Original Jira may contain additional context.
Reported by: nlofeudo.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions