[Bug]: RunInference automatic refresh using a global window side input sometimes takes too long to update #28776

@AnandInguva

What happened?

On a Dataflow job, a global window side input sometimes takes too long to update.

The automatic model refresh feature of RunInference uses the WatchFilePattern transform, which relies on a global windowed side input to fetch the latest model path. When a pipeline using this feature runs as a Dataflow job and the model path changes, the model can take a long time to update, because the workers are sometimes busy dealing with backlog.

The code I ran is at https://github.com/apache/beam/blob/11e2bae4cbee4cc4f9d200a71511d921e8591dcd/examples/notebooks/beam-ml/automatic_model_refresh.ipynb

The workaround I found is to set num_workers to a value greater than 1; I usually set it to 5.
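
For anyone who wants to try the workaround, here is a minimal sketch of the flags involved. The helper name `dataflow_flags` is hypothetical and not part of Beam; `--runner` and `--num_workers` are the standard pipeline option flags. Project, region, and temp location are left out and would need to be added for a real job.

```python
def dataflow_flags(num_workers=5):
    """Build pipeline flags that start a Dataflow job with several workers.

    Hypothetical helper for illustration. The default of 5 is the value
    reported in this issue; anything greater than 1 matches the workaround.
    """
    if num_workers < 2:
        raise ValueError("the workaround needs num_workers > 1")
    return [
        "--runner=DataflowRunner",
        f"--num_workers={num_workers}",
    ]


flags = dataflow_flags()
# With Apache Beam installed, the list can be passed to
# apache_beam.options.pipeline_options.PipelineOptions(flags).
print(flags)
```

This only changes the initial worker count; it does not address the underlying side input delay.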

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
