What happened?
On a Dataflow job, a side input in the global window sometimes takes too long to update.
The automatic model refresh feature of RunInference uses the WatchFilePattern transform, which emits the latest model path as a globally windowed side input. When a pipeline using this feature runs as a Dataflow job and the model path is updated, the new model can take a long time to be picked up, because the workers are sometimes busy working through a backlog.
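For context, the wiring described above can be sketched roughly as follows. This is a minimal sketch, not the notebook's exact code: the helper name `attach_model_updates` and its parameters are hypothetical, and the Beam imports are done inside the function so the snippet can be loaded without Beam installed. `WatchFilePattern` and the `model_metadata_pcoll` argument of `RunInference` are the pieces that produce and consume the side input.

```python
def attach_model_updates(pipeline, examples, model_handler, file_pattern, interval_s=360):
    """Hypothetical helper: run inference with a model path that refreshes via a side input.

    pipeline      -- the beam.Pipeline the job is built on
    examples      -- PCollection of inputs for the model
    model_handler -- any RunInference ModelHandler
    file_pattern  -- glob watched for new model files (assumed value, e.g. a GCS path)
    interval_s    -- seconds between checks of the file pattern
    """
    # Imported lazily so that defining this helper does not require Beam.
    from apache_beam.ml.inference.base import RunInference
    from apache_beam.ml.inference.utils import WatchFilePattern

    # Periodically matches the pattern and emits metadata for the latest file.
    model_updates = pipeline | "WatchModelPath" >> WatchFilePattern(
        file_pattern=file_pattern, interval=interval_s)

    # RunInference consumes that PCollection as a side input and swaps the
    # model when a newer path arrives.
    return examples | "RunInference" >> RunInference(
        model_handler=model_handler, model_metadata_pcoll=model_updates)
```

The delay reported here is in the last step: the side input itself is updated, but busy workers may not observe the new value promptly.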
The code I ran is at https://github.com/apache/beam/blob/11e2bae4cbee4cc4f9d200a71511d921e8591dcd/examples/notebooks/beam-ml/automatic_model_refresh.ipynb
The workaround I found is to set num_workers to a value greater than 1; I usually set it to 5.
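A rough sketch of how the workaround might be applied when launching the job. The helper name `dataflow_worker_args` and the project, region, and bucket values are placeholders, not taken from the notebook, and the `--streaming` flag is an assumption about how the job is run. The flags themselves are standard Beam pipeline options.

```python
def dataflow_worker_args(project, region, temp_location, num_workers=5):
    """Build Beam pipeline flags that start a Dataflow job with several workers."""
    if num_workers < 2:
        # The workaround only applies when more than one worker is requested.
        raise ValueError("the workaround needs num_workers > 1")
    return [
        "--runner=DataflowRunner",
        "--streaming",  # assumption: the job runs in streaming mode
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        f"--num_workers={num_workers}",
    ]


# Placeholder values for illustration only.
args = dataflow_worker_args("my-project", "us-central1", "gs://my-bucket/tmp")
print(args[-1])  # --num_workers=5
```

The resulting list can be passed to `PipelineOptions(args)` from `apache_beam.options.pipeline_options` when constructing the pipeline.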
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components