[Feature Request]: Allow a flag so that "with pipeline" will not wait_until_finish so that this paradigm can be used for streaming jobs #29440

@lazarillo

Description

What would you like to happen?

Currently, with the Python SDK, if the preferred with beam.Pipeline() as pipeline: context manager is used for a streaming job, the job never finishes, so the process that launched it hangs indefinitely.

I am using Apache Beam within a CI/CD pipeline on Google's Dataflow. I want the deploy agent to launch the job and then exit; after that, I can monitor the job through the logs I receive and Dataflow's GUI. Leaving the agent hung provides no value. For now, I am not using the context manager, but I would prefer to use it as a best practice, especially since __exit__() does more than simply call wait_until_finish(), and it will likely do more as Beam progresses.

However, I understand the need for the CLI job to "hang" in many instances. Therefore, maybe an option can be added to StandardOptions to flag a job as "releasable" (I'm sure there is a better flag name), and a small if clause can be placed in __exit__() so that wait_until_finish() is only called when the flag is not set.
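To make the proposal concrete, here is a minimal sketch of the control flow using only the standard library. It does not use Beam itself: FakePipeline, FakeOptions, and FakeResult are hypothetical stand-ins for beam.Pipeline, StandardOptions, and the runner's result object, and the flag name releasable is only the placeholder suggested above, not an existing Beam option.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in for the result object returned by run()."""
    waited: bool = False

    def wait_until_finish(self):
        # A real streaming job would block here indefinitely.
        self.waited = True


@dataclass
class FakeOptions:
    """Hypothetical stand-in for StandardOptions with the proposed flag."""
    releasable: bool = False


class FakePipeline:
    """Hypothetical stand-in for beam.Pipeline used as a context manager."""

    def __init__(self, options=None):
        self.options = options or FakeOptions()
        self.result = None

    def run(self):
        # Submitting the job is assumed to return immediately.
        return FakeResult()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Only launch the job if the with-block finished without an error.
        if exc_type is None:
            self.result = self.run()
            # The proposed "small if clause": skip the wait when flagged.
            if not self.options.releasable:
                self.result.wait_until_finish()
        return False  # never swallow exceptions


# Flag set: the job is launched and the with-block returns right away.
with FakePipeline(FakeOptions(releasable=True)) as p:
    pass
launched = p.result

# Flag not set: behaviour is unchanged and the exit waits for the job.
with FakePipeline() as q:
    pass
blocking = q.result
```

With this shape, existing batch users see no change, since the flag defaults to off, while a CI/CD deploy agent can opt in and exit as soon as the job is submitted.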

I am happy to help with this implementation, assuming the core team agrees with this approach.

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
