[DO NOT MERGE] Validate Beam 2.71.0rc1 #3204
Summary of Changes (Gemini Code Assist)
This pull request is dedicated to validating the upcoming Apache Beam 2.71.0 release candidate. It involves a comprehensive update of both Java and Python dependencies to align with the new Beam version, ensuring compatibility and leveraging the latest features and fixes. The changes span various project modules, primarily focusing on dependency version bumps and the introduction of new transitive dependencies in Python environments.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #3204 +/- ##
==========================================
Coverage 50.86% 50.87%
- Complexity 5133 5513 +380
==========================================
Files 976 976
Lines 60090 60090
Branches 6572 6572
==========================================
+ Hits 30564 30569 +5
+ Misses 27374 27372 -2
+ Partials 2152 2149 -3
Currently, what I am seeing is that all tests are passing except some Spanner-related tests. These are failing with: I'm not currently seeing anything in the logs that gives much of a hint about what is happening. I have confirmed that if I remove @Abacn have you seen anything like this while doing this upgrade before? If not, I will keep looking.
In the past, there were situations where Spanner tests failed due to breaking changes in the Spanner client, which gets upgraded when the GCP BOM is upgraded in Beam. "Timeout in polling result file: gs://dataflow-staging-us-west2-269744978479/staging/template_launches/2026-01-08_12_48_39-9625519560852535969/operation_result." This sounds like the template launcher failed.
It sounds like pipeline expansion gets stuck. If we can reproduce it locally, we could get a hint. Also, reach out to the Spanner CDC team to investigate.
Thanks, this is good context - was already starting on the other steps here, thanks!
Interestingly, I'm not able to repro the problem. For example, here is the same pipeline that fails as a template, running with the 2.71.0 RC version: https://pantheon.corp.google.com/dataflow/jobs/us-central1/2026-01-12_09_09_16-11282152439616596575;step=;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))&e=13802955&mods=dm_deploy_from_gcs vs. the same pipeline running in a template: https://pantheon.corp.google.com/dataflow/jobs/us-central1/2026-01-12_08_29_11-15499653623879337782;logsSeverity=INFO;graphView=1?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))&e=13802955&mods=dm_deploy_from_gcs
I have pretty definitively proven that something with the Spanner change streams connector is the problem, though. The minimal repro I have right now is changing the run method in SpannerChangeStreamsToBigQuery.java to: I can then launch a template with: This successfully launches the template when I don't change
I've narrowed the problem to this line: https://github.com/apache/beam/blob/d0223389f47f8085477e113391d7ae5961bc0bbc/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java#L2105 Specifically, having: in the template run method causes hangs. Interestingly, we still log after this, so it seems like some resource is not getting cleaned up correctly |
Try reverting com.google.cloud:google-cloud-spanner to 6.95.0 for templates? If it works, we then need to revert #36995 (and the line in #37142 as well) to unblock the release, and ask the Spanner team to take a further look.
The log agent runs in a parallel container inside the launcher VM, so it is not affected by the stuck pipeline expansion. EDIT:
Actually, it can be reproduced: Abacn@26c4218 Running with IDEA, there are threads preventing the main method from exiting:
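For anyone who wants to check for this kind of hang outside an IDE, here is a rough sketch (not taken from the repro commit) of listing the live non-daemon threads, which are the ones that keep a JVM from exiting after main returns. The class name `LingeringThreads` and the helper `startBlockedWorker` are made up for illustration; the blocked worker is only a stand-in for whatever threads an unclosed client might leave behind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/** Illustrative helper (hypothetical name): finds threads that would keep the JVM alive. */
final class LingeringThreads {

  /** Returns the names of live, non-daemon threads other than the calling thread. */
  static List<String> nonDaemonThreadNames() {
    List<String> names = new ArrayList<>();
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
        names.add(t.getName());
      }
    }
    return names;
  }

  /**
   * Starts a non-daemon thread that blocks until the latch is released.
   * This is only a stand-in for a worker thread left behind by an unclosed resource.
   */
  static Thread startBlockedWorker(String name, CountDownLatch release) {
    Thread worker = new Thread(() -> {
      try {
        release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, name);
    worker.setDaemon(false); // explicit, so the result does not depend on the creating thread
    worker.start();
    return worker;
  }

  public static void main(String[] args) throws InterruptedException {
    CountDownLatch release = new CountDownLatch(1);
    Thread worker = startBlockedWorker("hypothetical-leaked-worker", release);
    System.out.println("Non-daemon threads while blocked: " + nonDaemonThreadNames());

    // Releasing the latch plays the role of closing the resource that owns the thread.
    release.countDown();
    worker.join();
    System.out.println("Non-daemon threads after release: " + nonDaemonThreadNames());
  }
}
```

Calling something like `nonDaemonThreadNames()` at the end of a template's run method would be one way to see which threads are still alive when the launcher appears stuck.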
Update: confirmed that downgrading My minimal repro: build.gradle main function:
I think we need to revert this line and build a new RC. Before that, we can override the implementation("com.google.cloud:google-cloud-spanner:6.104.0") dependency in DataflowTemplates to use 6.104.0 (declare it in the pom, and exclude it as a transitive dep of io-google-cloud-platform) to confirm.
Yep, this showed up for me as well. Thanks!
This is a good find, thanks for digging in. I was guessing we weren't closing connections but hadn't had time to dig in yet. We could probably consider rolling forward instead. I did a quick audit and the only places this is misused are:
I'll put together a PR for tomorrow; if we can avoid diverging from the BOM version, I do prefer that.
@rahul2393 can you please check this issue? My primary suspect is related to enabling gRPC GCP by default.
Actually, a roll forward isn't as easy as I thought. It is straightforward in SpannerIO, but in DaoFactory there are several places where we initialize a SpannerAccessor, then use it to create a DB client which we continue to pass around beyond the scope of the method. So we can't just call close there, we need to close the connection only when we're tearing down the calling context. For now, I'll just revert the Spanner version, but we'll need more investigation.
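To make the difference between the two cases concrete, here is a minimal sketch using stand-in classes. `HypotheticalAccessor`, `ScopedUse`, and `HypotheticalDaoFactory` are invented for illustration and are not the real SpannerAccessor or DaoFactory APIs; the point is only where `close()` can safely live.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a connection holder; NOT the real Beam SpannerAccessor. */
final class HypotheticalAccessor implements AutoCloseable {
  private final AtomicBoolean closed = new AtomicBoolean(false);

  String query(String sql) {
    if (closed.get()) {
      throw new IllegalStateException("accessor already closed");
    }
    return "result of " + sql;
  }

  boolean isClosed() {
    return closed.get();
  }

  @Override
  public void close() {
    closed.set(true);
  }
}

/** Easy case: the accessor never leaves the method, so it can be closed right there. */
final class ScopedUse {
  static String runOnce(String sql) {
    try (HypotheticalAccessor accessor = new HypotheticalAccessor()) {
      return accessor.query(sql);
    }
  }
}

/**
 * Harder case: the client escapes the method that created it, so closing there would
 * break later callers. Here the factory owns the accessor and closes it only when the
 * factory itself is torn down.
 */
final class HypotheticalDaoFactory implements AutoCloseable {
  private final HypotheticalAccessor accessor = new HypotheticalAccessor();

  HypotheticalAccessor client() {
    return accessor; // handed out and used beyond this method's scope
  }

  @Override
  public void close() {
    accessor.close(); // teardown of the calling context
  }
}
```

In this sketch the owner of the factory would wrap it in try-with-resources or call `close()` from its own teardown hook; whether that maps cleanly onto the real DaoFactory call sites is exactly the part that needs more investigation.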
apache/beam#37305 for the revert in Beam
Validating RC2 here - #3222 - I'll leave this one open for now though for investigation |

No description provided.