Updated All Externally Visible Logs/Docs from Runner V2 to Portable Runner #38532
TongruiLi wants to merge 8 commits into
R: @shunping R: @scwhittle
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request standardizes the terminology used for Dataflow runners across the entire Apache Beam repository. By renaming 'Runner V2' to 'Portable Runner' and 'runner v1' to 'Streaming Java Runner', the changes improve consistency in logs, documentation, and error reporting. These updates affect multiple components, including Gradle build scripts, Go and Python SDKs, and internal protocol definitions, ensuring that the naming conventions are uniform throughout the system.
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment |
Code Review
This pull request renames 'Runner V2' to 'Portable Runner' and 'Runner V1' to 'Streaming Java Runner' across the Java, Go, and Python SDKs, including documentation and build scripts. It also updates logic to prevent disabling the Portable Runner and adds tests for these cases. Feedback identifies a redundant word in a Java build file comment and an inconsistent error message in the Go SDK that still uses the old terminology.
| // TODO(https://github.com/apache/beam/issues/21472) | ||
| 'org.apache.beam.sdk.transforms.GroupByKeyTest$BasicTests.testAfterProcessingTimeContinuationTriggerUsingState', | ||
| // GroupIntoBatches.withShardedKey not supported on streaming runner v1 | ||
| // GroupIntoBatches.withShardedKey not supported on streaming Streaming Java Runner |
| return nil, errors.New("detected one of the following experiments: disable_runner_v2 | disable_runner_v2_until_2023 | disable_prime_runner_v2. Disabling runner v2 is no longer supported as of Beam version 2.45.0+") | ||
| // enable_portable_runner is not documented and hence wont be set by default. This will be fixed in later versions. | ||
| if strings.Contains(e, "disable_runner_v2") || strings.Contains(e, "disable_runner_v2_until_2023") || strings.Contains(e, "disable_prime_runner_v2") || strings.Contains(e, "disable_portable_runner") || strings.Contains(e, "enable_streaming_java_runner") { | ||
| return nil, errors.New("detected one of the following experiments: disable_runner_v2 | disable_runner_v2_until_2023 | disable_prime_runner_v2 | disable_portable_runner | enable_streaming_java_runner. Disabling runner v2 is no longer supported as of Beam version 2.45.0+") |
For consistency with the rest of the changes in this PR and the Python SDK implementation, "runner v2" in the error message should be updated to "Portable Runner".
| return nil, errors.New("detected one of the following experiments: disable_runner_v2 | disable_runner_v2_until_2023 | disable_prime_runner_v2 | disable_portable_runner | enable_streaming_java_runner. Disabling runner v2 is no longer supported as of Beam version 2.45.0+") | |
| return nil, errors.New("detected one of the following experiments: disable_runner_v2 | disable_runner_v2_until_2023 | disable_prime_runner_v2 | disable_portable_runner | enable_streaming_java_runner. Disabling Portable Runner is no longer supported as of Beam version 2.45.0+") |
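The check and the suggested wording can be sketched in isolation as follows. This is a minimal sketch, not the SDK's actual code: `checkExperiments` and `rejectedExperiments` are hypothetical names, and the real check in the diff works on a single experiment string `e` rather than on a slice.

```go
package main

import (
	"fmt"
	"strings"
)

// rejectedExperiments lists the experiment flags named in the error message
// from the diff. The variable name is an assumption made for this sketch.
var rejectedExperiments = []string{
	"disable_runner_v2",
	"disable_runner_v2_until_2023",
	"disable_prime_runner_v2",
	"disable_portable_runner",
	"enable_streaming_java_runner",
}

// checkExperiments is a hypothetical helper. It returns an error if any
// supplied experiment contains one of the rejected flags, using the
// "Portable Runner" wording proposed in the review suggestion.
func checkExperiments(experiments []string) error {
	for _, e := range experiments {
		for _, flag := range rejectedExperiments {
			if strings.Contains(e, flag) {
				return fmt.Errorf(
					"detected one of the following experiments: %s. Disabling Portable Runner is no longer supported as of Beam version 2.45.0+",
					strings.Join(rejectedExperiments, " | "))
			}
		}
	}
	return nil
}
```

Building the message from the same list that drives the check keeps the flag names in the error text from drifting out of sync with the condition, which is the kind of inconsistency this comment points out.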
Portable runner in Beam has a wider meaning (https://beam.apache.org/roadmap/portability/). For example, Prism is a portable runner, and so is the FnAPI Python runner. Would it be better to refer to Runner V2 as the Dataflow Portable Runner? @TongruiLi
scwhittle
left a comment
Also address the gemini comments.
I think it is a good idea to use Dataflow Portable Runner in docs and in contexts where the meaning isn't immediately clear. I think that code comments in Dataflow files are likely clear enough without that qualifier.
| opts, err := getJobOptions(context.Background(), false) | ||
| if err == nil { |
What are these testing? It seems to be verifying errors, but the test name doesn't indicate that we expect a failure. If we do, it would be better to verify the error more specifically too.
Ditto for the other test.
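One way to verify the error more specifically is to assert both that an error is returned and that it names the offending experiment. The sketch below is an illustration under stated assumptions: `validateExperiments` is a hypothetical stand-in for the check exercised through `getJobOptions`, and `checkRejected` is a hypothetical assertion helper. Neither is part of the Beam Go SDK.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateExperiments is a hypothetical stand-in for the experiment check
// reached through getJobOptions in the diff. Its message is illustrative.
func validateExperiments(experiments string) error {
	for _, flag := range []string{"disable_portable_runner", "enable_streaming_java_runner"} {
		if strings.Contains(experiments, flag) {
			return errors.New("detected experiment " + flag + ": disabling Portable Runner is no longer supported")
		}
	}
	return nil
}

// checkRejected is a hypothetical assertion helper. It returns a descriptive
// failure unless err is non-nil and its message mentions the given flag.
func checkRejected(err error, flag string) error {
	if err == nil {
		return fmt.Errorf("experiment %q: expected an error, got nil", flag)
	}
	if !strings.Contains(err.Error(), flag) {
		return fmt.Errorf("experiment %q: error %q does not mention the flag", flag, err)
	}
	return nil
}
```

In a `_test.go` file, a test whose name states the expected failure (for example, a hypothetical `TestGetJobOptions_DisablePortableRunnerRejected`) could call `checkRejected` on the returned error and pass any non-nil result to `t.Error`.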
| ℹ️ Note that cross-language transforms require | ||
| portable implementations of Spark/Flink/Direct runners. Dataflow requires | ||
| [runner V2](https://cloud.google.com/dataflow/docs/guides/deploying-a-pipeline#dataflow-runner-v2). | ||
| [Portable Runner](https://cloud.google.com/dataflow/docs/guides/deploying-a-pipeline#dataflow-runner-v2). |
how about
Dataflow requires using the [Dataflow Portable Runner]
| userStepToStateFamilyNameMap: Map from user step names to state families. | ||
| userWorkerRunnerV1Settings: Binary encoded proto to control runtime | ||
| behavior of the java runner v1 user worker. | ||
| behavior of the Streaming Java Runner user worker. |
I think this file is generated:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/internal/clients/README.txt
All Runner V2 references have been updated to Portable Runner, and all Runner V1 references have been updated to Streaming Java Runner.