Skip to content

v2/parser: prefer original comment when evaluating aliased Type #294

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

shashankram
Copy link
Contributor

@shashankram shashankram commented May 13, 2025

Fixes the comment parsing on a Type so that it does not get overwritten by an aliased Type with a different comment.
Resolves #292

@k8s-ci-robot k8s-ci-robot added do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 13, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shashankram
Once this PR has been reviewed and has the lgtm label, please assign wojtek-t for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from thockin May 13, 2025 22:39
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 13, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @shashankram. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 13, 2025
Fixes the comment parsing on a Type so that it does not
get overwritten by an aliased Type with a different comment.

Signed-off-by: Shashank Ram <[email protected]>
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label May 13, 2025
@shashankram
Copy link
Contributor Author

/assign @yongruilin

@shashankram
Copy link
Contributor Author

@yongruilin @liggitt this is ready for review. I updated the code to not overwrite the original type's comments from an aliased type when gotypesalias is enabled, and to produce deterministic outputs when aliases are not supported. I am not too concerned about the latter because gotypesalias is enabled by default starting 1.23, and we can provide correctness of comments there, and a deterministic output otherwise.

// If this is a Type Alias, avoid updating the comments on the original
// Type t since walkType always returns the original type.
if !isTypeAlias(obj.Type()) {
p.addCommentsToType(obj, t)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to reason through what it means that alias types can no longer have comments attached… I agree we don't want them stomping the original type's comments but I'm not sure the implications of ignoring the alias' comments.

At the very least, before merging this, we should pull it into a speculative dependency update PR in kubernetes/kubernetes and double check if/how this impacts type generation there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

walkType always returns an unaliased type for an Alias type using walkAliasType which resolves the type using gotypes.Unalias(t), so we didn't add comments on an aliased type before either. Instead, it would clobber the comments on the original type.

Copy link
Contributor Author

@shashankram shashankram May 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt I tested codegen in k/k kubernetes/kubernetes@master...shashankram:kubernetes:gengo-test

Confirmed that the openapi-gen version installed in _output prints the version built from my fork of kube-openapi which uses the gengo changes from this PR's branch:

./_output/local/go/bin/openapi-gen 
TEST BINARY
2025/05/14 16:34:39 Arguments validation error: --output-dir must be specified

Copy link
Member

@liggitt liggitt May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmm... I guess there were these possible interactions before:

  1. generating packages containing both the alias and the original, the comments on the original would be clobbered; the generation of the original would use the alias' comments (bad) and the generation of the alias would use the alias' comments (good)
  2. generating a single package containing an alias, but not generating the package containing the original; the comments on the original type would be clobbered (bad, but non-impacting because the original wasn't being generated), but the generation of the alias would use the alias' comments (good)
  3. generating multiple packages / aliases to an original type, but not generating the package containing the original; the comments on the original type would be clobbered by a non-deterministic alias (bad, but non-impacting because the original wasn't being generated), but the generation of all the aliases would use a non-deterministic alias' comments (good for that one, bad for the others)

So previous logic:

  1. mixed good/bad (good for alias, bad for original)
  2. good (single alias uses its own comments to generate)
  3. mixed good/bad (good for one alias, bad for other aliases)

With this PR:

  1. mixed good/bad (good for original, bad for alias)
  2. bad (single alias has its comments ignored)
  3. bad (all aliases have their comments ignored)

It's hard to say this is strictly better. Can we make all three scenarios better, isolating or shallow-copying the object resolved from aliases so that each alias can set its own comments without impacting the original? Dry-running anything we try here through all the generators in kubernetes/kubernetes is a helpful test to see if we're disrupting anything that is working well now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good suggestion. I'll explore that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the sorted import processing resolves the flaking codegen, then a warning may be acceptable to begin with. Though I think it would be a breaking change anyway as the sorted import processing during generation could change the generated openapi comment descriptions and validation constraints? If codegen still flakes, then a warning wouldn't suffice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we warn and avoid overwriting, that should prevent flakes, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding overwriting might not be preventable just from your suggestion in https://github.com/kubernetes/gengo/pull/294/files#discussion_r2091473215. To truly avoid overwriting, we need to reliably detect that a type is an Alias and it should not overwrite the original type. I am not sure if we will always evaluate an alias after the original type has been parsed. How do you propose we do that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we see a mismatch:

  • if the current object is an alias, warn and ignore
  • if the current object is not an alias, warn and stomp

Something like this:

```diff --git a/v2/parser/parse.go b/v2/parser/parse.go
index 4c1efa00104..1f02613291a 100644
--- a/v2/parser/parse.go
+++ b/v2/parser/parse.go
@@ -24,6 +24,7 @@ import (
 	"go/token"
 	gotypes "go/types"
 	"path/filepath"
+	"reflect"
 	"sort"
 	"strings"
 	"time"
@@ -385,8 +386,61 @@ func (p *Parser) NewUniverse() (types.Universe, error) {
 // addCommentsToType takes any accumulated comment lines prior to obj and
 // attaches them to the type t.
 func (p *Parser) addCommentsToType(obj gotypes.Object, t *types.Type) {
-	t.CommentLines = p.docComment(obj.Pos())
-	t.SecondClosestCommentLines = p.priorDetachedComment(obj.Pos())
+	if newLines, oldLines := p.docComment(obj.Pos()), t.CommentLines; len(newLines) > 0 {
+		switch {
+		case len(oldLines) == 0, reflect.DeepEqual(oldLines, newLines):
+			// no comments associated, or comments match exactly
+			t.CommentLines = newLines
+
+		case isTypeAlias(obj.Type()):
+			// ignore mismatched comments from obj because it's an alias
+			klog.Warningf(
+				"Mismatched comments seen for type %v.\n  Using comments:\n    %v\n  Ignoring comments from type alias %v:\n    %v",
+				t.GoType,
+				strings.Join(oldLines, "\n    "),
+				obj.Type(),
+				strings.Join(newLines, "\n    "),
+			)
+
+		case !isTypeAlias(obj.Type()):
+			// overwrite existing comments with ones from obj because obj is not an alias
+			t.CommentLines = newLines
+			klog.Warningf(
+				"Mismatched comments seen for type %v.\n  Using comments:\n    %v\n  Ignoring comments from type aliases:\n    %v",
+				t.GoType,
+				strings.Join(newLines, "\n    "),
+				strings.Join(oldLines, "\n    "),
+			)
+		}
+	}
+
+	if newLines, oldLines := p.priorDetachedComment(obj.Pos()), t.SecondClosestCommentLines; len(newLines) > 0 {
+		switch {
+		case len(oldLines) == 0, reflect.DeepEqual(oldLines, newLines):
+			// no comments associated, or comments match exactly
+			t.SecondClosestCommentLines = newLines
+
+		case isTypeAlias(obj.Type()):
+			// ignore mismatched comments from obj because it's an alias
+			klog.Warningf(
+				"Mismatched secondClosestCommentLines seen for type %v.\n  Using comments:\n    %v\n  Ignoring comments from type alias %v:\n    %v",
+				t.GoType,
+				strings.Join(oldLines, "\n    "),
+				obj.Type(),
+				strings.Join(newLines, "\n    "),
+			)
+
+		case !isTypeAlias(obj.Type()):
+			// overwrite existing comments with ones from obj because obj is not an alias
+			t.SecondClosestCommentLines = newLines
+			klog.Warningf(
+				"Mismatched secondClosestCommentLines seen for type %v.\n  Using comments:\n    %v\n  Ignoring comments from type aliases:\n    %v",
+				t.GoType,
+				strings.Join(newLines, "\n    "),
+				strings.Join(oldLines, "\n    "),
+			)
+		}
+	}
 }
 
 // packageDir tries to figure out the directory of the specified package.
@@ -557,7 +611,11 @@ func (p *Parser) priorCommentLines(pos token.Pos, lines int) *ast.CommentGroup {
 }
 
 func splitLines(str string) []string {
-	return strings.Split(strings.TrimRight(str, "\n"), "\n")
+	lines := strings.Split(strings.TrimRight(str, "\n"), "\n")
+	if len(lines) == 1 && lines[0] == "" {
+		return nil
+	}
+	return lines
 }
 
 func goFuncNameToName(in string) types.Name {
diff --git a/v2/parser/parse_122.go b/v2/parser/parse_122.go
index ec2064958a9..4b2c458c4c6 100644
--- a/v2/parser/parse_122.go
+++ b/v2/parser/parse_122.go
@@ -31,3 +31,8 @@ func (p *Parser) walkAliasType(u types.Universe, in gotypes.Type) *types.Type {
 	}
 	return nil
 }
+
+func isTypeAlias(in gotypes.Type) bool {
+	_, isAlias := in.(*gotypes.Alias)
+	return isAlias
+}
\ No newline at end of file
diff --git a/v2/parser/parse_pre_122.go b/v2/parser/parse_pre_122.go
index 6f62100c0a7..0249703b545 100644
--- a/v2/parser/parse_pre_122.go
+++ b/v2/parser/parse_pre_122.go
@@ -28,3 +28,7 @@ import (
 func (p *Parser) walkAliasType(u types.Universe, in gotypes.Type) *types.Type {
 	return nil
 }
+
+func isTypeAlias(in gotypes.Type) bool {
+	return false
+}
\ No newline at end of file

Copy link
Contributor Author

@shashankram shashankram May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying, I misunderstood your earlier comment as a suggestion to ignore alias detection.

if the current object is an alias, warn and ignore

This would alter the generated schema and markers where it picked the Alias before. Is that acceptable now?

Signed-off-by: Shashank Ram <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect Descriptions are generated for an Aliased Type defined in multiple API versions
4 participants