Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[redhat|policy] Provide a URL message for S3 bucket based uploads #3891

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

TrevorBenson
Copy link
Member


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname [email protected]?
  • Are any related Issues or existing PRs properly referenced via a Closes (Issue) or Resolved (PR) line?
  • Are all passwords or private data gathered by this PR obfuscated?

Closes #3890

Copy link

Congratulations! One of the builds has completed. 🍾

You can install the built RPMs by following these steps:

  • sudo yum install -y dnf-plugins-core on RHEL 8
  • sudo dnf install -y dnf-plugins-core on Fedora
  • dnf copr enable packit/sosreport-sos-3891
  • And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

@TrevorBenson TrevorBenson force-pushed the sos-trevorbenson-upload-url-string branch from 5e10fe9 to e9deb52 Compare December 23, 2024 22:09
@TrevorBenson TrevorBenson force-pushed the sos-trevorbenson-upload-url-string branch from e9deb52 to c1fe344 Compare December 23, 2024 22:10
@jcastill jcastill added Status/RedHat QE RH QE has been requested to review Status/RedHat Eng RedHat Engineering has been requested to review Kind/RedHat RedHat related item labels Dec 24, 2024
Copy link
Member

@jcastill jcastill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Did a very quick test with non-S3 uploads and worked as expected. I'll test more thoroughly in a couple of days

@jcastill jcastill added the Reviewed/Other Ack Acknowledged by a member label Dec 24, 2024
Copy link
Member

@TurboTurtle TurboTurtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code LGTM, but I wonder if we wouldn't be better served if we instead change the logic that was generating the original problematic messaging in #3890. It would strike me as a little strange for the RH api path the be appended on to whatever endpoint/bucket path I'd otherwise be expecting.

Comment on lines 261 to 262
rh_case_api = "/support/v1/cases/%s/attachments"
return f"{endpoint}/{bucket}" + rh_case_api % self.case_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this all f-string formatting, mixing the two syntax is a little awkward.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll adjust this accordingly once I have additional feedback to #3891 (comment)

@TrevorBenson
Copy link
Member Author

TrevorBenson commented Dec 24, 2024

The code LGTM, but I wonder if we wouldn't be better served if we instead change the logic that was generating the original problematic messaging in #3890.

Unless I overlooked something this function is what ends up generating both parts of the message: the missing case id; the upload to Red Hat Secure FTP.

It would strike me as a little strange for the RH api path the be appended on to whatever endpoint/bucket path I'd otherwise be expecting.

Only included this as I think @jcastill mentioned RH potentially using an S3 bucket in the future. I presume he may be adding an RH_S3_HOST (aka an S3 Endpoint) in the work he is doing separately.

I want to be sure I understand your request. When both a case id and a bucket prefix are provided the resulting sos-collector tar archive will contain the case id, but the archive itself is only found inside the provided s3 object prefix? Or are you suggesting we drop the rh_case_api altogether and if the user provides a case id without providing an object prefix the tar file, with a case ID in its name, ends up in the root of the bucket?

Thanks in advance for any clarification.

@TurboTurtle
Copy link
Member

Let me answer this way, please let me know if I'm not making sense here.

As an end-user creating an sos report and uploading it to an S3 location that I've defined in my config file or in the sos command I'm running, I would expect:

  • No message about SFTP, regardless of it a case id was provided or not
  • No message about a Red Hat location
  • The sos report to end up in the endpoint/bucket I defined, and not nested under a path I did not specify.

On that last point that means I'd expect, with or without a case id, this path from #3890:

s3://project-sosreports/ba59c727-bfd2-4808-8f5f-f62c63903c13/2024/12 or, potentially, s3://project-sosreports/ba59c727-bfd2-4808-8f5f-f62c63903c13/2024/12/$case_id

and not

s3://project-sosreports/support/v1/cases/12345/attachments

If RH starts using an S3 bucket, we'd address any handling of that within the Red Hat upload methods so that it doesn't impact a user uploading to a non-RH S3 location in any way. We really need to separate upload from policy, and I'll try and spend some time on that this week and next - but for the purposes of this PR I'm mainly just concerned about the above - predictable and expected locations within a user-specified S3 destination.

@TrevorBenson
Copy link
Member Author

Makes sense and clarifies the case of no prefix with a case id, which will result in the path: s3://project-sosreports/$case_id.

I'll make the adjustments and force push a new commit for review.

@TrevorBenson
Copy link
Member Author

I thought providing an https:// based upload_url that includes the endpoint would be more informative. On further review I see why I used the s3:// url instead:

def _determine_upload_type(self):
"""Based on the url provided, determine what type of upload to attempt.
Note that this requires users to provide a FQDN address, such as
https://myvendor.com/api or ftp://myvendor.com instead of
myvendor.com/api or myvendor.com
"""
prots = {
'ftp': self.upload_ftp,
'sftp': self.upload_sftp,
'https': self.upload_https,
's3': self.upload_s3
}
if self.commons['cmdlineopts'].upload_protocol in prots:
return prots[self.commons['cmdlineopts'].upload_protocol]
if '://' not in self.upload_url:
raise Exception("Must provide protocol in upload URL")
prot, _ = self.upload_url.split('://')
if prot not in prots:
raise Exception(f"Unsupported or unrecognized protocol: {prot}")
return prots[prot]


The upload_s3 method doesn't rely on upload_url currently, so get_upload_url returning a string w/ case_id appended shows the Attempting upload to correctly, but its dropped by the upload process.

While appending the case_id to the object prefix seemed a simple fix to this I found on a freshly installed rhel 8.9 system get_upload_url() method of RHELPolicy is called multiple times, or recursively. This lead to the case_id being appended more than once when appending it directly to the s3 object prefix instead of the upload_url.

[RHELPolicy.get_upload_url] called
[RHELPolicy.get_upload_url] called
[RHELPolicy.get_upload_url] called
[RHELPolicy.get_upload_url] called
Attempting upload to https://sosreport.example.com/project-sosreports/ba59c727-bfd2-4808-8f5f-f62c63903c13/2024/12/12345/12345

I'm going to convert this PR to a draft while determining how best to implement the requests and if aligning s3 more closely to the other upload protocols logic is possible.

@TrevorBenson
Copy link
Member Author

TrevorBenson commented Jan 3, 2025

@TurboTurtle What do you think of the most recent commit? The thought is to fix the issue, clarify which endpoint is used in the string output, but to leave off the case id as a prefix (directory) for now. I figure this can be handled if/when Red Hat decides to implement an s3 bucket upload target.

@TrevorBenson TrevorBenson force-pushed the sos-trevorbenson-upload-url-string branch from 2dbdd7c to 272f2ca Compare January 3, 2025 19:42
If acceptable will fixup/squash so commit message and sign-off is proper.

Signed-off-by: Trevor Benson <[email protected]>
@TrevorBenson TrevorBenson force-pushed the sos-trevorbenson-upload-url-string branch from 272f2ca to 5583de7 Compare January 3, 2025 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Kind/RedHat RedHat related item Reviewed/Other Ack Acknowledged by a member Status/RedHat Eng RedHat Engineering has been requested to review Status/RedHat QE RH QE has been requested to review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When using upload protocol s3 on Red Hat incorrect upload_url_string is returned.
3 participants