
[BUG] S3-SQS source pipelines can get stuck when SQS is reading from multiple buckets and one bucket has permissions issues #3930

Open · travisbenedict opened this issue Jan 9, 2024 · 1 comment
Labels: bug, enhancement

Comments

@travisbenedict (Contributor)

Describe the bug
S3-SQS source pipelines can get stuck when SQS is reading from multiple buckets and one bucket has permissions issues.

To Reproduce
Steps to reproduce the behavior:

  1. Create an SQS queue
  2. Create 2 S3 buckets and configure event notifications on both so they publish to the SQS queue from step 1
  3. Create an IAM role with the permissions needed to get objects from only one of the S3 buckets (call it Bucket1; call the other Bucket2)
  4. Create an S3-SQS pipeline using the SQS queue and IAM role (a sketch of such a pipeline configuration follows this list)
  5. Upload objects to Bucket1; the pipeline processes them correctly
  6. Upload objects to Bucket2; the pipeline gets 403 errors and keeps retrying
  7. Upload more objects to Bucket1; these are not processed because the pipeline is still retrying the objects from step 6
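
For reference, step 4 might look like the following minimal Data Prepper pipeline configuration. This is only a sketch: the queue URL, region, and role ARN are placeholders, and the option names follow the s3 source documentation.

```yaml
# Minimal sketch of a pipeline.yaml for steps 1-4. The queue URL, region,
# and role ARN are placeholders; adjust the codec to match your data.
s3-sqs-pipeline:
  source:
    s3:
      notification_type: "sqs"
      codec:
        newline:
      sqs:
        queue_url: "https://sqs.us-east-1.amazonaws.com/123456789012/notifications-queue"
      aws:
        region: "us-east-1"
        sts_role_arn: "arn:aws:iam::123456789012:role/bucket1-reader"
  sink:
    - stdout:
```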

Expected behavior
If a pipeline hits access-denied errors with one bucket, it should still process objects from the other bucket.

@dlvenable (Member)

If Data Prepper fails to read the S3 object, it keeps the SQS message in the SQS queue. This is intentional. Users should configure an SQS redrive policy that sends failed messages to an SQS dead-letter queue (DLQ). After a configured number of failed attempts (say, five), SQS automatically moves the message to the DLQ.
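
For example, a redrive policy like the following moves a message to the DLQ after five failed receives. The queue URL and DLQ ARN are placeholders, and the DLQ is assumed to already exist:

```sh
aws sqs set-queue-attributes \
  --queue-url "https://sqs.us-east-1.amazonaws.com/123456789012/notifications-queue" \
  --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:notifications-dlq\",\"maxReceiveCount\":\"5\"}"}'
```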

I think we can do a few things to help with this:

  1. When a failure like this occurs, change the visibility timeout to some reasonable value (say, 5 minutes) so that the message does not become available again too soon (see the sketch after this list).
  2. Improve the documentation to direct users to use an SQS redrive policy and DLQ.
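
As a rough sketch of suggestion 1, using the AWS SDK for Java v2 (the class and method names here are hypothetical, not Data Prepper's actual internals):

```java
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.ChangeMessageVisibilityRequest;

// Hypothetical failure handler sketch; illustrates the idea only.
final class FailedReadBackoff {
    // Extend the visibility timeout of a message whose S3 object could not
    // be read, so it does not become visible (and retried) again immediately.
    static void delayRetry(SqsClient sqs, String queueUrl, String receiptHandle) {
        sqs.changeMessageVisibility(ChangeMessageVisibilityRequest.builder()
                .queueUrl(queueUrl)
                .receiptHandle(receiptHandle)
                .visibilityTimeout(300) // 5 minutes, per suggestion 1
                .build());
    }
}
```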

Another question: In this use case, is it intentional that the user does not have access to Bucket2? If the user will never have access, we could add configuration options to ignore certain buckets and/or key prefixes.
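
To make the idea concrete, such a configuration could look something like the following. To be clear, neither option exists today; the names are purely hypothetical:

```yaml
# Hypothetical sketch only: neither ignore_buckets nor ignore_key_prefixes
# is an existing Data Prepper option; this is one shape the proposal could take.
source:
  s3:
    sqs:
      queue_url: "https://sqs.us-east-1.amazonaws.com/123456789012/notifications-queue"
    ignore_buckets:
      - "bucket2"
    ignore_key_prefixes:
      - "logs/tmp/"
```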
