Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/s3_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ def check_if_s3_file_exists(bucket_name, s3_path):
try:
s3_client.head_object(Bucket=bucket_name, Key=s3_path)
except ClientError as e:
if e.response['Error']['Code'] == '404':
if e.response['Error']['Code'] in ['404', '403']:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of the method is to check if the file exists in s3, if so then the caller appends variable version to the fiel name.

If there is no access then the method should raise exception as it was earlier.

Copy link
Author

@nit-practo nit-practo Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

boto returns 403 when we check for the existence of file (if it does not exist)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
cc : @justjkk

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to this stackoverflow answer, s3 returns 403 instead of 404 if the ListBucket access is not present. In our case, the ListBucket access was present but restricted to a folder. I checked and found that the restriction was using StringEquals instead of StringLike which was preventing the ListBucket call from succeeding when applied on any path other than the top-level folder. I fixed the IAM permissions and it is now returning 404 as expected. However, it still makes sense to treat 403 also as a not found and not require the ListBucket permission.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final working IAM permissions for reference:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::<bucket-name>",
            "Condition": {
                "StringLike": {
                    "s3:prefix": "<db-name>/*"
                }
            }
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": "arn:aws:s3:::<bucket-name>/<db-name>/*"
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": "s3:ListBucketMultipartUploads",
            "Resource": "arn:aws:s3:::<bucket-name>"
        }
    ]
}

Copy link

@anujith-singh anujith-singh Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justjkk @nit-practo the problem with treating 403 as 404 is:

If the IAM user does not have list/read access but has write access (unlikely but techinically possible) then

  • check_if_s3_file_exists() will return False
  • and get_usable_s3_path() will end up returning the path it just examined
  • and upload_to_s3() will end up overriding the file with a new version, which defeats the entire purpose of check_if_s3_file_exists()

Since as Kishore mentioned this is not a blocking problem any more(which we presumed earlier), we should update the ReadMe to have this document and call it done.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anujith-singh agreed with your reasoning.

return False
else:
raise e
Expand Down