
Passed in session should be used for feature group ingest #3332

Open
@sampoorna


Describe the bug
My default credentials do not have write access to S3 for a feature, so I need to assume a role using boto/STS. I create all the relevant clients, a SageMaker session, and a feature group using that SageMaker session, then call .ingest() on the feature group. I have to key in an MFA code when I first assume the role. I would not expect to have to key it in again during ingestion, but my script keeps prompting me for it over and over.

I have traced this down to the following call in _ingest_single_batch():
sagemaker_featurestore_runtime_client = boto3.Session(profile_name=profile_name).client(
    service_name="sagemaker-featurestore-runtime", config=client_config
)
It does not use the session that was passed in. Instead it creates a new session, which prompts for an MFA code each time.
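The precedence I would expect can be sketched as follows. This is a minimal illustration, not the SDK's actual code: `resolve_runtime_client` and `make_client` are hypothetical names, and the factory stands in for whatever builds a fresh boto3 session and client.

```python
from typing import Any, Callable, Optional


def resolve_runtime_client(
    existing_client: Optional[Any],
    make_client: Callable[[], Any],
) -> Any:
    """Return the caller's client if one was supplied, else build a new one.

    `make_client` is a hypothetical zero-argument factory, for example
    lambda: boto3.Session(profile_name=...).client("sagemaker-featurestore-runtime").
    It is only invoked when no client was passed in, so an already
    authenticated client never triggers a second credential lookup.
    """
    if existing_client is not None:
        return existing_client
    return make_client()
```

With this ordering, a new session (and therefore a new MFA prompt) would only be created when the caller did not supply a client.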

To reproduce

  1. Ensure "default" creds profile does not have access to write to FeatureGroup
  2. Ensure "<profile_name>" creds profile does have access to write to FeatureGroup. Set the policy to require multi-factor authentication.
  3. Create a boto session using the second profile, and initialise all the relevant clients:
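import boto3
from sagemaker.session import Session
from sagemaker.feature_store.feature_group import FeatureGroup

# Session built from the MFA-protected profile; this prompts for the MFA code once.
sandbox_session = boto3.session.Session(profile_name='<profile_name>')
sagemaker_client = sandbox_session.client('sagemaker')
sm_runtime_client = sandbox_session.client('sagemaker-runtime')
sm_featurestore_client = sandbox_session.client('sagemaker-featurestore-runtime')
# Pass every client explicitly so the SageMaker session never falls back to default credentials.
sagemaker_session = Session(
    sagemaker_client=sagemaker_client,
    boto_session=sandbox_session,
    sagemaker_featurestore_runtime_client=sm_featurestore_client,
    sagemaker_runtime_client=sm_runtime_client,
)
feature_group = FeatureGroup(
    name=feature_group_name, sagemaker_session=sagemaker_session
)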
  4. Attempt to ingest a dataframe into the FeatureGroup with .ingest(), passing profile_name (this should not even be required if the passed-in session were used, but it demonstrates that passing the profile is not a viable workaround):
feature_group.ingest(
    data_frame=cleaned_and_transformed_df, max_workers=3, wait=True, profile_name='<profile_name>'
)
  5. The user is prompted for an MFA code once when creating the first session, and then multiple times in succession (in parallel) during the ingestion.
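Until .ingest() reuses the passed-in client, one possible workaround is to call PutRecord directly on the client that is already authenticated. The sketch below is an assumption about how that could look, not something the SDK provides: `to_record` and `ingest_rows` are hypothetical helpers, and the conversion of values with `str()` is a simplification that may not suit every feature type (for example, pandas NaN values are not `None` and would need separate handling).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, Iterable, List


def to_record(row: Dict[str, Any]) -> List[Dict[str, str]]:
    """Convert one row into the Record shape PutRecord expects, skipping None values."""
    return [
        {"FeatureName": name, "ValueAsString": str(value)}
        for name, value in row.items()
        if value is not None
    ]


def ingest_rows(
    client: Any,
    feature_group_name: str,
    rows: Iterable[Dict[str, Any]],
    max_workers: int = 3,
) -> int:
    """Write each row with the given client and return the number of rows sent."""

    def put(row: Dict[str, Any]) -> None:
        # Reuses the one client, so no new session (or MFA prompt) is created.
        client.put_record(FeatureGroupName=feature_group_name, Record=to_record(row))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator so any exception from put_record is re-raised here.
        return len(list(pool.map(put, rows)))
```

With the objects from step 3, this could be called as `ingest_rows(sm_featurestore_client, feature_group_name, cleaned_and_transformed_df.to_dict("records"))`. Sharing one client across threads relies on boto3's low-level clients being generally thread-safe, as its documentation describes.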

Expected behavior
I expect the initial session to be used for ingestion. Otherwise this API method is unusable when MFA is required for the assumed profile.
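If worker threads or processes genuinely need their own session, one option that avoids re-prompting is to snapshot the credentials the initial session has already resolved and build new sessions from that snapshot. This is only a sketch of the idea: `frozen_session_kwargs` is a hypothetical helper, and it assumes the object passed in behaves like a `boto3.Session` (it uses `get_credentials()`, `get_frozen_credentials()`, and `region_name`).

```python
from typing import Any, Dict


def frozen_session_kwargs(session: Any) -> Dict[str, Any]:
    """Snapshot a session's resolved credentials as boto3.Session keyword arguments."""
    # get_frozen_credentials() returns a static copy of the access key,
    # secret key, and session token currently held by the session.
    frozen = session.get_credentials().get_frozen_credentials()
    return {
        "aws_access_key_id": frozen.access_key,
        "aws_secret_access_key": frozen.secret_key,
        "aws_session_token": frozen.token,
        "region_name": session.region_name,
    }
```

A worker could then call `boto3.Session(**frozen_session_kwargs(sandbox_session))` instead of `boto3.Session(profile_name=profile_name)`. The snapshot does not refresh, so this only works while the temporary credentials from the assumed role remain valid.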

System information

  • SageMaker Python SDK version: 2.99.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): feature store
  • Python version: Python 3.8.10
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): Y
