Return Code: 1, CMD #5

@zhaoanbei

Hi,
Sorry to interrupt. I tried to run this repo on SageMaker with the code below:

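from sagemaker.estimator import Estimator

# role, instance_type and ecr_image are set elsewhere (not shown)
estimator = Estimator(role=role,
                      train_instance_count=1,
                      train_instance_type=instance_type,
                      image_name=ecr_image)

estimator.fit({'training': 's3_path', 'checkpoint': 's3_path/faster_rcnn_inception_v2_coco_2018_01_28/'})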

It fails with this error:
Exception during training: Return Code: 1, CMD: ['/usr/local/bin/python', '/opt/ml/code/tensorflow-models/research/object_detection/model_main.py', '--model_dir', '/opt/ml/model', '--pipeline_config_path', '/opt/ml/input/data/training/pipeline.config', '--num_train_steps', '100']
Traceback (most recent call last):
  File "/opt/ml/code/train", line 83, in <module>
    commandline_util.run_python_script(training_script, default_params)
  File "/opt/ml/code/utils/commandline_util.py", line 34, in run_python_script
    run(script_cmd)
  File "/opt/ml/code/utils/commandline_util.py", line 27, in run
    raise Exception(error_msg)
Exception: Return Code: 1, CMD: ['/usr/local/bin/python', '/opt/ml/code/tensorflow-models/research/object_detection/model_main.py', '--model_dir', '/opt/ml/model', '--pipeline_config_path', '/opt/ml/input/data/training/pipeline.config', '--num_train_steps', '100']
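
The exception above carries only the return code and the command, not whatever model_main.py itself printed before exiting. As a rough sketch of how that output could be surfaced, here is a small wrapper that captures the child process's stdout and stderr together. The helper name run_and_capture and the demo command are hypothetical and are not taken from this repo's commandline_util.py, whose actual implementation is not shown here.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd and return (return_code, combined stdout and stderr text).

    Hypothetical helper for illustration; not the repo's own run().
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so nothing is lost
        universal_newlines=True,   # decode bytes to str
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Stand-in for the failing model_main.py call: a child process that
    # writes a message to stderr and exits with status 1.
    demo_cmd = [sys.executable, "-c",
                "import sys; sys.stderr.write('boom\\n'); sys.exit(1)"]
    code, output = run_and_capture(demo_cmd)
    if code != 0:
        # Report the child's own output alongside the return code.
        print("Return Code: {}, CMD: {}\n{}".format(code, demo_cmd, output))
```

With something like this, the text printed by the failing script would appear in the training log next to the return code, which should make the real cause easier to find.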

Are there any suggestions?
