Describe the bug
When a Batch Transform job pulls the model.tar.gz archive and extracts it, the container fails to load the model properly: it attempts to load unrelated files, for example requirements.txt, as the model.
The model was created with XGBoostEstimator, stored in MLflow, and then deployed to SageMaker via the mlflow.sagemaker.deploy_transform_job API.
To reproduce
Package your MLflow artifacts as model.tar.gz, upload the archive to S3, and then start a Batch Transform job pointing at that model.
The job logs errors that a file could not be loaded as a model, but the files in question are unrelated to the model and should never be attempted:
[2023-01-17:10:40:22:INFO] Loading the model from /opt/ml/model/requirements.txt
[2023-01-17 10:40:22 +0000] [37] [ERROR] Exception in worker process
Traceback (most recent call last):
  File "/miniconda3/lib/python3.8/site-packages/sagemaker_xgboost_container/algorithm_mode/serve_utils.py", line 175, in get_loaded_booster
    booster = pkl.load(open(full_model_path, "rb"))
_pickle.UnpicklingError: invalid load key, 'm'.
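The UnpicklingError is exactly what CPython's pickle raises when handed a plain text file: the first byte is interpreted as a pickle opcode, and 'm' is not one (plausibly the 'm' of an mlflow line in requirements.txt, though that specific content is a guess). A standalone illustration:

```python
import io
import pickle

# Feeding pickle the bytes of a text file reproduces the container's error:
# the first byte, b'm', is read as a pickle opcode, which does not exist.
try:
    pickle.load(io.BytesIO(b"mlflow==2.1.0\n"))  # stand-in for requirements.txt
except pickle.UnpicklingError as exc:
    print(exc)  # invalid load key, 'm'.
```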
Expected behavior
The archive is extracted and only the model binary is loaded into XGBoost.
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.126.0
- Framework name (e.g. PyTorch) or algorithm (e.g. KMeans): XGBoost
- Framework version: 1.5-1
- Python version: 3.8
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Additional context
N/A