
Deploying a PyTorch Model using KServe

Step 1: Convert to MAR format

Assuming you have a trained model, first convert it to TorchScript format:

python 01_model_to_tscript.py

This will generate a traced_model.pt file.
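
For reference, the conversion script looks roughly like the following (a minimal sketch assuming a torchvision ResNet34 fine-tuned on the 6 Intel-image classes; the actual architecture and checkpoint path in 01_model_to_tscript.py may differ):

import torch
import torchvision

# Assumed architecture and checkpoint -- adjust to your trained model
model = torchvision.models.resnet34(num_classes=6)
model.load_state_dict(torch.load("model.pth", map_location="cpu"))
model.eval()

# Trace the model with a dummy input of the expected shape (N, C, H, W)
example = torch.randn(1, 3, 224, 224)
traced = torch.jit.trace(model, example)
traced.save("traced_model.pt")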

Then, convert it to the MAR (model archive) format:

# pip install torch-model-archiver
# git clone https://github.com/pytorch/serve
# cp cifar34_handler.py serve/ts/torch_handler/
torch-model-archiver --model-name cifar34 --version 1.0 --serialized-file ./traced_model.pt --handler ./serve/ts/torch_handler/cifar34_handler.py --extra-files ./index_to_name.json

This will generate a cifar34.mar file.
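
A custom handler for this kind of image classifier typically subclasses TorchServe's built-in ImageClassifier, overriding the preprocessing and relying on index_to_name.json (passed via --extra-files) for class names. A minimal sketch (the class name and transform values are assumptions, not necessarily the repository's actual cifar34_handler.py):

from torchvision import transforms
from ts.torch_handler.image_classifier import ImageClassifier

class Cifar34Handler(ImageClassifier):
    # Return the top-5 predictions, as verified in the local test below
    topk = 5

    # Per-request preprocessing; values assumed to match training
    image_processing = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])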

We will then upload the model file and its configuration to S3 so they can be pulled when the InferenceService is created. The bucket should have the following layout:

├── config
│   └── config.properties
└── model-store
    └── cifar34.mar
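
If you prefer to do the upload from Python, here is a minimal boto3 sketch mirroring the layout above (the bucket name is a placeholder):

import boto3

s3 = boto3.client("s3")
bucket = "my-model-bucket"  # placeholder -- use your own bucket name

# Upload the TorchServe config and the model archive with the same key layout
s3.upload_file("config/config.properties", bucket, "config/config.properties")
s3.upload_file("model-store/cifar34.mar", bucket, "model-store/cifar34.mar")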

Replace the S3 bucket name in intel-service.yaml with your own bucket.

Step 1a: Test the MAR file/TorchServe locally

Copy the cifar34.mar file into the docker folder, then build and run the image:

cp cifar34.mar docker/
docker build -t torchserve:01 ./docker
docker run -it --rm --net=host torchserve:01

# see available models
curl http://localhost:8081/models

# download a sample image
wget https://raw.githubusercontent.com/kshitijzutshi/INFO6105-CNN-Assignment/main/Intel-image-classification-dataset/seg_train/seg_train/mountain/153.jpg

curl http://127.0.0.1:8080/predictions/cifar34 -T 153.jpg

This returns the top 5 predictions, confirming that the MAR file was not only generated successfully but is also functional.
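
The same check can be done from Python, if you prefer (a sketch; the raw image bytes are sent as the request body):

import requests

with open("153.jpg", "rb") as f:
    resp = requests.post("http://127.0.0.1:8080/predictions/cifar34", data=f.read())
print(resp.json())  # top-5 classes with their scores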

Note: TorchServe uses different ports in the standalone (non-KServe) and KServe environments.

Step 2: Install kubectl, kind, and KServe

# kubectl
curl -LO https://dl.k8s.io/release/v1.25.1/bin/linux/amd64/kubectl
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

# kind
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.17.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind

# create cluster
kind create cluster --image kindest/node:v1.24.1 --name kind
kind get clusters
kubectl cluster-info --context kind-kind
kubectl config use-context kind-kind


# verify
kubectl get service
kubectl get pod
kubectl get deployment

# kserve
curl -s "https://raw.githubusercontent.com/kserve/kserve/release-0.9/hack/quick_install.sh" | bash


kubectl get namespaces
kubectl get pods -n kserve

Step 3: Create the InferenceService

kubectl apply -f intel-service.yaml
kubectl get pods
kubectl get isvc # once all are running

Step 4: Port-Forwarding

If you have a pre-processor (a.k.a. transformer), port-forward to the Istio ingress gateway and not to the predictor pod, since the data first needs to pass through the transformer:

kubectl port-forward -n istio-system svc/istio-ingressgateway 8080:80

Port-forwarding directly to the predictor pod would bypass the transformer and give incorrect results.

Otherwise, if you don't have a transformer, you can port-forward either to the Istio ingress gateway or directly to the predictor pod:

kubectl port-forward -n istio-system svc/istio-ingressgateway 8080:80
# OR
kubectl port-forward intel-predictor-default-00001-deployment-5bf7f9b9f6-mlgcv 8080:8080 # replace with your predictor pod name

Step 5: Evaluation

You can now fetch predictions using a Python script:

python test.py

or using a curl command:

curl http://localhost:8080/v1/models/cifar34:predict -d @./input.json

The input.json file can be generated via:

python3 img2bytearray.py 153.jpg
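
img2bytearray.py presumably base64-encodes the image and wraps it in a KServe v1 inference request; a minimal sketch of that idea (the exact JSON keys expected by the deployed handler are an assumption):

import base64
import json
import sys

image_path = sys.argv[1]  # e.g. 153.jpg
with open(image_path, "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

# KServe v1 protocol: a list of instances, each carrying the encoded image
payload = {"instances": [{"data": encoded}]}

with open("input.json", "w") as f:
    json.dump(payload, f)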
