Generate model archiver files for torchserve¶
Setup¶
- Your ~/.kube/config should point to a cluster with KServe installed.
- Your cluster's Istio Ingress gateway must be network accessible.
1. Create PV and PVC¶
Create a Persistent volume and volume claim. This document uses amazonEBS PV. For AWS EFS storage you can refer to AWS EFS storage
1.1 Create PV¶
Edit volume id in pv.yaml file
kubectl apply -f pv.yaml
Expected Output
$ persistentvolume/model-pv-volume created
1.2 Create PVC¶
kubectl apply -f pvc.yaml
Expected Output
$ persistentvolumeclaim/model-pv-claim created
2 Create model store files layout and copy to PV¶
We create a pod with the PV attached to copy the model files and config.properties for generating model archive file.
2.1 Create pod for copying model store files to PV¶
kubectl apply -f pvpod.yaml
Expected Output
$ pod/model-store-pod created
2.2 Create model store file layout on PV¶
2.2.1 Create properties.json file¶
This file has model-name, version, model-file name, serialized-file name, extra-files, handlers, workers etc. of the models.
[
{
"model-name": "mnist",
"version": "1.0",
"model-file": "",
"serialized-file": "mnist_cnn.pt",
"extra-files": "",
"handler": "mnist_handler.py",
"min-workers" : 1,
"max-workers": 3,
"batch-size": 1,
"max-batch-delay": 100,
"response-timeout": 120,
"requirements": ""
},
{
"model-name": "densenet_161",
"version": "1.0",
"model-file": "",
"serialized-file": "densenet161-8d451a50.pth",
"extra-files": "index_to_name.json",
"handler": "image_classifier",
"min-workers" : 1,
"max-workers": 3,
"batch-size": 1,
"max-batch-delay": 100,
"response-timeout": 120,
"requirements": ""
}
]
2.2.2 Copy model and its dependent Files¶
Copy all the model and dependent files to the PV in the structure given below. An empty config folder, a model-store folder containing model name as folder name. Within that model folder, the files required to build the marfile.
├── config
├── model-store
│ ├── densenet_161
│ │ ├── densenet161-8d451a50.pth
│ │ ├── index_to_name.json
│ │ └── model.py
│ ├── mnist
│ │ ├── mnist_cnn.pt
│ │ ├── mnist_handler.py
│ │ └── mnist.py
│ └── properties.json
2.2.3 Create folders for model-store and config in PV¶
kubectl exec -it model-store-pod -c model-store -n kserve-test -- mkdir /pv/model-store/
kubectl exec -it model-store-pod -c model-store -n kserve-test -- mkdir /pv/config/
2.3 Copy model files and config.properties to the PV¶
kubectl cp model-store/* model-store-pod:/pv/model-store/ -c model-store -n kserve-test
kubectl cp config.properties model-store-pod:/pv/config/ -c model-store -n kserve-test
2.4 Delete pv pod¶
Since amazon EBS provide only ReadWriteOnce mode, we have to unbind the PV for use of model archiver.
kubectl delete pod model-store-pod -n kserve-test
3 Generate model archive file and server configuration file¶
3.1 Create model archive pod and run model archive file generation script¶
kubectl apply -f model-archiver.yaml -n kserve-test
3.2 Check the output and delete model archive pod¶
Verify mar files and config.properties
kubectl exec -it margen-pod -n kserve-test -- ls -lR /home/model-server/model-store
kubectl exec -it margen-pod -n kserve-test -- cat /home/model-server/config/config.properties
3.3 Delete model archiver¶
kubectl delete -f model-archiver.yaml -n kserve-test