Deploy PMML model with InferenceService¶
PMML, or predictive model markup language, is an XML format for describing data mining and statistical
              models, including inputs to the models,
              transformations used to prepare data for data mining, and the parameters that define the models
              themselves. In this example we show how you can
              serve the PMML format model on InferenceService.
Create the InferenceService¶
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "pmml-demo"
spec:
  predictor:
    pmml:
      storageUri: gs://kfserving-examples/models/pmml
                  apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "pmml-demo"
spec:
  predictor:
    model:
      modelFormat:
        name: pmml
      storageUri: "gs://kfserving-examples/models/pmml"
                  Create the InferenceService with above yaml
kubectl apply -f pmml.yaml
            Expected Output
$ inferenceservice.serving.kserve.io/pmml-demo created
              Warning
The pmmlserver is based on Py4J and that
                doesn't support multi-process mode, so we can't set spec.predictor.containerConcurrency.
                If you want to scale the PMMLServer to improve prediction performance, you should set the
                InferenceService's resources.limits.cpu to 1 and scale the replica size.
Run a prediction¶
The first step is to determine the ingress IP and
                ports and set INGRESS_HOST and INGRESS_PORT
MODEL_NAME=pmml-demo
INPUT_PATH=@./pmml-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice pmml-demo -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
            Expected Output
* TCP_NODELAY set
* Connected to localhost (::1) port 8081 (#0)
> POST /v1/models/pmml-demo:predict HTTP/1.1
> Host: pmml-demo.default.example.com
> User-Agent: curl/7.64.1
> Accept: */*
> Content-Length: 45
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 45 out of 45 bytes
< HTTP/1.1 200 OK
< content-length: 39
< content-type: application/json; charset=UTF-8
< date: Sun, 18 Oct 2020 15:50:02 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 12
<
* Connection #0 to host localhost left intact
{"predictions": [{'Species': 'setosa', 'Probability_setosa': 1.0, 'Probability_versicolor': 0.0, 'Probability_virginica': 0.0, 'Node_Id': '2'}]}* Closing connection 0