Skip to content

Kubernetes Deployment Installation Guide

KServe supports RawDeployment mode to enable InferenceService deployment with Kubernetes resources Deployment, Service, Ingress and Horizontal Pod Autoscaler. Comparing to serverless deployment it unlocks Knative limitations such as mounting multiple volumes, on the other hand Scale down and from Zero is not supported in RawDeployment mode.

Kubernetes 1.28 is the minimally required version and please check the following recommended Istio versions for the corresponding Kubernetes version.

Kubernetes Version Recommended Istio Version
1.28 1.22
1.29 1.22, 1.23
1.30 1.22, 1.23

1. Install Ingress Controller

In this guide we choose to install Istio as ingress controller. The minimally required Istio version is 1.22 and you can refer to the Istio install guide.

Once Istio is installed, create IngressClass resource for istio.

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: istio
spec:
  controller: istio.io/ingress-controller

Note

Istio ingress is recommended, but you can choose to install with other Ingress controllers and create IngressClass resource for your Ingress option.

2. Install Cert Manager

The minimally required Cert Manager version is 1.15.0 and you can refer to Cert Manager installation guide.

Note

Cert manager is required to provision webhook certs for production grade installation, alternatively you can run self signed certs generation script.

3. Install KServe

Note

The default KServe deployment mode is Serverless which depends on Knative. The following step changes the default deployment mode to RawDeployment before installing KServe.

I. Install KServe CRDs

helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v0.14.1

II. Install KServe Resources

Set the kserve.controller.deploymentMode to RawDeployment and kserve.controller.gateway.ingressGateway.className to point to the IngressClass name created in step 1.

helm install kserve oci://ghcr.io/kserve/charts/kserve --version v0.14.1 \
 --set kserve.controller.deploymentMode=RawDeployment \
 --set kserve.controller.gateway.ingressGateway.className=your-ingress-class

I. Install KServe: --server-side option is required as the InferenceService CRD is large, see this issue for details.

kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v0.14.1/kserve.yaml

II. Install KServe default serving runtimes:

kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v0.14.1/kserve-cluster-resources.yaml

III. Change default deployment mode and ingress option

First in the ConfigMap inferenceservice-config modify the defaultDeploymentMode from the deploy section to RawDeployment,

kubectl patch configmap/inferenceservice-config -n kserve --type=strategic -p '{"data": {"deploy": "{\"defaultDeploymentMode\": \"RawDeployment\"}"}}'

then modify the ingressClassName from ingress section to the IngressClass name created in step 1.

ingress: |-
{
    "ingressClassName" : "your-ingress-class",
}

Back to top