Configurations

KServe provides a comprehensive ConfigMap (inferenceservice-config) that allows administrators to configure global defaults for various components and behaviors. These configurations are applied at the cluster level and affect all InferenceServices unless overridden by per-service annotations.

The ConfigMap is located in the kserve namespace and contains configuration sections for different components including ingress, storage, deployment modes, resource limits, and more.

Many of these global configurations can be overridden for an individual InferenceService using specific annotations or labels.

Ingress Configuration

Controls network ingress settings for serving external traffic to InferenceServices.

Global (ConfigMap)

apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  ingress: |-
    {
        "enableGatewayApi": false,
        "kserveIngressGateway": "kserve/kserve-ingress-gateway",
        "ingressGateway" : "knative-serving/knative-ingress-gateway",
        "localGateway" : "knative-serving/knative-local-gateway",
        "localGatewayService" : "knative-local-gateway.istio-system.svc.cluster.local",
        "ingressDomain"  : "example.com",
        "additionalIngressDomains": ["additional-example.com"],
        "ingressClassName" : "istio",
        "domainTemplate": "{{ .Name }}-{{ .Namespace }}.{{ .IngressDomain }}",
        "urlScheme": "http",
        "disableIstioVirtualHost": false,
        "disableIngressCreation": false,
        "pathTemplate": "/serving/{{ .Namespace }}/{{ .Name }}"
    }

Enable Gateway API

Specifies whether to use Gateway API instead of Ingress to serve external traffic. Only applies in raw deployment mode.

Global key: enableGatewayApi
Per-service annotation key: Not supported
Possible values: true, false
Default: false

KServe Ingress Gateway

The gateway resource used for external traffic in raw deployment mode.

Global key: kserveIngressGateway
Per-service annotation key: Not supported
Possible values: Gateway in format <namespace>/<name>
Default: "kserve/kserve-ingress-gateway"
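
As a rough illustration of the two settings above, the following fragment switches raw deployment mode to Gateway API and keeps the default gateway reference. It is a sketch that shows only the two relevant keys and assumes the remaining keys of your existing ingress block are kept alongside them.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  # Only the keys discussed above are shown; keep the other ingress keys you already use.
  ingress: |-
    {
        "enableGatewayApi": true,
        "kserveIngressGateway": "kserve/kserve-ingress-gateway"
    }
```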

Ingress Gateway

The gateway used for external traffic in serverless deployment with Istio.

Global key: ingressGateway
Per-service annotation key: Not supported
Possible values: Gateway in format <namespace>/<name>
Default: "knative-serving/knative-ingress-gateway"

Knative Local Gateway

Specifies the hostname of Knative's local gateway service. The default KServe configuration reuses the Istio local gateways provided for Knative, in which case the knativeLocalGatewayService field can be left unset. When unset, the value of localGatewayService is used.

However, it is sometimes preferable to have local gateways dedicated to KServe (for example, when enabling strict mTLS in Istio). In such setups, localGateway and localGatewayService should point to the KServe local gateways, and knativeLocalGatewayService should point to Knative's local gateway service.

This configuration is only applicable for serverless deployment with Istio configured as the network layer.

Global key: knativeLocalGatewayService
Per-service annotation key: Not supported
Possible values: Istio Gateway service cluster local hostname
Default: ""
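
The dedicated-gateway setup described above might look roughly like the following. The KServe gateway name and its service hostname are hypothetical placeholders, not names that KServe creates for you; only the Knative hostname is taken from the default shown earlier.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  # "kserve-local-gateway" and its service hostname are hypothetical names for a
  # gateway dedicated to KServe. knativeLocalGatewayService keeps pointing at
  # Knative's own local gateway service.
  ingress: |-
    {
        "localGateway": "istio-system/kserve-local-gateway",
        "localGatewayService": "kserve-local-gateway.istio-system.svc.cluster.local",
        "knativeLocalGatewayService": "knative-local-gateway.istio-system.svc.cluster.local"
    }
```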

Local Gateway

Specifies the gateway that handles network traffic within the cluster. This configuration is only applicable for serverless deployment with Istio configured as the network layer.

Global key: localGateway
Per-service annotation key: Not supported
Possible values: Istio Gateway in format <namespace>/<name>
Default: "knative-serving/knative-local-gateway"

Local Gateway Service

The hostname of the local gateway service for cluster-internal traffic. This configuration is only applicable for serverless deployment with Istio configured as the network layer.

Global key: localGatewayService
Per-service annotation key: Not supported
Possible values: Istio Gateway service cluster local hostname
Default: "knative-local-gateway.istio-system.svc.cluster.local"

Ingress Domain

The domain name used for creating URLs in raw deployment mode. If ingressDomain is empty, "example.com" is used as the default domain.

Global key: ingressDomain
Per-service annotation key: Not supported
Possible values: Valid domain name
Default: "example.com"

Additional Ingress Domains

Additional domain names for creating URLs.

Global key: additionalIngressDomains
Per-service annotation key: Not supported
Possible values: Array of domain names
Default: []

Ingress Class Name

The ingress controller to use for ingress traffic when Gateway API is disabled. This configuration is only applicable in raw deployment mode.

Global key: ingressClassName
Per-service annotation key: Not supported
Possible values: Valid ingress class name
Default: "istio"

Domain Template

Template for generating the domain/URL of each InferenceService by combining the following variables:

Name of the inference service {{ .Name }}
Namespace of the inference service {{ .Namespace }}
Annotation of the inference service {{ .Annotations.key }}
Label of the inference service {{ .Labels.key }}
IngressDomain {{ .IngressDomain }}

This configuration is only applicable for raw deployment.

Global key: domainTemplate
Per-service annotation key: Not supported
Possible values: Go template string with variables: {{ .Name }}, {{ .Namespace }}, {{ .IngressDomain }}, {{ .Annotations.key }}, {{ .Labels.key }}
Default: "{{ .Name }}-{{ .Namespace }}.{{ .IngressDomain }}"
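
To make the substitution concrete, here is a small sketch using a hypothetical InferenceService named sklearn-iris in the default namespace. The comments show what each template would expand to under that assumption.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  # Assuming an InferenceService "sklearn-iris" in namespace "default" (hypothetical):
  #   "{{ .Name }}-{{ .Namespace }}.{{ .IngressDomain }}"  -> sklearn-iris-default.example.com
  #   "{{ .Name }}.{{ .Namespace }}.{{ .IngressDomain }}"  -> sklearn-iris.default.example.com
  ingress: |-
    {
        "ingressDomain": "example.com",
        "domainTemplate": "{{ .Name }}-{{ .Namespace }}.{{ .IngressDomain }}"
    }
```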

URL Scheme

The URL scheme to use for InferenceService and InferenceGraph. This configuration is only applicable for raw deployment.

Global key: urlScheme
Per-service annotation key: Not supported
Possible values: "http", "https"
Default: "http"

Disable Istio Virtual Host

Controls whether Istio is used as the network layer for serverless deployment. By default, Istio is used. When disableIstioVirtualHost is true, KServe does not create the top-level virtual service, so Istio is no longer required for serverless mode and users can choose other networking layers supported by Knative. For more information, see https://github.com/kserve/kserve/pull/2380 and https://kserve.github.io/website/master/admin/serverless/kourier_networking/.

Global key: disableIstioVirtualHost
Per-service annotation key: Not supported
Possible values: true, false
Default: false

Disable Ingress Creation

Controls whether to disable ingress creation for raw deployment mode.

Global key: disableIngressCreation
Possible values: true, false
Default: false
Per-service label key: "networking.kserve.io/visibility"
Possible label values: "cluster-local"
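
As a sketch of the per-service label listed above, the manifest below applies it to a single InferenceService while the global setting stays at its default. The service name, model format, and storage location are hypothetical.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                 # hypothetical name
  namespace: default
  labels:
    # Per-service label from the list above
    networking.kserve.io/visibility: cluster-local
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://example-bucket/models/sklearn/iris   # hypothetical location
```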

Path Template

Template for generating path-based URLs for serverless deployment. The following variables can be used in the template for generating url.

Name of the inference service {{ .Name }}
Namespace of the inference service {{ .Namespace }}

An empty string means that no path-based URL is generated. For more information, see https://github.com/kserve/kserve/issues/2257.

Global key: pathTemplate
Per-service annotation key: Not supported
Possible values: Go template string with variables: {{ .Name }}, {{ .Namespace }}
Default: ""
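
For illustration, the template used in the ConfigMap example at the top of this section would expand as shown in the comment below, assuming a hypothetical InferenceService named sklearn-iris in the default namespace.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  # Assuming an InferenceService "sklearn-iris" in namespace "default" (hypothetical),
  # the template below yields the path: /serving/default/sklearn-iris
  ingress: |-
    {
        "pathTemplate": "/serving/{{ .Namespace }}/{{ .Name }}"
    }
```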

Storage Initializer Configuration

Configures the storage initializer component responsible for downloading models from various storage backends.

Global (ConfigMap)

apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  storageInitializer: |-
    {
        "image" : "kserve/storage-initializer:latest",
        "memoryRequest": "100Mi",
        "memoryLimit": "1Gi",
        "cpuRequest": "100m",
        "cpuLimit": "1",
        "caBundleConfigMapName": "",
        "caBundleVolumeMountPath": "/etc/ssl/custom-certs",
        "enableModelcar": true,
        "cpuModelcar": "10m",
        "memoryModelcar": "15Mi",
        "uidModelcar": 1010
    }

Storage Initializer Image

The container image used for the storage initializer init container. This will be overridden by the image field in the ClusterStorageContainer if the resource is available.

Global key: image
Per-service annotation key: Not supported
Possible values: Valid container image URI
Default: "kserve/storage-initializer:v<KServe version>.0" (the tag tracks the KServe release being documented)

Memory Request

The memory request for the storage initializer init container. This will be overridden by the ClusterStorageContainer if the resource is available.

Global key: memoryRequest
Per-service annotation key: Not supported
Possible values: Kubernetes resource quantity
Default: "100Mi"

Memory Limit

The memory limit for the storage initializer init container. This will be overridden by the ClusterStorageContainer if the resource is available.

Global key: memoryLimit
Per-service annotation key: Not supported
Possible values: Kubernetes resource quantity
Default: "1Gi"

CPU Request

The CPU request for the storage initializer init container. This will be overridden by the ClusterStorageContainer if the resource is available.

Global key: cpuRequest
Per-service annotation key: Not supported
Possible values: Kubernetes resource quantity
Default: "100m"

CPU Limit

The CPU limit for the storage initializer init container. This will be overridden by the ClusterStorageContainer if the resource is available.

Global key: cpuLimit
Per-service annotation key: Not supported
Possible values: Kubernetes resource quantity
Default: "1"
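
The four resource settings above can be adjusted together, for example when large models make the init container run out of memory during download. The quantities below are illustrative values only, not recommendations, and the fragment assumes the other storageInitializer keys are kept as they are.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  # Illustrative quantities; size them for your own models and nodes.
  storageInitializer: |-
    {
        "memoryRequest": "1Gi",
        "memoryLimit": "4Gi",
        "cpuRequest": "500m",
        "cpuLimit": "2"
    }
```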

CA Bundle ConfigMap Name

The ConfigMap containing CA certificates to be copied to the storage initializer.

Global key: caBundleConfigMapName
Per-service annotation key: Not supported
Possible values: ConfigMap name or empty string
Default: ""

CA Bundle Volume Mount Path

The mount point for the CA bundle ConfigMap in the storage initializer container.

Global key: caBundleVolumeMountPath
Per-service annotation key: Not supported
Possible values: Valid filesystem path
Default: "/etc/ssl/custom-certs"
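
A minimal sketch of how the two CA bundle settings fit together is shown below. The ConfigMap name cabundle is a hypothetical placeholder; the ConfigMap holding the certificates has to be created separately.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  # "cabundle" is a hypothetical ConfigMap name that holds the CA certificates.
  # The mount path is the default listed above.
  storageInitializer: |-
    {
        "caBundleConfigMapName": "cabundle",
        "caBundleVolumeMountPath": "/etc/ssl/custom-certs"
    }
```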

Mount PVC Volume with Read-Write

Controls whether the PVC volume is mounted read-only. By default (true), the volume is mounted read-only. If set to false, the PVC volume is mounted read-write, allowing write operations on the model storage. For more information, see the PVC model storage documentation.

Global key: Not supported
Per-service annotation key: storage.kserve.io/readonly
Possible values: true, false
Default: true
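
As an example of this per-service annotation, the sketch below mounts a PVC read-write for one InferenceService. The service name, PVC name, and model path are hypothetical.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-pvc                  # hypothetical name
  annotations:
    # Annotation values are strings, so the boolean is quoted.
    storage.kserve.io/readonly: "false"
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: pvc://model-store-claim/models/sklearn/iris   # hypothetical PVC and path
```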

Enable Modelcar

Controls whether to enable support for downloading models from OCI registries using the oci:// URI scheme. When enabled, you can specify model storage paths as OCI image references. For more information, see the OCI model storage documentation.

Global key: enableModelcar
Per-service annotation key: Not supported
Possible values: true, false
Default: true
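
With enableModelcar set to true, a model storage path given as an OCI image reference might look like the sketch below. The registry, repository, and tag are hypothetical.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-oci                  # hypothetical name
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      # Hypothetical OCI image reference that contains the model files
      storageUri: oci://registry.example.com/models/sklearn-iris:v1
```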

CPU Modelcar

The CPU request and limit for the passive modelcar container.

Global key: cpuModelcar
Per-service annotation key: Not supported
Possible values: Kubernetes resource quantity
Default: "10m"

Memory Modelcar

The memory request and limit for the passive modelcar container.

Global key: memoryModelcar
Per-service annotation key: Not supported
Possible values: Kubernetes resource quantity
Default: "15Mi"

UID Modelcar

The user ID under which the modelcar process and main container run.

Global key: uidModelcar
Per-service annotation key: Not supported
Possible values: Valid UID number
Default: 1010

Storage Initializer UID

The user ID under which the storage initializer init container runs. This is useful when Istio CNI with DNS proxy is enabled. For more details, see https://istio.io/latest/docs/setup/additional-setup/cni/#compatibility-with-application-init-containers.

Global key: Not supported
Per-service annotation key: serving.kserve.io/storage-initializer-uid
Possible values: Valid UID number
Default: 1000
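
The following sketch sets the annotation on a single InferenceService. The value 1337 is chosen on the assumption that the linked Istio page still recommends the sidecar proxy's UID for init containers; check that page for your Istio version. The service name and storage location are hypothetical.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                 # hypothetical name
  annotations:
    # 1337 is assumed from the linked Istio CNI guidance; verify for your Istio version.
    serving.kserve.io/storage-initializer-uid: "1337"
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://example-bucket/models/sklearn/iris   # hypothetical location
```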

Credentials Configuration

Configures authentication and storage credentials for downloading models from cloud storage.

Global (ConfigMap)

apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  credentials: |-
    {
       "storageSpecSecretName": "storage-config",
       "storageSecretNameAnnotation": "serving.kserve.io/storageSecretName",
       "gcs": {
           "gcsCredentialFileName": "gcloud-application-credentials.json"
       },
       "s3": {
           "s3AccessKeyIDName": "AWS_ACCESS_KEY_ID",
           "s3SecretAccessKeyName": "AWS_SECRET_ACCESS_KEY",
           "s3Endpoint": "",
           "s3UseHttps": "",
           "s3Region": "",
           "s3VerifySSL": "",
           "s3UseVirtualBucket": "",
           "s3UseAccelerate": "",
           "s3UseAnonymousCredential": "",
           "s3CABundle": ""
       }
    }

Storage Spec Secret Name

The default secret name containing credentials for downloading models when using storageSpec in InferenceService.

Global key: storageSpecSecretName
Per-service annotation key: serving.kserve.io/storageSecretName
Possible values: Valid Kubernetes secret name
Default: "storage-config"

Storage Secret Name Annotation

The annotation key used to specify a custom secret name for storage credentials.

When using storageUri, the order of precedence is: secret name reference annotation > secret name references from the service account.

When using storageSpec, the order of precedence is: secret name reference annotation > storageSpecSecretName in the ConfigMap.

Global key: storageSecretNameAnnotation
Per-service annotation key: N/A (this defines the annotation key itself)
Possible values: Valid annotation key
Default: "serving.kserve.io/storageSecretName"
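
As a sketch of the first entry in the precedence order above, the manifest below references a secret directly through the default annotation key. The secret name, service name, and bucket are hypothetical.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-s3                   # hypothetical name
  annotations:
    # Default annotation key from storageSecretNameAnnotation; the secret name is hypothetical.
    serving.kserve.io/storageSecretName: my-storage-secret
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://example-bucket/models/sklearn/iris   # hypothetical location
```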

GCS Configurations

GCS Credential File Name

The filename of the GCS credential file within the secret.

Global key: gcs.gcsCredentialFileName
Per-service annotation key: Not supported
Possible values: Valid filename
Default: "gcloud-application-credentials.json"

S3 Configurations

The global S3 configuration can be overridden by specifying annotations on the service account or on the static secret. For more details, see the S3 model storage documentation.

For a quick reference about AWS ENV variables:

The s3AccessKeyIDName and s3SecretAccessKeyName fields are only used when static credentials (IAM User Access Key Secret) are used as the authentication method for AWS S3. The rest of the fields are used in both authentication methods (IAM Role for Service Account & IAM User Access Key Secret) if a non-empty value is provided.

S3 Access Key ID Name

The environment variable name for S3 access key ID.

Global key: s3.s3AccessKeyIDName
Per-service annotation key: Not supported
Possible values: Valid environment variable name
Default: "AWS_ACCESS_KEY_ID"

S3 Secret Access Key Name

The environment variable name for S3 secret access key.

Global key: s3.s3SecretAccessKeyName
Per-service annotation key: Not supported
Possible values: Valid environment variable name
Default: "AWS_SECRET_ACCESS_KEY"
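
To tie the S3 settings together, a static-credential secret might look like the sketch below. The data keys match the default values of s3AccessKeyIDName and s3SecretAccessKeyName. The secret name is hypothetical, and the two annotation keys are assumed from the S3 model storage documentation rather than taken from this page, so verify them there before relying on them.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials               # hypothetical name
  annotations:
    # Annotation keys assumed from the S3 model storage documentation; verify before use.
    serving.kserve.io/s3-endpoint: s3.amazonaws.com
    serving.kserve.io/s3-usehttps: "1"
type: Opaque
stringData:
  # Key names follow the defaults of s3AccessKeyIDName and s3SecretAccessKeyName.
  AWS_ACCESS_KEY_ID: "REPLACE_WITH_ACCESS_KEY_ID"
  AWS_SECRET_ACCESS_KEY: "REPLACE_WITH_SECRET_ACCESS_KEY"
```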