Control Plane API

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

ServingRuntimeSpec

statusrequired

ServingRuntimeStatus

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a ClusterServingRuntimeList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

ClusterServingRuntime array

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a ClusterStorageContainer resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

StorageContainerSpec

disabledoptional

boolean

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a ClusterStorageContainerList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

ClusterStorageContainer array

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is an InferenceGraph resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

InferenceGraphSpec

statusrequired

InferenceGraphStatus

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is an InferenceGraphList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

InferenceGraph array

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is an LLMInferenceService resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

LLMInferenceServiceSpec

statusrequired

LLMInferenceServiceStatus

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is an LLMInferenceServiceConfig resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

LLMInferenceServiceSpec

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is an LLMInferenceServiceConfigList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

LLMInferenceServiceConfig array

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is an LLMInferenceServiceList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

LLMInferenceService array

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a LocalModelCache resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

LocalModelCacheSpec

statusrequired

LocalModelCacheStatus

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a LocalModelCacheList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

LocalModelCache array

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a LocalModelNode resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

LocalModelNodeSpec

statusrequired

LocalModelNodeStatus

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a LocalModelNodeGroup resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

LocalModelNodeGroupSpec

statusrequired

LocalModelNodeGroupStatus

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a LocalModelNodeGroupList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

LocalModelNodeGroup array

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a LocalModelNodeList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

LocalModelNode array

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a ServingRuntime resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

ServingRuntimeSpec

statusrequired

ServingRuntimeStatus

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a ServingRuntimeList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

ServingRuntime array

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a TrainedModel resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

TrainedModelSpec

statusrequired

TrainedModelStatus

apiVersionrequired

String

We are on version serving.kserve.io/v1alpha1 of the API.

kindrequired

String

This is a TrainedModelList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

TrainedModel array

serverTyperequired

ServerType

ServerType must be one of the supported built-in types such as "triton" or "mlserver",
and the runtime's container must have the same name

runtimeManagementPortrequired

integer

Port on which the runtime server listens for model management requests

memBufferBytesrequired

integer

Fixed memory overhead to subtract from runtime container's memory allocation to determine model capacity
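
As a rough illustration of how memBufferBytes enters the capacity calculation described above, the sketch below subtracts the fixed overhead from a container's memory allocation. The function name and the example numbers are hypothetical, and clamping at zero is an assumption; the actual controller logic may apply further adjustments.

```python
def model_capacity_bytes(container_memory_bytes: int, mem_buffer_bytes: int) -> int:
    """Memory left for models after subtracting the fixed overhead."""
    if container_memory_bytes < 0 or mem_buffer_bytes < 0:
        raise ValueError("byte counts must be non-negative")
    # Clamp at zero (an assumption) so an oversized buffer never yields a negative capacity.
    return max(container_memory_bytes - mem_buffer_bytes, 0)


# Hypothetical numbers: a 4 GiB container with a 128 MiB buffer.
capacity = model_capacity_bytes(4 * 1024**3, 128 * 1024**2)
print(capacity)
```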

modelLoadingTimeoutMillisrequired

integer

Timeout for model loading operations in milliseconds

envrequired

EnvVar array

Environment variables used to control other aspects of the built-in adapter's behaviour (uncommon)

httpoptional

HTTPRouteSpec

HTTP route configuration.

refsoptional

UntypedObjectReference array

Refs provides references to existing, user-managed Gateway objects ("Bring Your Own" gateway).
The controller will use the specified Gateway instead of creating one.

refsoptional

Refs provides references to existing, user-managed HTTPRoute objects ("Bring Your Own" route).
The controller will validate the existence of these routes but will not modify them.

specoptional

HTTPRouteSpec

Spec allows for providing a custom specification for an HTTPRoute.
If provided, the controller will create and manage an HTTPRoute with this specification.

serverReadoptional

integer

ServerRead specifies the number of seconds to wait before timing out a request read by the server.

serverWriteoptional

integer

ServerWrite specifies the maximum duration in seconds before timing out writes of the response.

serverIdleoptional

integer

ServerIdle specifies the maximum amount of time in seconds to wait for the next request when keep-alives are enabled.

serviceClientoptional

integer

ServiceClient specifies a time limit in seconds for requests made to the graph components by HTTP client.

nodesrequired

object (keys:string, values:InferenceRouter)

Map of InferenceGraph router nodes.
Each node defines a router, which can be one of several routing types.

resourcesoptional

affinityoptional

timeoutoptional

integer

TimeoutSeconds specifies the number of seconds to wait before timing out a request to the component.

routerTimeoutsoptional

InfereceGraphRouterTimeouts

minReplicasoptional

integer

Minimum number of replicas, defaults to 1 but can be set to 0 to enable scale-to-zero.

maxReplicasoptional

integer

Maximum number of replicas for autoscaling.

scaleTargetoptional

integer

ScaleTarget specifies the integer target value of the metric type the Autoscaler watches for.
concurrency and rps targets are supported by Knative Pod Autoscaler
(https://knative.dev/docs/serving/autoscaling/autoscaling-targets/).

scaleMetricoptional

ScaleMetric defines the scaling metric type watched by the autoscaler.
Possible values are concurrency, rps, cpu, and memory. concurrency and rps are supported via the
Knative Pod Autoscaler (https://knative.dev/docs/serving/autoscaling/autoscaling-metrics).

tolerationsoptional

Toleration array

Toleration specifies the toleration for the InferenceGraph.
https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/

nodeSelectoroptional

object (keys:string, values:string)

NodeSelector specifies the node selector for the InferenceGraph.
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/

nodeNameoptional

string

NodeName specifies the node name for the InferenceGraph.
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/

serviceAccountNameoptional

string

ServiceAccountName specifies the service account name for the InferenceGraph.
https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/

observedGenerationoptional

integer

ObservedGeneration is the 'Generation' of the Service that
was last processed by the controller.

conditionsoptional

Conditions holds the latest available observations of a resource's current state.

annotationsrequired

object (keys:string, values:string)

Annotations are additional status fields for the resource, used to save
additional state and to convey more information to the user. They are
roughly akin to annotations on any Kubernetes resource, with the reconciler conveying
richer information outwards.

urloptional

URL for the InferenceGraph

deploymentModerequired

string

InferenceGraph DeploymentMode

specoptional

InferencePoolSpec

Spec defines an inline InferencePool specification.

refoptional

LocalObjectReference

Ref is a reference to an existing InferencePool.

routerTyperequired

InferenceRouterType

RouterType

- Sequence: chain multiple inference steps with input/output from previous step

- Splitter: randomly routes to the target service according to the weight

- Ensemble: routes the request to multiple models and then merges the responses

- Switch: routes the request to one of the steps based on condition

stepsoptional

InferenceStep array

Steps defines destinations for the current router node

Sequence

Sequence is the default type; it routes to only one destination

Splitter

Splitter router randomly routes the requests to the named service according to the weight

Ensemble

Ensemble router routes the requests to multiple models and then merges the responses

Switch

Switch routes the request to the model based on a certain condition
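
To make the relationship between nodes, routerType, and steps more concrete, here is a minimal sketch of an InferenceGraph manifest built as a Python dictionary. The graph name, the node key "root", and the service names are hypothetical placeholders; only the field names and the API version come from this reference.

```python
import json

# Hypothetical names: "example-graph", "root", "preprocessor", "classifier".
inference_graph = {
    "apiVersion": "serving.kserve.io/v1alpha1",
    "kind": "InferenceGraph",
    "metadata": {"name": "example-graph"},
    "spec": {
        "nodes": {
            "root": {
                # One of: Sequence, Splitter, Ensemble, Switch
                "routerType": "Sequence",
                "steps": [
                    {"name": "preprocess", "serviceName": "preprocessor"},
                    # Forward part of the previous step's response to the next step.
                    {
                        "name": "classify",
                        "serviceName": "classifier",
                        "data": "$response.predictions",
                    },
                ],
            }
        }
    },
}

# Kubernetes accepts JSON manifests, so this output could be saved to a file.
print(json.dumps(inference_graph, indent=2))
```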

nameoptional

string

Unique name for the step within this node

nodeNameoptional

string

The node name for routing as next step

serviceNamerequired

string

named reference for InferenceService

serviceUrloptional

string

InferenceService URL, mutually exclusive with ServiceName

dataoptional

string

Request data sent to the next route, taken from the input or output of the previous step, for example:
$request
$response.predictions

weightoptional

integer

The weight for splitting the traffic; only used by the Splitter router.
When weights are specified, the weights of all routing targets should sum to 100.
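
The sum-to-100 rule can be checked before a manifest is applied. The following is a small client-side sketch, not the controller's own validation; the function name and the step dictionaries are hypothetical.

```python
def validate_splitter_weights(steps):
    """Return the weight total if every step has a weight and they sum to 100."""
    weights = [step.get("weight") for step in steps]
    if any(weight is None for weight in weights):
        raise ValueError("every Splitter step needs a weight")
    total = sum(weights)
    if total != 100:
        raise ValueError(f"weights sum to {total}, expected 100")
    return total


# Hypothetical Splitter steps with an 80/20 split.
steps = [
    {"serviceName": "model-a", "weight": 80},
    {"serviceName": "model-b", "weight": 20},
]
validate_splitter_weights(steps)
```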

conditionoptional

string

routing based on the condition

dependencyoptional

InferenceStepDependencyType

Decides whether a step is a hard or a soft dependency in the InferenceGraph.

Soft

Soft

Hard

Hard

nodeNameoptional

string

The node name for routing as next step

serviceNamerequired

string

named reference for InferenceService

serviceUrloptional

string

InferenceService URL, mutually exclusive with ServiceName

refsoptional

UntypedObjectReference array

Refs provides a reference to an existing, user-managed Ingress object ("Bring Your Own" ingress).
The controller will not create an Ingress but will use the referenced one to populate status URLs.

modelrequired

LLMModelSpec

Model specification, including its URI, potential LoRA adapters, and storage details.

replicasoptional

integer

Number of replicas for the deployment.

parallelismoptional

ParallelismSpec

Parallelism configurations for the runtime, such as tensor and pipeline parallelism.
These values are used to configure the underlying inference runtime (e.g., vLLM).

templateoptional

Template for the main pod spec.
In a multi-node deployment, this configures the "head" or "master" pod.
In a disaggregated deployment, this configures the "decode" pod if it's the top-level template,
or the "prefill" pod if it's within the Prefill block.

workeroptional

Worker configuration for multi-node deployments.
The presence of this field triggers the creation of a multi-node (distributed) setup.
This spec defines the configuration for the worker pods, while the main 'Template' field defines the head pod.
The controller is responsible for enabling discovery between head and worker pods.

routeroptional

RouterSpec

Router configuration for how the service is exposed. This section dictates the creation and management
of networking resources like Ingress or Gateway API objects (HTTPRoute, Gateway).

prefilloptional

WorkloadSpec

Prefill configuration for disaggregated serving.
When this section is included, the controller creates a separate deployment for prompt processing (prefill)
in addition to the main 'decode' deployment, inspired by the llm-d architecture.
This allows for independent scaling and hardware allocation for prefill and decode steps.

baseRefsoptional

BaseRefs allows inheriting and overriding configurations from one or more LLMInferenceServiceConfig instances.
The controller merges these base configurations, with the current LLMInferenceService spec taking the highest precedence.
When multiple baseRefs are provided, the last one in the list overrides previous ones.
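
The precedence order described for baseRefs can be pictured as a left-to-right merge. The sketch below assumes a recursive merge of nested maps, which is an illustration only; the controller's real merge strategy may differ, and the helper names and sample specs are hypothetical.

```python
from copy import deepcopy
from functools import reduce


def deep_merge(base, override):
    """Return a copy of base with override applied; nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_spec(base_specs, service_spec):
    """Later bases override earlier ones; the service's own spec wins last."""
    return reduce(deep_merge, [*base_specs, service_spec], {})


# Hypothetical base configs and service spec.
base_a = {"replicas": 1, "parallelism": {"tensor": 2}}
base_b = {"replicas": 2, "parallelism": {"pipeline": 2}}
service = {"parallelism": {"tensor": 4}}
resolved = resolve_spec([base_a, base_b], service)
print(resolved)
```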

urloptional

URL of the publicly exposed service.

observedGenerationoptional

integer

ObservedGeneration is the 'Generation' of the Service that
was last processed by the controller.

conditionsoptional

Conditions holds the latest available observations of a resource's current state.

annotationsrequired

object (keys:string, values:string)

Annotations are additional status fields for the resource, used to save
additional state and to convey more information to the user. They are
roughly akin to annotations on any Kubernetes resource, with the reconciler conveying
richer information outwards.

addressoptional

Address is a single Addressable address.
If Addresses is present, Address will be ignored by clients.

addressesoptional

Addressable array

Addresses is a list of addresses for different protocols (HTTP and HTTPS)
If Addresses is present, Address must be ignored by clients.
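
A client reading this status could apply the precedence rule as follows. This is a sketch under the assumption that each Addressable is represented as a dictionary with a "url" key and that an empty addresses list counts as absent; the function name and sample URLs are hypothetical.

```python
def effective_addresses(status):
    """Prefer status.addresses; fall back to the single status.address."""
    addresses = status.get("addresses") or []
    if addresses:
        # When addresses is present, address is ignored.
        return list(addresses)
    address = status.get("address")
    return [address] if address else []


# Hypothetical status with both fields set.
status = {
    "address": {"url": "http://svc.example.internal"},
    "addresses": [
        {"url": "http://svc.example.internal"},
        {"url": "https://svc.example.internal"},
    ],
}
print(effective_addresses(status))
```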

urirequired

URI of the model, specifying its location, e.g., hf://meta-llama/Llama-4-Scout-17B-16E-Instruct
The storage-initializer init container uses this URI to download the model.

nameoptional

string

Name is the name of the model as it will be set in the "model" parameter for an incoming request.
If omitted, it will default to metadata.name. For LoRA adapters, this field is required.

criticalityoptional

Criticality

Criticality defines how important it is to serve the model compared to other models.
This is used by the Inference Gateway scheduler.

loraoptional

LoRASpec

LoRA (Low-Rank Adaptation) adapters configurations.
Allows for specifying one or more LoRA adapters to be applied to the base model.

storagerequired

LLMStorageSpec

Storage specification for the model, such as path and credentials.
This is used by the storage-initializer to correctly download the model from the specified URI.

pathoptional

string

The path to the model object in the storage. It cannot co-exist
with the storageURI.

parametersoptional

map[string]string

Parameters to override the default storage credentials and config.

keyoptional

string

The Storage Key in the secret for this model.

adaptersoptional

ModelSpec array

Adapters is the static specification for one or more LoRA adapters.
Each adapter is defined by its own ModelSpec.

sourceModelUrirequired

string

Original StorageUri

modelSizerequired

Model size to make sure it does not exceed the disk space reserved for local models. The limit is defined on the NodeGroup.

nodeGroupsrequired

string array

Group of nodes to cache the model on.
Todo: support more than one node group

nodeStatusrequired

object (keys:string, values:NodeStatus)

Status of the model on a node, like NodeDownloaded or NodeNotReady

copiesoptional

ModelCopies

How many nodes have the model available locally

inferenceServicesrequired

NamespacedName array

Inference services using this local model

sourceModelUrirequired

string

Original StorageUri

modelNamerequired

string

Model name. Used as the subdirectory name to store this model on local file system

storageLimitrequired

PersistentVolumeClaimSpec

Max storage size per node in this node group

persistentVolumeSpecrequired

PersistentVolumeSpec

Used to create PersistentVolumes for downloading models and in inference service namespaces

persistentVolumeClaimSpecrequired

Used to create PersistentVolumeClaims for download and in inference service namespaces

usedrequired

Used storage space on any node for this node group

availablerequired

Available storage space on any node for this node group

localModelsrequired

LocalModelInfo array

List of model source URIs and their names

modelStatusrequired

object (keys:string, values:ModelStatus)

Status of each local model

availablerequired

integer

totalrequired

integer

Total number of nodes on which the model is expected to be downloaded, including nodes that are not ready

failedrequired

integer

Download Failed
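
One way to picture how these three counters relate to the per-node status map is the sketch below. Mapping NodeDownloaded to "available" and NodeDownloadError to "failed" is an assumption made for illustration, and the function and node names are hypothetical.

```python
from collections import Counter


def summarize_copies(node_status):
    """Derive available/total/failed counts from a node-name -> status map."""
    counts = Counter(node_status.values())
    return {
        "available": counts["NodeDownloaded"],
        # Total covers every expected node, including nodes that are not ready.
        "total": len(node_status),
        "failed": counts["NodeDownloadError"],
    }


# Hypothetical node statuses.
node_status = {
    "node-a": "NodeDownloaded",
    "node-b": "NodeNotReady",
    "node-c": "NodeDownloadError",
}
print(summarize_copies(node_status))
```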

storageUrirequired

string

Storage URI for the model repository

frameworkrequired

string

Machine learning framework name.
The values could be: "tensorflow", "pytorch", "sklearn", "onnx", "xgboost", "myawesomeinternalframework", etc.

memoryrequired

Maximum memory this model will consume, this field is used to decide if a model server has enough memory to load this model.

ModelDownloadPending

ModelDownloading

ModelDownloaded

ModelDownloadError

namespacerequired

string

namerequired

string

NodeNotReady

NodeDownloadPending

NodeDownloading

NodeDownloaded

NodeDownloadError

tensoroptional

integer

Tensor parallelism size.

pipelineoptional

integer

Pipeline parallelism size.

routeoptional

GatewayRoutesSpec

Route configuration for the Gateway API.
If an empty object {} is provided, the controller creates and manages a new HTTPRoute.

gatewayoptional

GatewaySpec

Gateway configuration for the Gateway API, mutually exclusive with Ingress.
If an empty object {} is provided, the controller uses a default Gateway.
This must be used in conjunction with the 'Route' field for managed Gateway API resources.

ingressoptional

IngressSpec

Ingress configuration. This is mutually exclusive with Route and Gateway.
If an empty object {} is provided, the controller creates and manages a default Ingress resource.
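
The exclusivity rules stated for route, gateway, and ingress can be checked on the client side before applying a manifest. The sketch below is a hypothetical helper, not the controller's validation; it treats a key as set whenever it is present, so an empty object {} counts.

```python
def validate_router(router):
    """Return a list of rule violations for a router spec given as a dict."""
    errors = []
    has_route = "route" in router
    has_gateway = "gateway" in router
    has_ingress = "ingress" in router
    if has_ingress and (has_route or has_gateway):
        errors.append("ingress is mutually exclusive with route and gateway")
    if has_gateway and not has_route:
        errors.append("gateway must be used together with route")
    return errors


# Managed Gateway API resources: both fields present as empty objects.
print(validate_router({"route": {}, "gateway": {}}))
# Conflicting configuration: ingress combined with gateway.
print(validate_router({"ingress": {}, "gateway": {}}))
```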

scheduleroptional

SchedulerSpec

Scheduler configuration for the Inference Gateway extension.
If this field is non-empty, an InferenceModel resource will be created to integrate with the gateway's scheduler.

pooloptional

InferencePoolSpec

Pool configuration for the InferencePool, which is part of the Inference Gateway extension.

templateoptional

Template for the Inference Gateway Extension pod spec.
This configures the Endpoint Picker (EPP) Deployment.

triton

Model server is Triton

mlserver

Model server is MLServer

ovms

Model server is OpenVino Model Server

containersrequired

Container array

List of containers belonging to the pod.
Containers cannot currently be added or removed.
There must be at least one container in a Pod.
Cannot be updated.

volumesoptional

Volume array

List of volumes that can be mounted by containers belonging to the pod.
More info: https://kubernetes.io/docs/concepts/storage/volumes

nodeSelectoroptional

object (keys:string, values:string)

NodeSelector is a selector which must be true for the pod to fit on a node.
Selector which must match a node's labels for the pod to be scheduled on that node.
More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

affinityoptional

If specified, the pod's scheduling constraints

tolerationsoptional

Toleration array

If specified, the pod's tolerations.

labelsoptional

object (keys:string, values:string)

Labels that will be added to the pod.
More info: http://kubernetes.io/docs/user-guide/labels

annotationsoptional

object (keys:string, values:string)

Annotations that will be added to the pod.
More info: http://kubernetes.io/docs/user-guide/annotations

imagePullSecretsoptional

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec.
If specified, these secrets will be passed to individual puller implementations for them to use. For example,
in the case of docker, only DockerConfig type secrets are honored.
More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostIPCoptional

boolean

Use the host's ipc namespace.
Optional: Default to false.

supportedModelFormatsrequired

SupportedModelFormat array

Model formats and versions supported by this runtime

multiModeloptional

boolean

Whether this ServingRuntime is intended for multi-model usage or not.

disabledoptional

boolean

Set to true to disable use of this runtime

protocolVersionsoptional

InferenceServiceProtocol array

Supported protocol versions (i.e. v1 or v2 or grpc-v1 or grpc-v2)

workerSpecoptional

WorkerSpec

Set WorkerSpec to enable multi-node/multi-GPU

containersrequired

Container array

List of containers belonging to the pod.
Containers cannot currently be added or removed.
There must be at least one container in a Pod.
Cannot be updated.

volumesoptional

Volume array

List of volumes that can be mounted by containers belonging to the pod.
More info: https://kubernetes.io/docs/concepts/storage/volumes

nodeSelectoroptional

object (keys:string, values:string)

NodeSelector is a selector which must be true for the pod to fit on a node.
Selector which must match a node's labels for the pod to be scheduled on that node.
More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

affinityoptional

If specified, the pod's scheduling constraints

tolerationsoptional

Toleration array

If specified, the pod's tolerations.

labelsoptional

object (keys:string, values:string)

Labels that will be added to the pod.
More info: http://kubernetes.io/docs/user-guide/labels

annotationsoptional

object (keys:string, values:string)

Annotations that will be added to the pod.
More info: http://kubernetes.io/docs/user-guide/annotations

imagePullSecretsoptional

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec.
If specified, these secrets will be passed to individual puller implementations for them to use. For example,
in the case of docker, only DockerConfig type secrets are honored.
More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostIPCoptional

boolean

Use the host's ipc namespace.
Optional: Default to false.

grpcEndpointoptional

string

gRPC endpoint for internal model management (implementing the mmesh.ModelRuntime gRPC service).
Assumed to be a single-model runtime if omitted.

grpcDataEndpointoptional

string

gRPC endpoint for inferencing

httpDataEndpointoptional

string

HTTP endpoint for inferencing

replicasoptional

integer

Configure the number of replicas in the Deployment generated by this ServingRuntime
If specified, this overrides the podsPerRuntime configuration value

storageHelperoptional

StorageHelper

Configuration for this runtime's use of the storage helper (model puller)
It is enabled unless explicitly disabled

builtInAdapteroptional

BuiltInAdapter

Provides details about the built-in runtime adapter

containerrequired

Container

Container spec for the storage initializer init container

supportedUriFormatsrequired

SupportedUriFormat array

List of URI formats that this container supports

workloadTyperequired

WorkloadType

initContainer

disabledoptional

boolean

namerequired

string

Name of the model format.

versionoptional

string

Version of the model format.
Used in validating that a predictor is supported by a runtime.
Can be "major", "major.minor" or "major.minor.patch".
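
The three accepted shapes of the version string can be checked with a short pattern. This sketch assumes each component is a plain non-negative integer, which the reference does not state explicitly; the function name is hypothetical.

```python
import re

# One to three dot-separated numeric components: major, major.minor, major.minor.patch.
_VERSION_RE = re.compile(r"\d+(\.\d+){0,2}")


def is_valid_format_version(version: str) -> bool:
    """True if version looks like "major", "major.minor", or "major.minor.patch"."""
    return _VERSION_RE.fullmatch(version) is not None


for candidate in ["1", "1.15", "1.15.2", "1.2.3.4", "v1"]:
    print(candidate, is_valid_format_version(candidate))
```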

autoSelectoptional

boolean

Set to true to allow the ServingRuntime to be used for automatic model placement if
this model format is specified with no explicit runtime.

priorityoptional

integer

Priority of this serving runtime for auto selection.
This is used to select the serving runtime if more than one serving runtime supports the same model format.
The value should be greater than zero. The higher the value, the higher the priority.
Priority is not considered if AutoSelect is either false or not specified.
Priority can be overridden by specifying the runtime in the InferenceService.
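
The selection rules above can be sketched as follows. The runtime dictionaries use a simplified, hypothetical shape (a name plus supportedModelFormats), a missing priority is treated as 0, and ties are broken by name; all three are assumptions rather than documented behaviour.

```python
def pick_runtime(runtimes, model_format):
    """Pick the auto-selectable runtime with the highest priority for a format."""
    candidates = []
    for runtime in runtimes:
        for fmt in runtime.get("supportedModelFormats", []):
            # Priority only matters when autoSelect is true for this format.
            if fmt.get("name") == model_format and fmt.get("autoSelect", False):
                candidates.append((fmt.get("priority", 0), runtime["name"]))
    if not candidates:
        return None
    return max(candidates)[1]


# Hypothetical runtimes that all declare support for "sklearn".
runtimes = [
    {"name": "runtime-a", "supportedModelFormats": [
        {"name": "sklearn", "autoSelect": True, "priority": 1}]},
    {"name": "runtime-b", "supportedModelFormats": [
        {"name": "sklearn", "autoSelect": True, "priority": 2}]},
    {"name": "runtime-c", "supportedModelFormats": [
        {"name": "sklearn", "autoSelect": False, "priority": 9}]},
]
print(pick_runtime(runtimes, "sklearn"))
```

Naming a runtime explicitly in the InferenceService bypasses this selection entirely, so the sketch only covers the case where no runtime is specified.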

prefixrequired

string

regexrequired

string
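
The reference does not describe how prefix and regex are evaluated. As a sketch only, the helper below assumes a storage URI is supported when it starts with a listed prefix or matches a listed regular expression from the start of the string; the function name and sample formats are hypothetical.

```python
import re


def uri_supported(uri, supported_uri_formats):
    """True if uri matches any entry by prefix or by regex (assumed semantics)."""
    for fmt in supported_uri_formats:
        prefix = fmt.get("prefix")
        regex = fmt.get("regex")
        if prefix and uri.startswith(prefix):
            return True
        if regex and re.match(regex, uri):
            return True
    return False


# Hypothetical supported formats.
formats = [
    {"prefix": "s3://"},
    {"regex": r"https://.+\.example\.com/.*"},
]
print(uri_supported("s3://bucket/model", formats))
print(uri_supported("gs://bucket/model", formats))
```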

inferenceServicerequired

string

parent inference service to deploy to

modelrequired

ModelSpec

Predictor model spec

observedGenerationoptional

integer

ObservedGeneration is the 'Generation' of the Service that
was last processed by the controller.

conditionsoptional

Conditions holds the latest available observations of a resource's current state.

annotationsrequired

object (keys:string, values:string)

Annotations are additional status fields for the resource, used to save
additional state and to convey more information to the user. They are
roughly akin to annotations on any Kubernetes resource, with the reconciler conveying
richer information outwards.

urlrequired

URL holds the url that will distribute traffic over the provided traffic targets.
For v1: http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}/v1/models/:predict
For v2: http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}/v2/models//infer

addressrequired

Addressable endpoint for the deployed trained model
http:///v1/models/.metadata.name

namerequired

ObjectName

Name of the referenced object.

namespacerequired

Namespace

Namespace of the referenced object.

containersrequired

Container array

List of containers belonging to the pod.
Containers cannot currently be added or removed.
There must be at least one container in a Pod.
Cannot be updated.

volumesoptional

Volume array

List of volumes that can be mounted by containers belonging to the pod.
More info: https://kubernetes.io/docs/concepts/storage/volumes

nodeSelectoroptional

object (keys:string, values:string)

NodeSelector is a selector which must be true for the pod to fit on a node.
Selector which must match a node's labels for the pod to be scheduled on that node.
More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

affinityoptional

If specified, the pod's scheduling constraints

tolerationsoptional

Toleration array

If specified, the pod's tolerations.

labelsoptional

object (keys:string, values:string)

Labels that will be added to the pod.
More info: http://kubernetes.io/docs/user-guide/labels

annotationsoptional

object (keys:string, values:string)

Annotations that will be added to the pod.
More info: http://kubernetes.io/docs/user-guide/annotations

imagePullSecretsoptional

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec.
If specified, these secrets will be passed to individual puller implementations for them to use. For example,
in the case of docker, only DockerConfig type secrets are honored.
More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostIPCoptional

boolean

Use the host's ipc namespace.
Optional: Default to false.

pipelineParallelSizeoptional

integer

PipelineParallelSize defines the number of parallel workers.
It specifies the number of model partitions across multiple devices, allowing large models to be split and processed concurrently across these partitions.
It also represents the number of replicas in the worker set, where each worker set serves as a scaling unit.

tensorParallelSizeoptional

integer

TensorParallelSize specifies the number of GPUs to be used per node.
It indicates the degree of parallelism for tensor computations across the available GPUs.

replicasoptional

integer

Number of replicas for the deployment.

parallelismoptional

ParallelismSpec

Parallelism configurations for the runtime, such as tensor and pipeline parallelism.
These values are used to configure the underlying inference runtime (e.g., vLLM).

templateoptional

Template for the main pod spec.
In a multi-node deployment, this configures the "head" or "master" pod.
In a disaggregated deployment, this configures the "decode" pod if it's the top-level template,
or the "prefill" pod if it's within the Prefill block.

workeroptional

Worker configuration for multi-node deployments.
The presence of this field triggers the creation of a multi-node (distributed) setup.
This spec defines the configuration for the worker pods, while the main 'Template' field defines the head pod.
The controller is responsible for enabling discovery between head and worker pods.

initContainer

localModelDownloadJob

apiVersionrequired

String

We are on version serving.kserve.io/v1beta1 of the API.

kindrequired

String

This is an InferenceService resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

specrequired

InferenceServiceSpec

statusrequired

InferenceServiceStatus

apiVersionrequired

String

We are on version serving.kserve.io/v1beta1 of the API.

kindrequired

String

This is an InferenceServiceList resource

metadatarequired

Refer to Kubernetes API documentation for fields of "metadata".

itemsrequired

InferenceService array

typerequired

ARTExplainerType

The type of ART explainer

storageUrirequired

string

The location of a trained explanation model

runtimeVersionrequired

string

Defaults to latest Explainer Version

configrequired

object (keys:string, values:string)

Inline custom parameter settings for explainer

storageoptional

Storage Spec for model location

SquareAttack

namerequired

string

name is the name of the authentication secret

metricsrequired

MetricsSpec array

metrics is a list of metrics spec to be used for autoscaling

maxBatchSizeoptional

integer

Specifies the max number of requests to trigger a batch

maxLatencyoptional

integer

Specifies the max latency to trigger a batch

timeoutoptional

integer

Specifies the timeout of a batch

minReplicasoptional

integer

Minimum number of replicas, defaults to 1 but can be set to 0 to enable scale-to-zero.

maxReplicasoptional

integer

Maximum number of replicas for autoscaling.

scaleTargetoptional

integer

ScaleTarget specifies the integer target value of the metric type the Autoscaler watches for.
concurrency and rps targets are supported by Knative Pod Autoscaler
(https://knative.dev/docs/serving/autoscaling/autoscaling-targets/).

scaleMetricoptional

ScaleMetric defines the scaling metric type watched by the autoscaler.
Possible values are concurrency, rps, cpu, and memory. concurrency and rps are supported via the
Knative Pod Autoscaler (https://knative.dev/docs/serving/autoscaling/autoscaling-metrics).

scaleMetricTypeoptional

Type of metric to use. Options are Utilization or AverageValue.

autoScalingoptional

AutoScaling is the autoscaling spec, which is backed by HPA or KEDA.

containerConcurrencyoptional

integer

ContainerConcurrency specifies how many requests can be processed concurrently. This sets the hard limit of the container
concurrency (https://knative.dev/docs/serving/autoscaling/concurrency).

timeoutoptional

integer

TimeoutSeconds specifies the number of seconds to wait before timing out a request to the component.

canaryTrafficPercentoptional

integer

CanaryTrafficPercent defines the traffic split percentage between the candidate revision and the last ready revision
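
To make the traffic split concrete, here is a small sketch of the arithmetic. It assumes that the percentage is the share routed to the candidate revision and that the remainder goes to the last ready revision. The function name `split_traffic` and the dictionary keys are hypothetical.

```python
def split_traffic(canary_traffic_percent):
    """Split 100 percent of traffic between the candidate and last ready revisions."""
    if not 0 <= canary_traffic_percent <= 100:
        raise ValueError("canaryTrafficPercent must be between 0 and 100")
    return {
        "candidate": canary_traffic_percent,
        # Assumption: whatever the candidate does not receive stays on the last ready revision.
        "lastReady": 100 - canary_traffic_percent,
    }
```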

loggeroptional

Activate request/response logging and logger configurations

batcheroptional

Activate request batching and batching configurations

labelsoptional

object (keys:string, values:string)

Labels that will be added to the component pod.
More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/

annotationsoptional

object (keys:string, values:string)

Annotations that will be added to the component pod.
More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/

deploymentStrategyoptional

The deployment strategy to use to replace existing pods with new ones. Only applicable for raw deployment mode.

latestReadyRevisionoptional

string

Latest revision name that is in ready state

latestCreatedRevisionoptional

string

Latest revision name that is created

previousRolledoutRevisionoptional

string

Previous revision name that is rolled out with 100 percent traffic

latestRolledoutRevisionoptional

string

Latest revision name that is rolled out with 100 percent traffic

trafficoptional

TrafficTarget array

Traffic holds the configured traffic distribution for latest ready revision and previous rolled out revision.

urloptional

URL holds the primary url that will distribute traffic over the provided traffic targets.
This will be one of the REST or gRPC endpoints that are available.
It generally has the form http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}
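
The general URL form can be sketched as a simple string template. The function name `route_url` and the sample route name, namespace, and suffix are hypothetical; real URLs depend on the cluster's ingress configuration.

```python
def route_url(route_name, route_namespace, cluster_level_suffix, https=False):
    """Format a URL following the general route-name.route-namespace.suffix form."""
    scheme = "https" if https else "http"
    return f"{scheme}://{route_name}.{route_namespace}.{cluster_level_suffix}"
```

With the placeholder inputs `"sklearn-iris"`, `"default"`, and `"example.com"`, this produces `http://sklearn-iris.default.example.com`.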

restUrloptional

REST endpoint of the component if available.

grpcUrloptional

gRPC endpoint of the component if available.

addressoptional

Addressable endpoint for the InferenceService

predictor

explainer

transformer

imagerequired

string

explainer docker image name

defaultImageVersionrequired

string

default explainer docker image version

storageUrirequired

string

The location of a trained explanation model

runtimeVersionrequired

string

Defaults to latest Explainer Version

configrequired

object (keys:string, values:string)

Inline custom parameter settings for explainer

storageoptional

Storage Spec for model location

artrequired

ARTExplainerSpec

Spec for ART explainer

volumesoptional

Volume array

List of volumes that can be mounted by containers belonging to the pod.
More info: https://kubernetes.io/docs/concepts/storage/volumes

initContainersrequired

Container array

List of initialization containers belonging to the pod.
Init containers are executed in order prior to containers being started. If any
init container fails, the pod is considered to have failed and is handled according
to its restartPolicy. The name for an init container or normal container must be
unique among all containers.
Init containers may not have Lifecycle actions, Readiness probes, Liveness probes, or Startup probes.
The resourceRequirements of an init container are taken into account during scheduling
by finding the highest request/limit for each resource type, and then using the max of
that value or the sum of the normal containers. Limits are applied to init containers
in a similar fashion.
Init containers cannot currently be added or removed.
Cannot be updated.
More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

containersrequired

Container array

List of containers belonging to the pod.
Containers cannot currently be added or removed.
There must be at least one container in a Pod.
Cannot be updated.

ephemeralContainersoptional

List of ephemeral containers run in this pod. Ephemeral containers may be run in an existing
pod to perform user-initiated actions such as debugging. This list cannot be specified when
creating a pod, and it cannot be modified by updating the pod spec. In order to add an
ephemeral container to an existing pod, use the pod's ephemeralcontainers subresource.

restartPolicyoptional

Restart policy for all containers within the pod.
One of Always, OnFailure, Never. In some contexts, only a subset of those values may be permitted.
Default to Always.
More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

terminationGracePeriodSecondsoptional

integer

Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
Value must be non-negative integer. The value zero indicates stop immediately via
the kill signal (no opportunity to shut down).
If this value is nil, the default grace period will be used instead.
The grace period is the duration in seconds after the processes running in the pod are sent
a termination signal and the time when the processes are forcibly halted with a kill signal.
Set this value longer than the expected cleanup time for your process.
Defaults to 30 seconds.

activeDeadlineSecondsoptional

integer

Optional duration in seconds the pod may be active on the node relative to
StartTime before the system will actively try to mark it failed and kill associated containers.
Value must be a positive integer.

dnsPolicyoptional

Set DNS policy for the pod.
Defaults to "ClusterFirst".
Valid values are 'ClusterFirstWithHostNet', 'ClusterFirst', 'Default' or 'None'.
DNS parameters given in DNSConfig will be merged with the policy selected with DNSPolicy.
To have DNS options set along with hostNetwork, you have to specify DNS policy
explicitly to 'ClusterFirstWithHostNet'.

nodeSelectoroptional

object (keys:string, values:string)

NodeSelector is a selector which must be true for the pod to fit on a node.
Selector which must match a node's labels for the pod to be scheduled on that node.
More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

serviceAccountNameoptional

string

ServiceAccountName is the name of the ServiceAccount to use to run this pod.
More info: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/

serviceAccountoptional

string

DeprecatedServiceAccount is a deprecated alias for ServiceAccountName.
Deprecated: Use serviceAccountName instead.

automountServiceAccountTokenoptional

boolean

AutomountServiceAccountToken indicates whether a service account token should be automatically mounted.

nodeNameoptional

string

NodeName indicates in which node this pod is scheduled.
If empty, this pod is a candidate for scheduling by the scheduler defined in schedulerName.
Once this field is set, the kubelet for this node becomes responsible for the lifecycle of this pod.
This field should not be used to express a desire for the pod to be scheduled on a specific node.
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodename

hostNetworkoptional

boolean

Host networking requested for this pod. Use the host's network namespace.
If this option is set, the ports that will be used must be specified.
Default to false.

hostPIDoptional

boolean

Use the host's pid namespace.
Optional: Default to false.

hostIPCoptional

boolean

Use the host's ipc namespace.
Optional: Default to false.

shareProcessNamespaceoptional

boolean

Share a single process namespace between all of the containers in a pod.
When this is set containers will be able to view and signal processes from other containers
in the same pod, and the first process in each container will not be assigned PID 1.
HostPID and ShareProcessNamespace cannot both be set.
Optional: Default to false.
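
The constraint that HostPID and ShareProcessNamespace cannot both be set is easy to check on the client side. The sketch below treats the pod spec as a plain dict; the function name `check_process_namespace` is hypothetical.

```python
def check_process_namespace(pod_spec):
    """Return an error message if hostPID and shareProcessNamespace are both enabled."""
    # Both fields default to false when absent.
    host_pid = pod_spec.get("hostPID", False)
    shared = pod_spec.get("shareProcessNamespace", False)
    if host_pid and shared:
        return "hostPID and shareProcessNamespace cannot both be set"
    return None
```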

securityContextoptional

SecurityContext holds pod-level security attributes and common container settings.
Optional: Defaults to empty. See type description for default values of each field.

imagePullSecretsoptional

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec.
If specified, these secrets will be passed to individual puller implementations for them to use.
More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostnameoptional

string

Specifies the hostname of the Pod.
If not specified, the pod's hostname will be set to a system-defined value.

subdomainoptional

string

If specified, the fully qualified Pod hostname will be "...svc.".
If not specified, the pod will not have a domainname at all.

affinityoptional

If specified, the pod's scheduling constraints

schedulerNameoptional

string

If specified, the pod will be dispatched by specified scheduler.
If not specified, the pod will be dispatched by default scheduler.

tolerationsoptional

Toleration array

If specified, the pod's tolerations.

hostAliasesoptional

HostAlias array

HostAliases is an optional list of hosts and IPs that will be injected into the pod's hosts
file if specified.

priorityClassNameoptional

string

If specified, indicates the pod's priority. "system-node-critical" and
"system-cluster-critical" are two special keywords which indicate the
highest priorities with the former being the highest priority. Any other
name must be defined by creating a PriorityClass object with that name.
If not specified, the pod priority will be default or zero if there is no
default.

priorityoptional

integer

The priority value. Various system components use this field to find the
priority of the pod. When Priority Admission Controller is enabled, it
prevents users from setting this field. The admission controller populates
this field from PriorityClassName.
The higher the value, the higher the priority.

dnsConfigoptional

Specifies the DNS parameters of a pod.
Parameters specified here will be merged to the generated DNS
configuration based on DNSPolicy.

readinessGatesoptional

If specified, all readiness gates will be evaluated for pod readiness.
A pod is ready when all its containers are ready AND
all conditions specified in the readiness gates have status equal to "True".
More info: https://git.k8s.io/enhancements/keps/sig-network/580-pod-readiness-gates
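
The readiness rule above is a logical AND over container readiness and gate conditions. The following sketch expresses it with simplified inputs. The function name `pod_is_ready` and the flattened data shapes (a list of booleans and a dict of condition type to status string) are assumptions made for illustration, not the Kubernetes object layout.

```python
def pod_is_ready(containers_ready, pod_conditions, readiness_gates):
    """Apply the readiness rule: all containers ready AND every gate condition is "True".

    containers_ready: list of booleans, one per container.
    pod_conditions:   dict mapping condition type to its status string.
    readiness_gates:  list of condition types named in the readiness gates.
    """
    if not all(containers_ready):
        return False
    # A gate whose condition is missing is treated as not satisfied.
    return all(pod_conditions.get(gate) == "True" for gate in readiness_gates)
```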

runtimeClassNameoptional

string

RuntimeClassName refers to a RuntimeClass object in the node.k8s.io group, which should be used
to run this pod. If no RuntimeClass resource matches the named class, the pod will not be run.
If unset or empty, the "legacy" RuntimeClass will be used, which is an implicit class with an
empty definition that uses the default runtime handler.
More info: https://git.k8s.io/enhancements/keps/sig-node/585-runtime-class

enableServiceLinksoptional

boolean

EnableServiceLinks indicates whether information about services should be injected into pod's
environment variables, matching the syntax of Docker links.
Optional: Defaults to true.

preemptionPolicyoptional

PreemptionPolicy is the Policy for preempting pods with lower priority.
One of Never, PreemptLowerPriority.
Defaults to PreemptLowerPriority if unset.

overheadoptional

Overhead represents the resource overhead associated with running a pod for a given RuntimeClass.
This field will be autopopulated at admission time by the RuntimeClass admission controller. If
the RuntimeClass admission controller is enabled, overhead must not be set in Pod create requests.
The RuntimeClass admission controller will reject Pod create requests which have the overhead already
set. If RuntimeClass is configured and selected in the PodSpec, Overhead will be set to the value
defined in the corresponding RuntimeClass, otherwise it will remain unset and treated as zero.
More info: https://git.k8s.io/enhancements/keps/sig-node/688-pod-overhead/README.md

topologySpreadConstraintsoptional

TopologySpreadConstraints describes how a group of pods ought to spread across topology
domains. Scheduler will schedule pods in a way which abides by the constraints.
All topologySpreadConstraints are ANDed.

setHostnameAsFQDNoptional

boolean

If true the pod's hostname will be configured as the pod's FQDN, rather than the leaf name (the default).
In Linux containers, this means setting the FQDN in the hostname field of the kernel (the nodename field of struct utsname).
In Windows containers, this means setting the registry value of hostname for the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters to FQDN.
If a pod does not have FQDN, this has no effect.
Default to false.

osoptional

Specifies the OS of the containers in the pod.
Some pod and container fields are restricted if this is set.

If the OS field is set to linux, the following fields must be unset:
- securityContext.windowsOptions

If the OS field is set to windows, the following fields must be unset:
- spec.hostPID
- spec.hostIPC
- spec.hostUsers
- spec.securityContext.appArmorProfile
- spec.securityContext.seLinuxOptions
- spec.securityContext.seccompProfile
- spec.securityContext.fsGroup
- spec.securityContext.fsGroupChangePolicy
- spec.securityContext.sysctls
- spec.shareProcessNamespace
- spec.securityContext.runAsUser
- spec.securityContext.runAsGroup
- spec.securityContext.supplementalGroups
- spec.securityContext.supplementalGroupsPolicy
- spec.containers[*].securityContext.appArmorProfile
- spec.containers[*].securityContext.seLinuxOptions
- spec.containers[*].securityContext.seccompProfile
- spec.containers[*].securityContext.capabilities
- spec.containers[*].securityContext.readOnlyRootFilesystem
- spec.containers[*].securityContext.privileged
- spec.containers[*].securityContext.allowPrivilegeEscalation
- spec.containers[*].securityContext.procMount
- spec.containers[*].securityContext.runAsUser
- spec.containers[*].securityContext.runAsGroup
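
The restricted-field lists above can be turned into a mechanical check. The sketch below walks a pod spec held as a plain dict and reports which listed fields are present. It assumes the OS is given as `os.name`, which this reference does not spell out; the function name `fields_to_unset` is hypothetical.

```python
WINDOWS_POD_FIELDS = ["hostPID", "hostIPC", "hostUsers", "shareProcessNamespace"]
WINDOWS_POD_SECURITY_FIELDS = [
    "appArmorProfile", "seLinuxOptions", "seccompProfile", "fsGroup",
    "fsGroupChangePolicy", "sysctls", "runAsUser", "runAsGroup",
    "supplementalGroups", "supplementalGroupsPolicy",
]
WINDOWS_CONTAINER_SECURITY_FIELDS = [
    "appArmorProfile", "seLinuxOptions", "seccompProfile", "capabilities",
    "readOnlyRootFilesystem", "privileged", "allowPrivilegeEscalation",
    "procMount", "runAsUser", "runAsGroup",
]


def fields_to_unset(pod_spec):
    """List the restricted fields that are present for the pod's declared OS."""
    os_name = (pod_spec.get("os") or {}).get("name")  # assumption: OS lives at os.name
    pod_sc = pod_spec.get("securityContext") or {}
    found = []
    if os_name == "linux":
        if "windowsOptions" in pod_sc:
            found.append("securityContext.windowsOptions")
    elif os_name == "windows":
        found += [f"spec.{f}" for f in WINDOWS_POD_FIELDS if f in pod_spec]
        found += [f"spec.securityContext.{f}"
                  for f in WINDOWS_POD_SECURITY_FIELDS if f in pod_sc]
        for i, container in enumerate(pod_spec.get("containers", [])):
            container_sc = container.get("securityContext") or {}
            found += [f"spec.containers[{i}].securityContext.{f}"
                      for f in WINDOWS_CONTAINER_SECURITY_FIELDS if f in container_sc]
    return found
```

An empty result means none of the listed fields are set for the declared OS.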

hostUsersoptional

boolean

Use the host's user namespace.
Optional: Default to true.
If set to true or not present, the pod will be run in the host user namespace, useful
for when the pod needs a feature only available to the host user namespace, such as
loading a kernel module with CAP_SYS_MODULE.
When set to false, a new userns is created for the pod. Setting false is useful for
mitigating container breakout vulnerabilities even allowing users to run their
containers as root without actually having root privileges on the host.
This field is alpha-level and is only honored by servers that enable the UserNamespacesSupport feature.

schedulingGatesoptional

SchedulingGates is an opaque list of values that if specified will block scheduling the pod.
If schedulingGates is not empty, the pod will stay in the SchedulingGated state and the
scheduler will not attempt to schedule the pod.

SchedulingGates can only be set at pod creation time, and be removed only afterwards.

resourceClaimsoptional

ResourceClaims defines which ResourceClaims must be allocated
and reserved before the Pod is allowed to start. The resources
will be made available to those containers which consume them
by name.

This is an alpha field and requires enabling the
DynamicResourceAllocation feature gate.

This field is immutable.

resourcesoptional

Resources is the total amount of CPU and Memory resources required by all
containers in the pod. It supports specifying Requests and Limits for
"cpu" and "memory" resource names only. ResourceClaims are not supported.

This field enables fine-grained control over resource allocation for the
entire pod, allowing resource sharing among containers in a pod.

This is an alpha field and requires enabling the PodLevelResources feature
gate.

minReplicasoptional

integer

Minimum number of replicas, defaults to 1 but can be set to 0 to enable scale-to-zero.

maxReplicasoptional

integer

Maximum number of replicas for autoscaling.

scaleTargetoptional

integer

ScaleTarget specifies the integer target value of the metric type the Autoscaler watches for.
concurrency and rps targets are supported by Knative Pod Autoscaler
(https://knative.dev/docs/serving/autoscaling/autoscaling-targets/).

scaleMetricoptional

ScaleMetric defines the scaling metric type watched by the autoscaler.
Possible values are concurrency, rps, cpu, and memory. concurrency and rps are supported via
Knative Pod Autoscaler (https://knative.dev/docs/serving/autoscaling/autoscaling-metrics).

scaleMetricTypeoptional

Type of metric to use. Options are Utilization, or AverageValue.

autoScalingoptional

AutoScaling is the autoscaling spec, which is backed by HPA or KEDA.

containerConcurrencyoptional

integer

ContainerConcurrency specifies how many requests can be processed concurrently. This sets the hard limit of the container
concurrency (https://knative.dev/docs/serving/autoscaling/concurrency).

timeoutoptional

integer

TimeoutSeconds specifies the number of seconds to wait before timing out a request to the component.

canaryTrafficPercentoptional

integer

CanaryTrafficPercent defines the traffic split percentage between the candidate revision and the last ready revision

loggeroptional

Activate request/response logging and logger configurations

batcheroptional

Activate request batching and batching configurations

labelsoptional

object (keys:string, values:string)

Labels that will be added to the component pod.
More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/

annotationsoptional

object (keys:string, values:string)

Annotations that will be added to the component pod.
More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/

deploymentStrategyoptional

The deployment strategy to use to replace existing pods with new ones. Only applicable for raw deployment mode.

artrequired

ExplainerConfig

authenticationRefrequired

AuthenticationRef

authenticationRef is a reference to the authentication information.
For more information, see: https://keda.sh/docs/2.17/scalers/prometheus/#authentication-parameters

authModesoptional

string

authModes defines the authentication modes for the metrics backend.
Possible values are bearer, basic, and tls.
For more information, see: https://keda.sh/docs/2.17/scalers/prometheus/#authentication-parameters

metricrequired

ExternalMetrics

metric identifies the target metric by name and selector

authenticationRefoptional

ExtMetricAuthentication

authenticationRef is a reference to the authentication information.
For more information, see: https://keda.sh/docs/2.17/scalers/prometheus/#authentication-parameters

targetrequired

MetricTarget

target specifies the target value for the given metric

backendoptional

MetricsBackend

MetricsBackend defines the scaling metric type watched by the autoscaler.
Possible values are prometheus and graphite.

serverAddressoptional

string

Address of MetricsBackend server.

queryoptional

string

Query to run to get metrics from MetricsBackend

namespaceoptional

string

For namespaced query
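
A minimal sketch of how the external metric fields might be assembled, with the backend checked against the two documented values. The helper name `make_external_metric` is hypothetical, and any server address or query passed to it is a placeholder rather than a recommended setting.

```python
METRICS_BACKENDS = {"prometheus", "graphite"}


def make_external_metric(backend, server_address, query, namespace=None):
    """Build an external metric block from the fields listed above."""
    if backend not in METRICS_BACKENDS:
        raise ValueError(f"unsupported backend: {backend}")
    metric = {
        "backend": backend,
        "serverAddress": server_address,
        "query": query,
    }
    if namespace is not None:
        metric["namespace"] = namespace  # only needed for namespaced queries
    return metric
```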

locationoptional

string

Name of component to which the failure relates (usually Pod name)

reasonoptional

FailureReason

High level class of failure

messageoptional

string

Detailed error message

modelRevisionNameoptional

string

Internal Revision/ID of model, tied to specific Spec contents

timeoptional

Time

Time failure occurred or was discovered

exitCodeoptional

integer

Exit status from the last termination of the container

ModelLoadFailed

The model failed to load within a ServingRuntime container

RuntimeUnhealthy

Corresponding ServingRuntime containers failed to start or are unhealthy

RuntimeDisabled

The ServingRuntime is disabled

NoSupportingRuntime

There is no ServingRuntime that supports the specified model type

RuntimeNotRecognized

There is no ServingRuntime defined with the specified runtime name

InvalidPredictorSpec

The current Predictor Spec is invalid or unsupported
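
When surfacing a failure to users, the reason values above can be paired with their descriptions. The sketch below does this with a lookup table over a failure-info dict that uses the `reason` and `location` field names from this reference. The function name `describe_failure` and the fallback strings are hypothetical.

```python
FAILURE_REASONS = {
    "ModelLoadFailed": "The model failed to load within a ServingRuntime container",
    "RuntimeUnhealthy": "Corresponding ServingRuntime containers failed to start or are unhealthy",
    "RuntimeDisabled": "The ServingRuntime is disabled",
    "NoSupportingRuntime": "There is no ServingRuntime that supports the specified model type",
    "RuntimeNotRecognized": "There is no ServingRuntime defined with the specified runtime name",
    "InvalidPredictorSpec": "The current Predictor Spec is invalid or unsupported",
}


def describe_failure(failure_info):
    """Render a one-line summary from a failure-info dict."""
    reason = failure_info.get("reason")
    description = FAILURE_REASONS.get(reason, "unrecognized failure reason")
    location = failure_info.get("location", "unknown location")
    return f"{reason} at {location}: {description}"
```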

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version to be used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2)

storageoptional

Storage Spec for model location

predictorrequired

PredictorSpec

Predictor defines the model serving spec

explaineroptional

ExplainerSpec

Explainer defines the model explanation service spec.
The explainer service calls the predictor, or the transformer if one is specified.

transformeroptional

TransformerSpec

Transformer defines the pre- and post-processing performed before and after the predictor call.
The transformer service calls the predictor service.
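
To show how the three components sit inside a resource, here is a hedged sketch that builds an InferenceService manifest as a Python dict. The helper name `make_inference_service`, the resource name, and the predictor contents are hypothetical; only the predictor is required, so the optional components are included only when given.

```python
def make_inference_service(name, predictor, explainer=None, transformer=None):
    """Assemble an InferenceService manifest with a required predictor."""
    spec = {"predictor": predictor}
    if explainer is not None:
        spec["explainer"] = explainer
    if transformer is not None:
        spec["transformer"] = transformer
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": spec,
    }
```

For instance, `make_inference_service("demo", {"minReplicas": 1})` returns a manifest whose spec contains only the predictor.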

observedGenerationoptional

integer

ObservedGeneration is the 'Generation' of the Service that
was last processed by the controller.

conditionsoptional

Conditions are the latest available observations of a resource's current state.

annotationsrequired

object (keys:string, values:string)

Annotations are additional Status fields for the Resource that save
additional State and convey more information to the user. This is
roughly akin to Annotations on any k8s resource, with the reconciler conveying
richer information outwards.

addressoptional

Addressable endpoint for the InferenceService

urloptional

URL holds the url that will distribute traffic over the provided traffic targets.
It generally has the form http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}

componentsrequired

object (keys:ComponentType, values:ComponentStatusSpec)

Statuses for the components of the InferenceService

modelStatusrequired

ModelStatus

Model related statuses

deploymentModerequired

string

InferenceService DeploymentMode

servingRuntimeNamerequired

string

ServingRuntimeName is the name of the ServingRuntime that the InferenceService is using

clusterServingRuntimeNamerequired

string

ClusterServingRuntimeName is the name of the ClusterServingRuntime that the InferenceService is using

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version to be used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2)

storageoptional

Storage Spec for model location

urloptional

string

URL to send logging events

modeoptional

LoggerType

Specifies the scope of the loggers.
Valid values are:
- "all" (default): log both request and response;
- "request": log only request;
- "response": log only response
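
The three logger modes map directly onto which payloads are logged. A minimal sketch, with the hypothetical function name `payloads_to_log`:

```python
PAYLOADS_BY_MODE = {
    "all": ("request", "response"),
    "request": ("request",),
    "response": ("response",),
}


def payloads_to_log(mode="all"):
    """Return which payloads a logger mode covers; "all" is the documented default."""
    if mode not in PAYLOADS_BY_MODE:
        raise ValueError(f"unsupported logger mode: {mode}")
    return PAYLOADS_BY_MODE[mode]
```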

metadataHeadersoptional

string array

Matched metadata HTTP headers for propagating to inference logger cloud events.

metadataAnnotationsoptional

string array

Matched inference service annotations for propagating to inference logger cloud events.

storageoptional

LoggerStorageSpec

Specifies the storage location for the inference logger cloud events.

pathoptional

string

The path to the object in the storage. Note that this path is relative to the storage URI.

parametersoptional

map[string]string

Parameters to override the default storage credentials and config.

keyoptional

string

The Storage Key in the secret for this object.

serviceAccountNamerequired

string

all

LogAll Logger mode to log both request and response

request

LogRequest Logger mode to log only request

response

LogResponse Logger mode to log only response

Resource

ResourceMetricSourceType is a resource metric known to Kubernetes, as
specified in requests and limits, describing each pod in the current
scale target (e.g. CPU or memory). Such metrics are built in to
Kubernetes, and have special scaling options on top of those available
to normal per-pod metrics (the "pods" source).

External

ExternalMetricSourceType is a global metric that is not associated
with any Kubernetes object. It allows autoscaling based on information
coming from components running outside of cluster
(for example length of queue in cloud messaging service, or
QPS from loadbalancer running outside of cluster).

PodMetric

PodMetricSourceType indicates a metric describing each pod in the current
scale target (for example, transactions-processed-per-second). The values
will be averaged together before being compared to the target value.

typeoptional

type represents whether the metric type is Utilization, Value, or AverageValue

valueoptional

value is the target value of the metric (as a quantity).

averageValueoptional

averageValue is the target value of the average of the
metric across all relevant pods (as a quantity)

averageUtilizationoptional

integer

averageUtilization is the target value of the average of the
resource metric across all relevant pods, represented as a percentage of
the requested value of the resource for the pods.
Currently only valid for Resource metric source type

Utilization

UtilizationMetricType declares a MetricTarget is an AverageUtilization value

Value

ValueMetricType declares a MetricTarget is a raw value

AverageValue

AverageValueMetricType declares a MetricTarget is an AverageValue
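
Each target type corresponds to one of the value fields of a metric target. The sketch below reads the field that matches the declared type from a target held as a dict. The function name `target_value` is hypothetical, and quantities are passed through as whatever value the dict holds.

```python
TARGET_FIELD_BY_TYPE = {
    "Utilization": "averageUtilization",
    "Value": "value",
    "AverageValue": "averageValue",
}


def target_value(target):
    """Return the value stored in the field that matches the target's type."""
    target_type = target.get("type")
    field = TARGET_FIELD_BY_TYPE.get(target_type)
    if field is None:
        raise ValueError(f"unknown target type: {target_type!r}")
    if field not in target:
        raise ValueError(f"type {target_type} expects {field} to be set")
    return target[field]
```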

prometheus

graphite

typerequired

MetricSourceType

type is the type of metric source. It should be one of "Resource", "External", or "PodMetric",
each mapping to a matching field in the object.

resourceoptional

ResourceMetricSource

resource refers to a resource metric (such as those specified in
requests and limits) known to Kubernetes describing each pod in the
current scale target (e.g. CPU or memory). Such metrics are built in to
Kubernetes, and have special scaling options on top of those available
to normal per-pod metrics using the "pods" source.

externaloptional

ExternalMetricSource

external refers to a global metric that is not associated
with any Kubernetes object. It allows autoscaling based on information
coming from components running outside of cluster
(for example length of queue in cloud messaging service, or
QPS from load balancer running outside of cluster).

podmetricoptional

PodMetricSource

podmetric refers to a metric describing each pod in the current scale target
(for example, transactions-processed-per-second). The values will be
averaged together before being compared to the target value.
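
Because the source type selects which of the three optional fields carries the metric definition, a reader of a metrics entry can look up the matching field by type. The sketch below assumes the type-to-field pairing implied by the field names in this reference ("PodMetric" to podmetric in particular is an inference); the function name `select_source` is hypothetical.

```python
SOURCE_FIELD_BY_TYPE = {
    "Resource": "resource",
    "External": "external",
    "PodMetric": "podmetric",
}


def select_source(metrics_spec):
    """Return the source block that matches the entry's type, or None if it is absent."""
    source_type = metrics_spec["type"]  # type is a required field
    if source_type not in SOURCE_FIELD_BY_TYPE:
        raise ValueError(f"unknown metric source type: {source_type}")
    return metrics_spec.get(SOURCE_FIELD_BY_TYPE[source_type])
```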

failedCopiesrequired

integer

0

How many copies of this predictor's models failed to load recently

totalCopiesoptional

integer

Total number of copies of this predictor's models that are currently loaded

namerequired

string

Name of the model format.

versionoptional

string

Version of the model format.
Used in validating that a predictor is supported by a runtime.
Can be "major", "major.minor" or "major.minor.patch".
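
The three accepted version shapes can be checked with a short regular expression. This sketch assumes each component is a non-negative decimal number, which the reference does not state explicitly; the function name `is_valid_format_version` is hypothetical.

```python
import re

# One to three dot-separated numeric components: major, major.minor, major.minor.patch.
_FORMAT_VERSION = re.compile(r"\d+(\.\d+){0,2}")


def is_valid_format_version(version):
    """Check that a model format version has one of the three accepted shapes."""
    return _FORMAT_VERSION.fullmatch(version) is not None
```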

activeModelStaterequired

ModelState

Pending

High level state string: Pending, Standby, Loading, Loaded, FailedToLoad

targetModelStaterequired

ModelState

modelFormatrequired

ModelFormat

ModelFormat being served.

runtimeoptional

string

Specific ClusterServingRuntime/ServingRuntime name to use for deployment.

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version to be used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2)

storageoptional

Storage Spec for model location

Pending

Model is not yet registered

Standby

Model is available but not loaded (will load when used)

Loading

Model is loading

Loaded

At least one copy of the model is loaded

FailedToLoad

All copies of the model failed to load
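
Client code that reads model status may find it convenient to mirror the five states as an enumeration. The following is a sketch only; the class and the helper `has_loaded_copy` are hypothetical names, not part of any KServe client library.

```python
from enum import Enum


class ModelState(str, Enum):
    PENDING = "Pending"               # model is not yet registered
    STANDBY = "Standby"               # available but not loaded
    LOADING = "Loading"               # model is loading
    LOADED = "Loaded"                 # at least one copy is loaded
    FAILED_TO_LOAD = "FailedToLoad"   # all copies failed to load


def has_loaded_copy(state):
    """True when the state string says at least one copy of the model is loaded."""
    return ModelState(state) is ModelState.LOADED
```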

transitionStatusrequired

TransitionStatus

UpToDate

Whether the available predictor endpoints reflect the current Spec or are in transition

statesoptional

ModelRevisionStates

State information of the predictor's model.

lastFailureInfooptional

FailureInfo

Details of the last failure, when loading of the target model has failed or is blocked.

copiesoptional

ModelCopies

Model copy information of the predictor's model.

pathoptional

string

The path to the object in the storage. Note that this path is relative to the storage URI.

parametersoptional

map[string]string

Parameters to override the default storage credentials and config.

keyoptional

string

The Storage Key in the secret for this object.

schemaPathoptional

string

The path to the model schema file in the storage.

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version to be used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2)

storageoptional

Storage Spec for model location

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version to be used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2)

storageoptional

Storage Spec for model location

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version to be used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2)

storageoptional

Storage Spec for model location

metricrequired

PodMetrics

metric identifies the target metric by name and selector

targetrequired

MetricTarget

target specifies the target value for the given metric

backendoptional

PodsMetricsBackend

Backend defines the scaling metric type watched by the autoscaler.
Possible value: opentelemetry.

serverAddressoptional

string

ServerAddress specifies the address of the PodsMetricsBackend server.

metricNamesoptional

string array

MetricNames is the list of metric names in the backend.

queryoptional

string

Query specifies the query to run to get metrics from the PodsMetricsBackend.

operationOverTimeoptional

string

OperationOverTime specifies the operation to aggregate the metrics over time.
Possible values are last_one, avg, max, min, rate, count. Default is 'last_one'.
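
Since operationOverTime is optional and has a documented default, a consumer needs to apply that default before validating. A small sketch, with the hypothetical function name `resolve_operation`:

```python
OPERATIONS = ("last_one", "avg", "max", "min", "rate", "count")


def resolve_operation(pod_metrics):
    """Return the aggregation operation, applying the documented default of last_one."""
    operation = pod_metrics.get("operationOverTime") or "last_one"
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operationOverTime: {operation}")
    return operation
```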

volumesoptional

Volume array

List of volumes that can be mounted by containers belonging to the pod.
More info: https://kubernetes.io/docs/concepts/storage/volumes

initContainersrequired

Container array

List of initialization containers belonging to the pod.
Init containers are executed in order prior to containers being started. If any
init container fails, the pod is considered to have failed and is handled according
to its restartPolicy. The name for an init container or normal container must be
unique among all containers.
Init containers may not have Lifecycle actions, Readiness probes, Liveness probes, or Startup probes.
The resourceRequirements of an init container are taken into account during scheduling
by finding the highest request/limit for each resource type, and then using the max of
that value or the sum of the normal containers. Limits are applied to init containers
in a similar fashion.
Init containers cannot currently be added or removed.
Cannot be updated.
More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

containersrequired

Container array

List of containers belonging to the pod.
Containers cannot currently be added or removed.
There must be at least one container in a Pod.
Cannot be updated.

ephemeralContainersoptional

List of ephemeral containers run in this pod. Ephemeral containers may be run in an existing
pod to perform user-initiated actions such as debugging. This list cannot be specified when
creating a pod, and it cannot be modified by updating the pod spec. In order to add an
ephemeral container to an existing pod, use the pod's ephemeralcontainers subresource.

restartPolicyoptional

Restart policy for all containers within the pod.
One of Always, OnFailure, Never. In some contexts, only a subset of those values may be permitted.
Default to Always.
More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

terminationGracePeriodSecondsoptional

integer

Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
Value must be a non-negative integer. The value zero indicates stop immediately via
the kill signal (no opportunity to shut down).
If this value is nil, the default grace period will be used instead.
The grace period is the duration in seconds after the processes running in the pod are sent
a termination signal and the time when the processes are forcibly halted with a kill signal.
Set this value longer than the expected cleanup time for your process.
Defaults to 30 seconds.

activeDeadlineSecondsoptional

integer

Optional duration in seconds the pod may be active on the node relative to
StartTime before the system will actively try to mark it failed and kill associated containers.
Value must be a positive integer.

dnsPolicyoptional

Set DNS policy for the pod.
Defaults to "ClusterFirst".
Valid values are 'ClusterFirstWithHostNet', 'ClusterFirst', 'Default' or 'None'.
DNS parameters given in DNSConfig will be merged with the policy selected with DNSPolicy.
To have DNS options set along with hostNetwork, you have to specify DNS policy
explicitly to 'ClusterFirstWithHostNet'.

nodeSelectoroptional

object (keys:string, values:string)

NodeSelector is a selector which must be true for the pod to fit on a node.
Selector which must match a node's labels for the pod to be scheduled on that node.
More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
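
A minimal sketch of the matching rule, assuming node labels and the selector are both plain string maps. The function name and the label values in the usage note are hypothetical.

```python
from typing import Dict


def node_matches(node_selector: Dict[str, str], node_labels: Dict[str, str]) -> bool:
    """True when every selector key is present on the node with the same value."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())
```

With a selector of {"accelerator": "gpu"}, a node labelled {"accelerator": "gpu", "zone": "a"} matches, while a node labelled only {"zone": "a"} does not. An empty selector matches every node.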

serviceAccountNameoptional

string

ServiceAccountName is the name of the ServiceAccount to use to run this pod.
More info: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/

serviceAccountoptional

string

DeprecatedServiceAccount is a deprecated alias for ServiceAccountName.
Deprecated: Use serviceAccountName instead.

automountServiceAccountTokenoptional

boolean

AutomountServiceAccountToken indicates whether a service account token should be automatically mounted.

nodeNameoptional

string

NodeName indicates in which node this pod is scheduled.
If empty, this pod is a candidate for scheduling by the scheduler defined in schedulerName.
Once this field is set, the kubelet for this node becomes responsible for the lifecycle of this pod.
This field should not be used to express a desire for the pod to be scheduled on a specific node.
More info: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodename

hostNetworkoptional

boolean

Host networking requested for this pod. Use the host's network namespace.
If this option is set, the ports that will be used must be specified.
Default to false.

hostPIDoptional

boolean

Use the host's pid namespace.
Optional: Default to false.

hostIPCoptional

boolean

Use the host's ipc namespace.
Optional: Default to false.

shareProcessNamespaceoptional

boolean

Share a single process namespace between all of the containers in a pod.
When this is set containers will be able to view and signal processes from other containers
in the same pod, and the first process in each container will not be assigned PID 1.
HostPID and ShareProcessNamespace cannot both be set.
Optional: Default to false.

securityContextoptional

SecurityContext holds pod-level security attributes and common container settings.
Optional: Defaults to empty. See type description for default values of each field.

imagePullSecretsoptional

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec.
If specified, these secrets will be passed to individual puller implementations for them to use.
More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostnameoptional

string

Specifies the hostname of the Pod.
If not specified, the pod's hostname will be set to a system-defined value.

subdomainoptional

string

If specified, the fully qualified Pod hostname will be "<hostname>.<subdomain>.<pod namespace>.svc.<cluster domain>".
If not specified, the pod will not have a domainname at all.

affinityoptional

If specified, the pod's scheduling constraints.

schedulerNameoptional

string

If specified, the pod will be dispatched by specified scheduler.
If not specified, the pod will be dispatched by default scheduler.

tolerationsoptional

Toleration array

If specified, the pod's tolerations.

hostAliasesoptional

HostAlias array

HostAliases is an optional list of hosts and IPs that will be injected into the pod's hosts
file if specified.

priorityClassNameoptional

string

If specified, indicates the pod's priority. "system-node-critical" and
"system-cluster-critical" are two special keywords which indicate the
highest priorities with the former being the highest priority. Any other
name must be defined by creating a PriorityClass object with that name.
If not specified, the pod priority will be default or zero if there is no
default.

priorityoptional

integer

The priority value. Various system components use this field to find the
priority of the pod. When Priority Admission Controller is enabled, it
prevents users from setting this field. The admission controller populates
this field from PriorityClassName.
The higher the value, the higher the priority.

dnsConfigoptional

Specifies the DNS parameters of a pod.
Parameters specified here will be merged to the generated DNS
configuration based on DNSPolicy.

readinessGatesoptional

If specified, all readiness gates will be evaluated for pod readiness.
A pod is ready when all its containers are ready AND
all conditions specified in the readiness gates have status equal to "True".
More info: https://git.k8s.io/enhancements/keps/sig-network/580-pod-readiness-gates
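
The readiness rule can be sketched as follows. The function name is hypothetical, readiness gates are simplified to a list of condition type names, and a gate whose condition is missing from the pod status is treated as not "True".

```python
from typing import Dict, List


def pod_is_ready(containers_ready: List[bool],
                 readiness_gates: List[str],
                 conditions: Dict[str, str]) -> bool:
    """Ready only if all containers are ready AND every gate condition is "True"."""
    if not all(containers_ready):
        return False
    return all(conditions.get(gate) == "True" for gate in readiness_gates)
```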

runtimeClassNameoptional

string

RuntimeClassName refers to a RuntimeClass object in the node.k8s.io group, which should be used
to run this pod. If no RuntimeClass resource matches the named class, the pod will not be run.
If unset or empty, the "legacy" RuntimeClass will be used, which is an implicit class with an
empty definition that uses the default runtime handler.
More info: https://git.k8s.io/enhancements/keps/sig-node/585-runtime-class

enableServiceLinksoptional

boolean

EnableServiceLinks indicates whether information about services should be injected into pod's
environment variables, matching the syntax of Docker links.
Optional: Defaults to true.

preemptionPolicyoptional

PreemptionPolicy is the Policy for preempting pods with lower priority.
One of Never, PreemptLowerPriority.
Defaults to PreemptLowerPriority if unset.

overheadoptional

Overhead represents the resource overhead associated with running a pod for a given RuntimeClass.
This field will be autopopulated at admission time by the RuntimeClass admission controller. If
the RuntimeClass admission controller is enabled, overhead must not be set in Pod create requests.
The RuntimeClass admission controller will reject Pod create requests which have the overhead already
set. If RuntimeClass is configured and selected in the PodSpec, Overhead will be set to the value
defined in the corresponding RuntimeClass, otherwise it will remain unset and treated as zero.
More info: https://git.k8s.io/enhancements/keps/sig-node/688-pod-overhead/README.md

topologySpreadConstraintsoptional

TopologySpreadConstraints describes how a group of pods ought to spread across topology
domains. Scheduler will schedule pods in a way which abides by the constraints.
All topologySpreadConstraints are ANDed.

setHostnameAsFQDNoptional

boolean

If true the pod's hostname will be configured as the pod's FQDN, rather than the leaf name (the default).
In Linux containers, this means setting the FQDN in the hostname field of the kernel (the nodename field of struct utsname).
In Windows containers, this means setting the registry value of hostname for the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters to FQDN.
If a pod does not have FQDN, this has no effect.
Default to false.

osoptional

Specifies the OS of the containers in the pod.
Some pod and container fields are restricted if this is set.

If the OS field is set to linux, the following fields must be unset:
- securityContext.windowsOptions

If the OS field is set to windows, the following fields must be unset:
- spec.hostPID
- spec.hostIPC
- spec.hostUsers
- spec.securityContext.appArmorProfile
- spec.securityContext.seLinuxOptions
- spec.securityContext.seccompProfile
- spec.securityContext.fsGroup
- spec.securityContext.fsGroupChangePolicy
- spec.securityContext.sysctls
- spec.shareProcessNamespace
- spec.securityContext.runAsUser
- spec.securityContext.runAsGroup
- spec.securityContext.supplementalGroups
- spec.securityContext.supplementalGroupsPolicy
- spec.containers[*].securityContext.appArmorProfile
- spec.containers[*].securityContext.seLinuxOptions
- spec.containers[*].securityContext.seccompProfile
- spec.containers[*].securityContext.capabilities
- spec.containers[*].securityContext.readOnlyRootFilesystem
- spec.containers[*].securityContext.privileged
- spec.containers[*].securityContext.allowPrivilegeEscalation
- spec.containers[*].securityContext.procMount
- spec.containers[*].securityContext.runAsUser
- spec.containers[*].securityContext.runAsGroup

hostUsersoptional

boolean

Use the host's user namespace.
Optional: Default to true.
If set to true or not present, the pod will be run in the host user namespace, useful
for when the pod needs a feature only available to the host user namespace, such as
loading a kernel module with CAP_SYS_MODULE.
When set to false, a new userns is created for the pod. Setting false is useful for
mitigating container breakout vulnerabilities, even allowing users to run their
containers as root without actually having root privileges on the host.
This field is alpha-level and is only honored by servers that enable the UserNamespacesSupport feature.

schedulingGatesoptional

SchedulingGates is an opaque list of values that if specified will block scheduling the pod.
If schedulingGates is not empty, the pod will stay in the SchedulingGated state and the
scheduler will not attempt to schedule the pod.

SchedulingGates can only be set at pod creation time and can only be removed afterwards.
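
The update rule above can be sketched as a check that compares the gates before and after an update. The helper names are hypothetical and gates are simplified to plain name strings.

```python
from typing import List


def added_gates(old_gates: List[str], new_gates: List[str]) -> List[str]:
    """Gates present after the update that were not there before."""
    return [gate for gate in new_gates if gate not in old_gates]


def update_is_allowed(old_gates: List[str], new_gates: List[str]) -> bool:
    """After creation, gates may be removed but never added."""
    return not added_gates(old_gates, new_gates)


def is_scheduling_gated(gates: List[str]) -> bool:
    """A pod with any remaining gate is not handed to the scheduler."""
    return len(gates) > 0
```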

resourceClaimsoptional

ResourceClaims defines which ResourceClaims must be allocated
and reserved before the Pod is allowed to start. The resources
will be made available to those containers which consume them
by name.

This is an alpha field and requires enabling the
DynamicResourceAllocation feature gate.

This field is immutable.

resourcesoptional

Resources is the total amount of CPU and Memory resources required by all
containers in the pod. It supports specifying Requests and Limits for
"cpu" and "memory" resource names only. ResourceClaims are not supported.

This field enables fine-grained control over resource allocation for the
entire pod, allowing resource sharing among containers in a pod.

This is an alpha field and requires enabling the PodLevelResources feature
gate.

opentelemetry

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2).

storageoptional

Storage Spec for model location

sklearnrequired

SKLearnSpec

Spec for SKLearn model server

xgboostrequired

XGBoostSpec

Spec for XGBoost model server

tensorflowrequired

TFServingSpec

Spec for TFServing (https://github.com/tensorflow/serving)

pytorchrequired

TorchServeSpec

Spec for TorchServe (https://pytorch.org/serve)

tritonrequired

TritonSpec

Spec for Triton Inference Server (https://github.com/triton-inference-server/server)

onnxrequired

ONNXRuntimeSpec

Spec for ONNX runtime (https://github.com/microsoft/onnxruntime)

huggingfacerequired

HuggingFaceRuntimeSpec

Spec for HuggingFace runtime (https://github.com/huggingface)

pmmlrequired

PMMLSpec

Spec for PMML (http://dmg.org/pmml/v4-1/GeneralStructure.html)

lightgbmrequired

LightGBMSpec

Spec for LightGBM model server

paddlerequired

PaddleServerSpec

Spec for Paddle model server (https://github.com/PaddlePaddle/Serving)

modelrequired

ModelSpec

Model spec for any arbitrary framework.

workerSpecrequired

WorkerSpec

WorkerSpec for enabling multi-node/multi-gpu

volumesoptional

Volume array

List of volumes that can be mounted by containers belonging to the pod.
More info: https://kubernetes.io/docs/concepts/storage/volumes

initContainersrequired

Container array

List of initialization containers belonging to the pod.
Init containers are executed in order prior to containers being started. If any
init container fails, the pod is considered to have failed and is handled according
to its restartPolicy. The name for an init container or normal container must be
unique among all containers.
Init containers may not have Lifecycle actions, Readiness probes, Liveness probes, or Startup probes.
The resourceRequirements of an init container are taken into account during scheduling
by finding the highest request/limit for each resource type, and then using the max of
that value or the sum of the normal containers. Limits are applied to init containers
in a similar fashion.
Init containers cannot currently be added or removed.
Cannot be updated.
More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

containersrequired

Container array

List of containers belonging to the pod.
Containers cannot currently be added or removed.
There must be at least one container in a Pod.
Cannot be updated.

ephemeralContainersoptional

List of ephemeral containers run in this pod. Ephemeral containers may be run in an existing
pod to perform user-initiated actions such as debugging. This list cannot be specified when
creating a pod, and it cannot be modified by updating the pod spec. In order to add an
ephemeral container to an existing pod, use the pod's ephemeralcontainers subresource.

restartPolicyoptional

Restart policy for all containers within the pod.
One of Always, OnFailure, Never. In some contexts, only a subset of those values may be permitted.
Default to Always.
More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

terminationGracePeriodSecondsoptional

integer

Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
Value must be a non-negative integer. The value zero indicates stop immediately via
the kill signal (no opportunity to shut down).
If this value is nil, the default grace period will be used instead.
The grace period is the duration in seconds after the processes running in the pod are sent
a termination signal and the time when the processes are forcibly halted with a kill signal.
Set this value longer than the expected cleanup time for your process.
Defaults to 30 seconds.

activeDeadlineSecondsoptional

integer

Optional duration in seconds the pod may be active on the node relative to
StartTime before the system will actively try to mark it failed and kill associated containers.
Value must be a positive integer.

dnsPolicyoptional

Set DNS policy for the pod.
Defaults to "ClusterFirst".
Valid values are 'ClusterFirstWithHostNet', 'ClusterFirst', 'Default' or 'None'.
DNS parameters given in DNSConfig will be merged with the policy selected with DNSPolicy.
To have DNS options set along with hostNetwork, you have to specify DNS policy
explicitly to 'ClusterFirstWithHostNet'.
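
A small check for the hostNetwork interaction might look like the sketch below. The function name and message texts are hypothetical, and the check only covers the documented default, the list of valid values, and the hostNetwork note.

```python
from typing import List, Optional

VALID_DNS_POLICIES = ("ClusterFirstWithHostNet", "ClusterFirst", "Default", "None")
DEFAULT_DNS_POLICY = "ClusterFirst"  # documented default


def dns_policy_notes(dns_policy: Optional[str] = None,
                     host_network: bool = False) -> List[str]:
    """Return human-readable notes about a dnsPolicy / hostNetwork combination."""
    policy = dns_policy or DEFAULT_DNS_POLICY
    notes: List[str] = []
    if policy not in VALID_DNS_POLICIES:
        notes.append(f"invalid dnsPolicy: {policy}")
    if host_network and policy == "ClusterFirst":
        # Cluster DNS together with hostNetwork needs the explicit policy.
        notes.append("hostNetwork is set: use dnsPolicy ClusterFirstWithHostNet")
    return notes
```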

nodeSelectoroptional

object (keys:string, values:string)

NodeSelector is a selector which must be true for the pod to fit on a node.
Selector which must match a node's labels for the pod to be scheduled on that node.
More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

serviceAccountNameoptional

string

ServiceAccountName is the name of the ServiceAccount to use to run this pod.
More info: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/

serviceAccountoptional

string

DeprecatedServiceAccount is a deprecated alias for ServiceAccountName.
Deprecated: Use serviceAccountName instead.

automountServiceAccountTokenoptional

boolean

AutomountServiceAccountToken indicates whether a service account token should be automatically mounted.

nodeNameoptional

string

NodeName indicates in which node this pod is scheduled.
If empty, this pod is a candidate for scheduling by the scheduler defined in schedulerName.
Once this field is set, the kubelet for this node becomes responsible for the lifecycle of this pod.
This field should not be used to express a desire for the pod to be scheduled on a specific node.
More info: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodename

hostNetworkoptional

boolean

Host networking requested for this pod. Use the host's network namespace.
If this option is set, the ports that will be used must be specified.
Default to false.

hostPIDoptional

boolean

Use the host's pid namespace.
Optional: Default to false.

hostIPCoptional

boolean

Use the host's ipc namespace.
Optional: Default to false.

shareProcessNamespaceoptional

boolean

Share a single process namespace between all of the containers in a pod.
When this is set containers will be able to view and signal processes from other containers
in the same pod, and the first process in each container will not be assigned PID 1.
HostPID and ShareProcessNamespace cannot both be set.
Optional: Default to false.

securityContextoptional

SecurityContext holds pod-level security attributes and common container settings.
Optional: Defaults to empty. See type description for default values of each field.

imagePullSecretsoptional

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec.
If specified, these secrets will be passed to individual puller implementations for them to use.
More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostnameoptional

string

Specifies the hostname of the Pod.
If not specified, the pod's hostname will be set to a system-defined value.

subdomainoptional

string

If specified, the fully qualified Pod hostname will be "<hostname>.<subdomain>.<pod namespace>.svc.<cluster domain>".
If not specified, the pod will not have a domainname at all.

affinityoptional

If specified, the pod's scheduling constraints.

schedulerNameoptional

string

If specified, the pod will be dispatched by specified scheduler.
If not specified, the pod will be dispatched by default scheduler.

tolerationsoptional

Toleration array

If specified, the pod's tolerations.

hostAliasesoptional

HostAlias array

HostAliases is an optional list of hosts and IPs that will be injected into the pod's hosts
file if specified.

priorityClassNameoptional

string

If specified, indicates the pod's priority. "system-node-critical" and
"system-cluster-critical" are two special keywords which indicate the
highest priorities with the former being the highest priority. Any other
name must be defined by creating a PriorityClass object with that name.
If not specified, the pod priority will be default or zero if there is no
default.

priorityoptional

integer

The priority value. Various system components use this field to find the
priority of the pod. When Priority Admission Controller is enabled, it
prevents users from setting this field. The admission controller populates
this field from PriorityClassName.
The higher the value, the higher the priority.

dnsConfigoptional

Specifies the DNS parameters of a pod.
Parameters specified here will be merged to the generated DNS
configuration based on DNSPolicy.

readinessGatesoptional

If specified, all readiness gates will be evaluated for pod readiness.
A pod is ready when all its containers are ready AND
all conditions specified in the readiness gates have status equal to "True".
More info: https://git.k8s.io/enhancements/keps/sig-network/580-pod-readiness-gates

runtimeClassNameoptional

string

RuntimeClassName refers to a RuntimeClass object in the node.k8s.io group, which should be used
to run this pod. If no RuntimeClass resource matches the named class, the pod will not be run.
If unset or empty, the "legacy" RuntimeClass will be used, which is an implicit class with an
empty definition that uses the default runtime handler.
More info: https://git.k8s.io/enhancements/keps/sig-node/585-runtime-class

enableServiceLinksoptional

boolean

EnableServiceLinks indicates whether information about services should be injected into pod's
environment variables, matching the syntax of Docker links.
Optional: Defaults to true.

preemptionPolicyoptional

PreemptionPolicy is the Policy for preempting pods with lower priority.
One of Never, PreemptLowerPriority.
Defaults to PreemptLowerPriority if unset.

overheadoptional

Overhead represents the resource overhead associated with running a pod for a given RuntimeClass.
This field will be autopopulated at admission time by the RuntimeClass admission controller. If
the RuntimeClass admission controller is enabled, overhead must not be set in Pod create requests.
The RuntimeClass admission controller will reject Pod create requests which have the overhead already
set. If RuntimeClass is configured and selected in the PodSpec, Overhead will be set to the value
defined in the corresponding RuntimeClass, otherwise it will remain unset and treated as zero.
More info: https://git.k8s.io/enhancements/keps/sig-node/688-pod-overhead/README.md

topologySpreadConstraintsoptional

TopologySpreadConstraints describes how a group of pods ought to spread across topology
domains. Scheduler will schedule pods in a way which abides by the constraints.
All topologySpreadConstraints are ANDed.

setHostnameAsFQDNoptional

boolean

If true the pod's hostname will be configured as the pod's FQDN, rather than the leaf name (the default).
In Linux containers, this means setting the FQDN in the hostname field of the kernel (the nodename field of struct utsname).
In Windows containers, this means setting the registry value of hostname for the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters to FQDN.
If a pod does not have FQDN, this has no effect.
Default to false.

osoptional

Specifies the OS of the containers in the pod.
Some pod and container fields are restricted if this is set.

If the OS field is set to linux, the following fields must be unset:
- securityContext.windowsOptions

If the OS field is set to windows, the following fields must be unset:
- spec.hostPID
- spec.hostIPC
- spec.hostUsers
- spec.securityContext.appArmorProfile
- spec.securityContext.seLinuxOptions
- spec.securityContext.seccompProfile
- spec.securityContext.fsGroup
- spec.securityContext.fsGroupChangePolicy
- spec.securityContext.sysctls
- spec.shareProcessNamespace
- spec.securityContext.runAsUser
- spec.securityContext.runAsGroup
- spec.securityContext.supplementalGroups
- spec.securityContext.supplementalGroupsPolicy
- spec.containers[*].securityContext.appArmorProfile
- spec.containers[*].securityContext.seLinuxOptions
- spec.containers[*].securityContext.seccompProfile
- spec.containers[*].securityContext.capabilities
- spec.containers[*].securityContext.readOnlyRootFilesystem
- spec.containers[*].securityContext.privileged
- spec.containers[*].securityContext.allowPrivilegeEscalation
- spec.containers[*].securityContext.procMount
- spec.containers[*].securityContext.runAsUser
- spec.containers[*].securityContext.runAsGroup
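
The restrictions listed above can be turned into a simple lint over a pod spec held as a dictionary. This is a sketch, not the API server's validation: the function name is hypothetical, the OS value is assumed to live under os.name, a field counts as set when it is present and not None, and only the fields named in the lists above are checked.

```python
from typing import Any, Dict, List

WINDOWS_FORBIDDEN_POD_FIELDS = ("hostPID", "hostIPC", "hostUsers", "shareProcessNamespace")
WINDOWS_FORBIDDEN_POD_SECURITY_FIELDS = (
    "appArmorProfile", "seLinuxOptions", "seccompProfile", "fsGroup",
    "fsGroupChangePolicy", "sysctls", "runAsUser", "runAsGroup",
    "supplementalGroups", "supplementalGroupsPolicy",
)
WINDOWS_FORBIDDEN_CONTAINER_SECURITY_FIELDS = (
    "appArmorProfile", "seLinuxOptions", "seccompProfile", "capabilities",
    "readOnlyRootFilesystem", "privileged", "allowPrivilegeEscalation",
    "procMount", "runAsUser", "runAsGroup",
)


def os_field_violations(pod_spec: Dict[str, Any]) -> List[str]:
    """List the fields that are set although the chosen OS requires them unset."""
    os_name = (pod_spec.get("os") or {}).get("name")  # assumed location of the OS value
    pod_sc = pod_spec.get("securityContext") or {}
    violations: List[str] = []
    if os_name == "linux":
        if pod_sc.get("windowsOptions") is not None:
            violations.append("spec.securityContext.windowsOptions")
    elif os_name == "windows":
        for field in WINDOWS_FORBIDDEN_POD_FIELDS:
            if pod_spec.get(field) is not None:
                violations.append(f"spec.{field}")
        for field in WINDOWS_FORBIDDEN_POD_SECURITY_FIELDS:
            if pod_sc.get(field) is not None:
                violations.append(f"spec.securityContext.{field}")
        for index, container in enumerate(pod_spec.get("containers") or []):
            container_sc = container.get("securityContext") or {}
            for field in WINDOWS_FORBIDDEN_CONTAINER_SECURITY_FIELDS:
                if container_sc.get(field) is not None:
                    violations.append(f"spec.containers[{index}].securityContext.{field}")
    return violations
```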

hostUsersoptional

boolean

Use the host's user namespace.
Optional: Default to true.
If set to true or not present, the pod will be run in the host user namespace, useful
for when the pod needs a feature only available to the host user namespace, such as
loading a kernel module with CAP_SYS_MODULE.
When set to false, a new userns is created for the pod. Setting false is useful for
mitigating container breakout vulnerabilities, even allowing users to run their
containers as root without actually having root privileges on the host.
This field is alpha-level and is only honored by servers that enable the UserNamespacesSupport feature.

schedulingGatesoptional

SchedulingGates is an opaque list of values that if specified will block scheduling the pod.
If schedulingGates is not empty, the pod will stay in the SchedulingGated state and the
scheduler will not attempt to schedule the pod.

SchedulingGates can only be set at pod creation time and can only be removed afterwards.

resourceClaimsoptional

ResourceClaims defines which ResourceClaims must be allocated
and reserved before the Pod is allowed to start. The resources
will be made available to those containers which consume them
by name.

This is an alpha field and requires enabling the
DynamicResourceAllocation feature gate.

This field is immutable.

resourcesoptional

Resources is the total amount of CPU and Memory resources required by all
containers in the pod. It supports specifying Requests and Limits for
"cpu" and "memory" resource names only. ResourceClaims are not supported.

This field enables fine-grained control over resource allocation for the
entire pod, allowing resource sharing among containers in a pod.

This is an alpha field and requires enabling the PodLevelResources feature
gate.

minReplicasoptional

integer

Minimum number of replicas, defaults to 1 but can be set to 0 to enable scale-to-zero.

maxReplicasoptional

integer

Maximum number of replicas for autoscaling.

scaleTargetoptional

integer

ScaleTarget specifies the integer target value of the metric type the Autoscaler watches for.
The concurrency and rps targets are supported by Knative Pod Autoscaler
(https://knative.dev/docs/serving/autoscaling/autoscaling-targets/).
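
To show how scaleTarget, minReplicas, and maxReplicas relate, here is a deliberately simplified calculation. It is not the Knative Pod Autoscaler algorithm (which also uses averaging windows and panic mode); the function name is hypothetical and an unset maxReplicas is modelled as None, meaning no upper bound.

```python
import math
from typing import Optional


def desired_replicas(observed_total: float,
                     scale_target: int,
                     min_replicas: int = 1,
                     max_replicas: Optional[int] = None) -> int:
    """Replicas needed so each one handles about scale_target of the observed load."""
    if scale_target <= 0:
        raise ValueError("scale_target must be positive")
    wanted = math.ceil(observed_total / scale_target)
    wanted = max(wanted, min_replicas)      # never drop below minReplicas
    if max_replicas is not None:
        wanted = min(wanted, max_replicas)  # never exceed maxReplicas
    return wanted
```

With a target of 100 and an observed total of 250, this yields 3 replicas. With no load and minReplicas set to 0, it yields 0, which corresponds to scale-to-zero.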

scaleMetricoptional

ScaleMetric defines the scaling metric type watched by the autoscaler.
Possible values are concurrency, rps, cpu, memory. The concurrency and rps metrics are supported via
Knative Pod Autoscaler (https://knative.dev/docs/serving/autoscaling/autoscaling-metrics).

scaleMetricTypeoptional

Type of metric to use. Options are Utilization or AverageValue.

autoScalingoptional

AutoScaling is the autoscaling spec, which is backed by HPA or KEDA.

containerConcurrencyoptional

integer

ContainerConcurrency specifies how many requests can be processed concurrently. This sets the hard limit of the container
concurrency (https://knative.dev/docs/serving/autoscaling/concurrency).

timeoutoptional

integer

TimeoutSeconds specifies the number of seconds to wait before timing out a request to the component.

canaryTrafficPercentoptional

integer

CanaryTrafficPercent defines the traffic split percentage between the candidate revision and the last ready revision.
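
The split can be sketched as simple arithmetic. The function name and the dictionary keys are hypothetical, and routing all traffic to the candidate revision when the field is unset is an assumption rather than something stated here.

```python
from typing import Dict, Optional


def traffic_split(canary_traffic_percent: Optional[int]) -> Dict[str, int]:
    """Percent of traffic for the candidate revision and the last ready revision."""
    if canary_traffic_percent is None:
        # Assumption: with no canary percentage, the candidate takes everything.
        return {"candidate": 100, "last_ready": 0}
    if not 0 <= canary_traffic_percent <= 100:
        raise ValueError("canaryTrafficPercent must be between 0 and 100")
    return {"candidate": canary_traffic_percent,
            "last_ready": 100 - canary_traffic_percent}
```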

loggeroptional

Activate request/response logging and logger configurations

batcheroptional

Activate request batching and batching configurations

labelsoptional

object (keys:string, values:string)

Labels that will be added to the component pod.
More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/

annotationsoptional

object (keys:string, values:string)

Annotations that will be added to the component pod.
More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/

deploymentStrategyoptional

The deployment strategy to use to replace existing pods with new ones. Only applicable for raw deployment mode.

cpuLimitrequired

string

memoryLimitrequired

string

cpuRequestrequired

string

memoryRequestrequired

string

cpu

memory

namerequired

ResourceMetric

name is the name of the resource in question.

targetrequired

MetricTarget

target specifies the target value for the given metric

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2).

storageoptional

Storage Spec for model location

cpu

memory

concurrency

rps

pathoptional

string

The path to the object in the storage. Note that this path is relative to the storage URI.

parametersoptional

map[string]string

Parameters to override the default storage credentials and config.

keyoptional

string

The Storage Key in the secret for this object.

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2).

storageoptional

Storage Spec for model location

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2).

storageoptional

Storage Spec for model location

volumesoptional

Volume array

List of volumes that can be mounted by containers belonging to the pod.
More info: https://kubernetes.io/docs/concepts/storage/volumes

initContainersrequired

Container array

List of initialization containers belonging to the pod.
Init containers are executed in order prior to containers being started. If any
init container fails, the pod is considered to have failed and is handled according
to its restartPolicy. The name for an init container or normal container must be
unique among all containers.
Init containers may not have Lifecycle actions, Readiness probes, Liveness probes, or Startup probes.
The resourceRequirements of an init container are taken into account during scheduling
by finding the highest request/limit for each resource type, and then using the max of
that value or the sum of the normal containers. Limits are applied to init containers
in a similar fashion.
Init containers cannot currently be added or removed.
Cannot be updated.
More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

containersrequired

Container array

List of containers belonging to the pod.
Containers cannot currently be added or removed.
There must be at least one container in a Pod.
Cannot be updated.

ephemeralContainersoptional

List of ephemeral containers run in this pod. Ephemeral containers may be run in an existing
pod to perform user-initiated actions such as debugging. This list cannot be specified when
creating a pod, and it cannot be modified by updating the pod spec. In order to add an
ephemeral container to an existing pod, use the pod's ephemeralcontainers subresource.

restartPolicyoptional

Restart policy for all containers within the pod.
One of Always, OnFailure, Never. In some contexts, only a subset of those values may be permitted.
Default to Always.
More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

terminationGracePeriodSecondsoptional

integer

Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
Value must be a non-negative integer. The value zero indicates stop immediately via
the kill signal (no opportunity to shut down).
If this value is nil, the default grace period will be used instead.
The grace period is the duration in seconds after the processes running in the pod are sent
a termination signal and the time when the processes are forcibly halted with a kill signal.
Set this value longer than the expected cleanup time for your process.
Defaults to 30 seconds.

activeDeadlineSecondsoptional

integer

Optional duration in seconds the pod may be active on the node relative to
StartTime before the system will actively try to mark it failed and kill associated containers.
Value must be a positive integer.

dnsPolicyoptional

Set DNS policy for the pod.
Defaults to "ClusterFirst".
Valid values are 'ClusterFirstWithHostNet', 'ClusterFirst', 'Default' or 'None'.
DNS parameters given in DNSConfig will be merged with the policy selected with DNSPolicy.
To have DNS options set along with hostNetwork, you have to specify DNS policy
explicitly to 'ClusterFirstWithHostNet'.

nodeSelectoroptional

object (keys:string, values:string)

NodeSelector is a selector which must be true for the pod to fit on a node.
Selector which must match a node's labels for the pod to be scheduled on that node.
More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
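As a minimal sketch of the matching rule described above: a node fits only if every key/value pair in the selector appears in the node's labels. The helper `node_matches` is hypothetical, and the label keys are examples only.

```python
from typing import Dict


def node_matches(node_selector: Dict[str, str], node_labels: Dict[str, str]) -> bool:
    """Return True when every selector entry is present, with the same value, in the node labels."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


labels = {"kubernetes.io/os": "linux", "accelerator": "gpu"}
fits = node_matches({"accelerator": "gpu"}, labels)
```

An empty selector places no constraint, so it matches every node.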

serviceAccountNameoptional

string

ServiceAccountName is the name of the ServiceAccount to use to run this pod.
More info: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/

serviceAccountoptional

string

DeprecatedServiceAccount is a deprecated alias for ServiceAccountName.
Deprecated: Use serviceAccountName instead.

automountServiceAccountTokenoptional

boolean

AutomountServiceAccountToken indicates whether a service account token should be automatically mounted.

nodeNameoptional

string

NodeName indicates in which node this pod is scheduled.
If empty, this pod is a candidate for scheduling by the scheduler defined in schedulerName.
Once this field is set, the kubelet for this node becomes responsible for the lifecycle of this pod.
This field should not be used to express a desire for the pod to be scheduled on a specific node.
More info: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodename

hostNetworkoptional

boolean

Host networking requested for this pod. Use the host's network namespace.
If this option is set, the ports that will be used must be specified.
Defaults to false.

hostPIDoptional

boolean

Use the host's pid namespace.
Optional: Default to false.

hostIPCoptional

boolean

Use the host's ipc namespace.
Optional: Default to false.

shareProcessNamespaceoptional

boolean

Share a single process namespace between all of the containers in a pod.
When this is set containers will be able to view and signal processes from other containers
in the same pod, and the first process in each container will not be assigned PID 1.
HostPID and ShareProcessNamespace cannot both be set.
Optional: Default to false.

securityContextoptional

SecurityContext holds pod-level security attributes and common container settings.
Optional: Defaults to empty. See type description for default values of each field.

imagePullSecretsoptional

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec.
If specified, these secrets will be passed to individual puller implementations for them to use.
More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostnameoptional

string

Specifies the hostname of the Pod.
If not specified, the pod's hostname will be set to a system-defined value.

subdomainoptional

string

If specified, the fully qualified Pod hostname will be "<hostname>.<subdomain>.<pod namespace>.svc.<cluster domain>".
If not specified, the pod will not have a domainname at all.
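For illustration, upstream Kubernetes composes the fully qualified name from the hostname, subdomain, pod namespace, the literal `svc`, and the cluster domain. The sketch below assumes a cluster domain of `cluster.local`, which is a common default but depends on the cluster. The helper `pod_fqdn` is hypothetical.

```python
from typing import Optional


def pod_fqdn(hostname: str, subdomain: Optional[str], namespace: str,
             cluster_domain: str = "cluster.local") -> Optional[str]:
    """Build the pod FQDN, or return None when no subdomain is set (no domain name at all)."""
    if not subdomain:
        return None
    return f"{hostname}.{subdomain}.{namespace}.svc.{cluster_domain}"


fqdn = pod_fqdn("worker-0", "llm-workers", "serving")
```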

affinityoptional

If specified, the pod's scheduling constraints.

schedulerNameoptional

string

If specified, the pod will be dispatched by specified scheduler.
If not specified, the pod will be dispatched by default scheduler.

tolerationsoptional

Toleration array

If specified, the pod's tolerations.

hostAliasesoptional

HostAlias array

HostAliases is an optional list of hosts and IPs that will be injected into the pod's hosts
file if specified.

priorityClassNameoptional

string

If specified, indicates the pod's priority. "system-node-critical" and
"system-cluster-critical" are two special keywords which indicate the
highest priorities with the former being the highest priority. Any other
name must be defined by creating a PriorityClass object with that name.
If not specified, the pod priority will be default or zero if there is no
default.

priorityoptional

integer

The priority value. Various system components use this field to find the
priority of the pod. When Priority Admission Controller is enabled, it
prevents users from setting this field. The admission controller populates
this field from PriorityClassName.
The higher the value, the higher the priority.

dnsConfigoptional

Specifies the DNS parameters of a pod.
Parameters specified here will be merged to the generated DNS
configuration based on DNSPolicy.

readinessGatesoptional

If specified, all readiness gates will be evaluated for pod readiness.
A pod is ready when all its containers are ready AND
all conditions specified in the readiness gates have status equal to "True".
More info: https://git.k8s.io/enhancements/keps/sig-network/580-pod-readiness-gates
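The readiness rule above is a plain conjunction, sketched below. The function name and the condition type `example.com/feature-ready` are hypothetical. A gate whose condition is missing from the pod status is treated as not "True", which matches how Kubernetes defaults a missing condition to "False".

```python
from typing import Dict, List


def pod_is_ready(containers_ready: List[bool],
                 readiness_gates: List[str],
                 conditions: Dict[str, str]) -> bool:
    """Ready only if all containers are ready AND every gate condition has status "True"."""
    gates_ok = all(conditions.get(gate) == "True" for gate in readiness_gates)
    return all(containers_ready) and gates_ok


ready = pod_is_ready(
    containers_ready=[True, True],
    readiness_gates=["example.com/feature-ready"],
    conditions={"example.com/feature-ready": "True"},
)
```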

runtimeClassNameoptional

string

RuntimeClassName refers to a RuntimeClass object in the node.k8s.io group, which should be used
to run this pod. If no RuntimeClass resource matches the named class, the pod will not be run.
If unset or empty, the "legacy" RuntimeClass will be used, which is an implicit class with an
empty definition that uses the default runtime handler.
More info: https://git.k8s.io/enhancements/keps/sig-node/585-runtime-class

enableServiceLinksoptional

boolean

EnableServiceLinks indicates whether information about services should be injected into pod's
environment variables, matching the syntax of Docker links.
Optional: Defaults to true.

preemptionPolicyoptional

PreemptionPolicy is the Policy for preempting pods with lower priority.
One of Never, PreemptLowerPriority.
Defaults to PreemptLowerPriority if unset.

overheadoptional

Overhead represents the resource overhead associated with running a pod for a given RuntimeClass.
This field will be autopopulated at admission time by the RuntimeClass admission controller. If
the RuntimeClass admission controller is enabled, overhead must not be set in Pod create requests.
The RuntimeClass admission controller will reject Pod create requests which have the overhead already
set. If RuntimeClass is configured and selected in the PodSpec, Overhead will be set to the value
defined in the corresponding RuntimeClass, otherwise it will remain unset and treated as zero.
More info: https://git.k8s.io/enhancements/keps/sig-node/688-pod-overhead/README.md

topologySpreadConstraintsoptional

TopologySpreadConstraints describes how a group of pods ought to spread across topology
domains. Scheduler will schedule pods in a way which abides by the constraints.
All topologySpreadConstraints are ANDed.

setHostnameAsFQDNoptional

boolean

If true the pod's hostname will be configured as the pod's FQDN, rather than the leaf name (the default).
In Linux containers, this means setting the FQDN in the hostname field of the kernel (the nodename field of struct utsname).
In Windows containers, this means setting the registry value of hostname for the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters to FQDN.
If a pod does not have FQDN, this has no effect.
Defaults to false.

osoptional

Specifies the OS of the containers in the pod.
Some pod and container fields are restricted if this is set.

If the OS field is set to linux, the following fields must be unset:
- securityContext.windowsOptions

If the OS field is set to windows, the following fields must be unset:
- spec.hostPID
- spec.hostIPC
- spec.hostUsers
- spec.securityContext.appArmorProfile
- spec.securityContext.seLinuxOptions
- spec.securityContext.seccompProfile
- spec.securityContext.fsGroup
- spec.securityContext.fsGroupChangePolicy
- spec.securityContext.sysctls
- spec.shareProcessNamespace
- spec.securityContext.runAsUser
- spec.securityContext.runAsGroup
- spec.securityContext.supplementalGroups
- spec.securityContext.supplementalGroupsPolicy
- spec.containers[*].securityContext.appArmorProfile
- spec.containers[*].securityContext.seLinuxOptions
- spec.containers[*].securityContext.seccompProfile
- spec.containers[*].securityContext.capabilities
- spec.containers[*].securityContext.readOnlyRootFilesystem
- spec.containers[*].securityContext.privileged
- spec.containers[*].securityContext.allowPrivilegeEscalation
- spec.containers[*].securityContext.procMount
- spec.containers[*].securityContext.runAsUser
- spec.containers[*].securityContext.runAsGroup

hostUsersoptional

boolean

Use the host's user namespace.
Optional: Default to true.
If set to true or not present, the pod will be run in the host user namespace, useful
for when the pod needs a feature only available to the host user namespace, such as
loading a kernel module with CAP_SYS_MODULE.
When set to false, a new userns is created for the pod. Setting false is useful for
mitigating container breakout vulnerabilities, even allowing users to run their
containers as root without actually having root privileges on the host.
This field is alpha-level and is only honored by servers that enable the UserNamespacesSupport feature.

schedulingGatesoptional

SchedulingGates is an opaque list of values that if specified will block scheduling the pod.
If schedulingGates is not empty, the pod will stay in the SchedulingGated state and the
scheduler will not attempt to schedule the pod.

SchedulingGates can be set only at pod creation time and can only be removed afterwards.

resourceClaimsoptional

ResourceClaims defines which ResourceClaims must be allocated
and reserved before the Pod is allowed to start. The resources
will be made available to those containers which consume them
by name.

This is an alpha field and requires enabling the
DynamicResourceAllocation feature gate.

This field is immutable.

resourcesoptional

Resources is the total amount of CPU and Memory resources required by all
containers in the pod. It supports specifying Requests and Limits for
"cpu" and "memory" resource names only. ResourceClaims are not supported.

This field enables fine-grained control over resource allocation for the
entire pod, allowing resource sharing among containers in a pod.

This is an alpha field and requires enabling the PodLevelResources feature
gate.

minReplicasoptional

integer

Minimum number of replicas, defaults to 1 but can be set to 0 to enable scale-to-zero.

maxReplicasoptional

integer

Maximum number of replicas for autoscaling.

scaleTargetoptional

integer

ScaleTarget specifies the integer target value of the metric type the Autoscaler watches for.
The concurrency and rps targets are supported by the Knative Pod Autoscaler
(https://knative.dev/docs/serving/autoscaling/autoscaling-targets/).

scaleMetricoptional

ScaleMetric defines the scaling metric type watched by autoscaler.
Possible values are concurrency, rps, cpu, and memory. The concurrency and rps metrics are supported via the
Knative Pod Autoscaler (https://knative.dev/docs/serving/autoscaling/autoscaling-metrics).
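To show how minReplicas, maxReplicas, and scaleTarget interact, here is a deliberately simplified sketch: divide the observed metric by the per-replica target, round up, and clamp to the configured bounds. Real autoscalers (Knative Pod Autoscaler, HPA, KEDA) add averaging windows, panic modes, and stabilization, none of which is modelled here. The function name `desired_replicas` is hypothetical.

```python
import math
from typing import Optional


def desired_replicas(observed: float, scale_target: int,
                     min_replicas: int = 1,
                     max_replicas: Optional[int] = None) -> int:
    """Simplified replica count: ceil(observed / target), clamped to [min, max]."""
    if scale_target <= 0:
        raise ValueError("scale_target must be positive")
    wanted = math.ceil(observed / scale_target)
    wanted = max(wanted, min_replicas)
    if max_replicas is not None:  # None is used here to mean "no upper bound" (assumption)
        wanted = min(wanted, max_replicas)
    return wanted


# 45 concurrent requests with a target of 10 per replica, capped at 3 replicas.
replicas = desired_replicas(observed=45, scale_target=10, min_replicas=1, max_replicas=3)
```

With minReplicas set to 0, the same calculation yields zero replicas when no load is observed, which is the scale-to-zero case.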

scaleMetricTypeoptional

Type of metric to use. Options are Utilization, or AverageValue.

autoScalingoptional

AutoScaling is the autoscaling spec, which is backed by HPA or KEDA.

containerConcurrencyoptional

integer

ContainerConcurrency specifies how many requests can be processed concurrently. This sets the hard limit of the container
concurrency (https://knative.dev/docs/serving/autoscaling/concurrency).

timeoutoptional

integer

TimeoutSeconds specifies the number of seconds to wait before timing out a request to the component.

canaryTrafficPercentoptional

integer

CanaryTrafficPercent defines the traffic split percentage between the candidate revision and the last ready revision.
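As a small illustration of the split, assuming the percentage is the share routed to the candidate revision and the remainder goes to the last ready revision: the helper `traffic_split` and its dictionary keys are hypothetical names chosen for this sketch.

```python
from typing import Dict


def traffic_split(canary_traffic_percent: int) -> Dict[str, int]:
    """Return the percentage of traffic for the candidate and the last ready revision."""
    if not 0 <= canary_traffic_percent <= 100:
        raise ValueError("canary_traffic_percent must be between 0 and 100")
    return {
        "candidate": canary_traffic_percent,
        "last_ready": 100 - canary_traffic_percent,
    }


split = traffic_split(10)
```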

loggeroptional

Activates request/response logging and holds the logger configuration.

batcheroptional

Activates request batching and holds the batching configuration.

labelsoptional

object (keys:string, values:string)

Labels that will be added to the component pod.
More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/

annotationsoptional

object (keys:string, values:string)

Annotations that will be added to the component pod.
More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/

deploymentStrategyoptional

The deployment strategy to use to replace existing pods with new ones. Only applicable for raw deployment mode.

UpToDate

Predictor is up-to-date (reflects current spec)

InProgress

Waiting for target model to reach state of active model

BlockedByFailedLoad

Target model failed to load

InvalidSpec

Target predictor spec failed validation

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2).

storageoptional

Storage Spec for model location

volumesoptional

Volume array

List of volumes that can be mounted by containers belonging to the pod.
More info: https://kubernetes.io/docs/concepts/storage/volumes

initContainersrequired

Container array

List of initialization containers belonging to the pod.
Init containers are executed in order prior to containers being started. If any
init container fails, the pod is considered to have failed and is handled according
to its restartPolicy. The name for an init container or normal container must be
unique among all containers.
Init containers may not have Lifecycle actions, Readiness probes, Liveness probes, or Startup probes.
The resourceRequirements of an init container are taken into account during scheduling
by finding the highest request/limit for each resource type, and then using the max of
that value or the sum of the normal containers. Limits are applied to init containers
in a similar fashion.
Init containers cannot currently be added or removed.
Cannot be updated.
More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

containersrequired

Container array

List of containers belonging to the pod.
Containers cannot currently be added or removed.
There must be at least one container in a Pod.
Cannot be updated.

ephemeralContainersoptional

List of ephemeral containers run in this pod. Ephemeral containers may be run in an existing
pod to perform user-initiated actions such as debugging. This list cannot be specified when
creating a pod, and it cannot be modified by updating the pod spec. In order to add an
ephemeral container to an existing pod, use the pod's ephemeralcontainers subresource.

restartPolicyoptional

Restart policy for all containers within the pod.
One of Always, OnFailure, Never. In some contexts, only a subset of those values may be permitted.
Defaults to Always.
More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

terminationGracePeriodSecondsoptional

integer

Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
Value must be non-negative integer. The value zero indicates stop immediately via
the kill signal (no opportunity to shut down).
If this value is nil, the default grace period will be used instead.
The grace period is the duration in seconds after the processes running in the pod are sent
a termination signal and the time when the processes are forcibly halted with a kill signal.
Set this value longer than the expected cleanup time for your process.
Defaults to 30 seconds.

activeDeadlineSecondsoptional

integer

Optional duration in seconds the pod may be active on the node relative to
StartTime before the system will actively try to mark it failed and kill associated containers.
Value must be a positive integer.

dnsPolicyoptional

Set DNS policy for the pod.
Defaults to "ClusterFirst".
Valid values are 'ClusterFirstWithHostNet', 'ClusterFirst', 'Default' or 'None'.
DNS parameters given in DNSConfig will be merged with the policy selected with DNSPolicy.
To have DNS options set along with hostNetwork, you have to specify DNS policy
explicitly to 'ClusterFirstWithHostNet'.

nodeSelectoroptional

object (keys:string, values:string)

NodeSelector is a selector which must be true for the pod to fit on a node.
Selector which must match a node's labels for the pod to be scheduled on that node.
More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

serviceAccountNameoptional

string

ServiceAccountName is the name of the ServiceAccount to use to run this pod.
More info: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/

serviceAccountoptional

string

DeprecatedServiceAccount is a deprecated alias for ServiceAccountName.
Deprecated: Use serviceAccountName instead.

automountServiceAccountTokenoptional

boolean

AutomountServiceAccountToken indicates whether a service account token should be automatically mounted.

nodeNameoptional

string

NodeName indicates in which node this pod is scheduled.
If empty, this pod is a candidate for scheduling by the scheduler defined in schedulerName.
Once this field is set, the kubelet for this node becomes responsible for the lifecycle of this pod.
This field should not be used to express a desire for the pod to be scheduled on a specific node.
More info: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodename

hostNetworkoptional

boolean

Host networking requested for this pod. Use the host's network namespace.
If this option is set, the ports that will be used must be specified.
Defaults to false.

hostPIDoptional

boolean

Use the host's pid namespace.
Optional: Default to false.

hostIPCoptional

boolean

Use the host's ipc namespace.
Optional: Default to false.

shareProcessNamespaceoptional

boolean

Share a single process namespace between all of the containers in a pod.
When this is set containers will be able to view and signal processes from other containers
in the same pod, and the first process in each container will not be assigned PID 1.
HostPID and ShareProcessNamespace cannot both be set.
Optional: Default to false.

securityContextoptional

SecurityContext holds pod-level security attributes and common container settings.
Optional: Defaults to empty. See type description for default values of each field.

imagePullSecretsoptional

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec.
If specified, these secrets will be passed to individual puller implementations for them to use.
More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostnameoptional

string

Specifies the hostname of the Pod.
If not specified, the pod's hostname will be set to a system-defined value.

subdomainoptional

string

If specified, the fully qualified Pod hostname will be "<hostname>.<subdomain>.<pod namespace>.svc.<cluster domain>".
If not specified, the pod will not have a domainname at all.

affinityoptional

If specified, the pod's scheduling constraints.

schedulerNameoptional

string

If specified, the pod will be dispatched by specified scheduler.
If not specified, the pod will be dispatched by default scheduler.

tolerationsoptional

Toleration array

If specified, the pod's tolerations.

hostAliasesoptional

HostAlias array

HostAliases is an optional list of hosts and IPs that will be injected into the pod's hosts
file if specified.

priorityClassNameoptional

string

If specified, indicates the pod's priority. "system-node-critical" and
"system-cluster-critical" are two special keywords which indicate the
highest priorities with the former being the highest priority. Any other
name must be defined by creating a PriorityClass object with that name.
If not specified, the pod priority will be default or zero if there is no
default.

priorityoptional

integer

The priority value. Various system components use this field to find the
priority of the pod. When Priority Admission Controller is enabled, it
prevents users from setting this field. The admission controller populates
this field from PriorityClassName.
The higher the value, the higher the priority.

dnsConfigoptional

Specifies the DNS parameters of a pod.
Parameters specified here will be merged to the generated DNS
configuration based on DNSPolicy.

readinessGatesoptional

If specified, all readiness gates will be evaluated for pod readiness.
A pod is ready when all its containers are ready AND
all conditions specified in the readiness gates have status equal to "True".
More info: https://git.k8s.io/enhancements/keps/sig-network/580-pod-readiness-gates

runtimeClassNameoptional

string

RuntimeClassName refers to a RuntimeClass object in the node.k8s.io group, which should be used
to run this pod. If no RuntimeClass resource matches the named class, the pod will not be run.
If unset or empty, the "legacy" RuntimeClass will be used, which is an implicit class with an
empty definition that uses the default runtime handler.
More info: https://git.k8s.io/enhancements/keps/sig-node/585-runtime-class

enableServiceLinksoptional

boolean

EnableServiceLinks indicates whether information about services should be injected into pod's
environment variables, matching the syntax of Docker links.
Optional: Defaults to true.

preemptionPolicyoptional

PreemptionPolicy is the Policy for preempting pods with lower priority.
One of Never, PreemptLowerPriority.
Defaults to PreemptLowerPriority if unset.

overheadoptional

Overhead represents the resource overhead associated with running a pod for a given RuntimeClass.
This field will be autopopulated at admission time by the RuntimeClass admission controller. If
the RuntimeClass admission controller is enabled, overhead must not be set in Pod create requests.
The RuntimeClass admission controller will reject Pod create requests which have the overhead already
set. If RuntimeClass is configured and selected in the PodSpec, Overhead will be set to the value
defined in the corresponding RuntimeClass, otherwise it will remain unset and treated as zero.
More info: https://git.k8s.io/enhancements/keps/sig-node/688-pod-overhead/README.md

topologySpreadConstraintsoptional

TopologySpreadConstraints describes how a group of pods ought to spread across topology
domains. Scheduler will schedule pods in a way which abides by the constraints.
All topologySpreadConstraints are ANDed.

setHostnameAsFQDNoptional

boolean

If true the pod's hostname will be configured as the pod's FQDN, rather than the leaf name (the default).
In Linux containers, this means setting the FQDN in the hostname field of the kernel (the nodename field of struct utsname).
In Windows containers, this means setting the registry value of hostname for the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters to FQDN.
If a pod does not have FQDN, this has no effect.
Defaults to false.

osoptional

Specifies the OS of the containers in the pod.
Some pod and container fields are restricted if this is set.

If the OS field is set to linux, the following fields must be unset:
- securityContext.windowsOptions

If the OS field is set to windows, the following fields must be unset:
- spec.hostPID
- spec.hostIPC
- spec.hostUsers
- spec.securityContext.appArmorProfile
- spec.securityContext.seLinuxOptions
- spec.securityContext.seccompProfile
- spec.securityContext.fsGroup
- spec.securityContext.fsGroupChangePolicy
- spec.securityContext.sysctls
- spec.shareProcessNamespace
- spec.securityContext.runAsUser
- spec.securityContext.runAsGroup
- spec.securityContext.supplementalGroups
- spec.securityContext.supplementalGroupsPolicy
- spec.containers[*].securityContext.appArmorProfile
- spec.containers[*].securityContext.seLinuxOptions
- spec.containers[*].securityContext.seccompProfile
- spec.containers[*].securityContext.capabilities
- spec.containers[*].securityContext.readOnlyRootFilesystem
- spec.containers[*].securityContext.privileged
- spec.containers[*].securityContext.allowPrivilegeEscalation
- spec.containers[*].securityContext.procMount
- spec.containers[*].securityContext.runAsUser
- spec.containers[*].securityContext.runAsGroup

hostUsersoptional

boolean

Use the host's user namespace.
Optional: Default to true.
If set to true or not present, the pod will be run in the host user namespace, useful
for when the pod needs a feature only available to the host user namespace, such as
loading a kernel module with CAP_SYS_MODULE.
When set to false, a new userns is created for the pod. Setting false is useful for
mitigating container breakout vulnerabilities, even allowing users to run their
containers as root without actually having root privileges on the host.
This field is alpha-level and is only honored by servers that enable the UserNamespacesSupport feature.

schedulingGatesoptional

SchedulingGates is an opaque list of values that if specified will block scheduling the pod.
If schedulingGates is not empty, the pod will stay in the SchedulingGated state and the
scheduler will not attempt to schedule the pod.

SchedulingGates can be set only at pod creation time and can only be removed afterwards.

resourceClaimsoptional

ResourceClaims defines which ResourceClaims must be allocated
and reserved before the Pod is allowed to start. The resources
will be made available to those containers which consume them
by name.

This is an alpha field and requires enabling the
DynamicResourceAllocation feature gate.

This field is immutable.

resourcesoptional

Resources is the total amount of CPU and Memory resources required by all
containers in the pod. It supports specifying Requests and Limits for
"cpu" and "memory" resource names only. ResourceClaims are not supported.

This field enables fine-grained control over resource allocation for the
entire pod, allowing resource sharing among containers in a pod.

This is an alpha field and requires enabling the PodLevelResources feature
gate.

pipelineParallelSizeoptional

integer

PipelineParallelSize defines the number of parallel workers.
It also represents the number of replicas in the worker set, where each worker set serves as a scaling unit.

tensorParallelSizeoptional

integer

TensorParallelSize specifies the number of GPUs to be used per node.
It indicates the degree of parallelism for tensor computations across the available GPUs.
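Reading the two descriptions together, a rough GPU budget for one worker set can be estimated as the number of parallel workers times the GPUs used per node. This sketch assumes one worker per node; whether a head node is counted among the workers is not stated here, so treat the result as an estimate. The function name `total_gpus` is hypothetical.

```python
def total_gpus(pipeline_parallel_size: int, tensor_parallel_size: int) -> int:
    """Estimate GPUs for one worker set: workers (assumed one per node) x GPUs per node."""
    if pipeline_parallel_size < 1 or tensor_parallel_size < 1:
        raise ValueError("both sizes must be at least 1")
    return pipeline_parallel_size * tensor_parallel_size


# Two parallel workers with four GPUs each.
gpus = total_gpus(pipeline_parallel_size=2, tensor_parallel_size=4)
```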

storageUrioptional

string

This field points to the location of the trained model which is mounted onto the pod.

runtimeVersionoptional

string

Runtime version of the predictor docker image

protocolVersionoptional

Protocol version used by the predictor (i.e. v1, v2, grpc-v1, or grpc-v2).

storageoptional