Control Plane API

Packages:

serving.kserve.io/v1beta1

Package v1beta1 contains API Schema definitions for the serving v1beta1 API group

Resource Types:

AIXExplainerSpec

(Appears on:ExplainerSpec)

AIXExplainerSpec defines the arguments for configuring an AIX Explanation Server

Field	Description
`type` AIXExplainerType	The type of AIX explainer
`ExplainerExtensionSpec` ExplainerExtensionSpec	(Members of `ExplainerExtensionSpec` are embedded into this type.) Contains fields shared across all explainers

AIXExplainerType (`string` alias)

(Appears on:AIXExplainerSpec)

Value	Description
"LimeImages"

ARTExplainerSpec

(Appears on:ExplainerSpec)

ARTExplainerSpec defines the arguments for configuring an ART Explanation Server

Field	Description
`type` ARTExplainerType	The type of ART explainer
`ExplainerExtensionSpec` ExplainerExtensionSpec	(Members of `ExplainerExtensionSpec` are embedded into this type.) Contains fields shared across all explainers

ARTExplainerType (`string` alias)

(Appears on:ARTExplainerSpec)

Value	Description
"SquareAttack"

AlibiExplainerSpec

(Appears on:ExplainerSpec)

AlibiExplainerSpec defines the arguments for configuring an Alibi Explanation Server

Field	Description
`type` AlibiExplainerType	The type of Alibi explainer. Valid values are: - “AnchorTabular”; - “AnchorImages”; - “AnchorText”; - “Counterfactuals”; - “Contrastive”
`ExplainerExtensionSpec` ExplainerExtensionSpec	(Members of `ExplainerExtensionSpec` are embedded into this type.) Contains fields shared across all explainers

AlibiExplainerType (`string` alias)

(Appears on:AlibiExplainerSpec)

AlibiExplainerType is the explanation method

Value	Description
"AnchorImages"
"AnchorTabular"
"AnchorText"
"Contrastive"
"Counterfactuals"

Batcher

(Appears on:ComponentExtensionSpec)

Batcher specifies optional payload batching available for all components

Field	Description
`maxBatchSize` int	(Optional) Specifies the max number of requests to trigger a batch
`maxLatency` int	(Optional) Specifies the max latency to trigger a batch
`timeout` int	(Optional) Specifies the timeout of a batch
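
The two trigger fields can be read as "dispatch the batch when either limit is hit". The following Go sketch illustrates that reading only. The local `Batcher` struct, the `shouldFlush` helper, and the assumption that `maxLatency` is expressed in milliseconds are all hypothetical and are not taken from the KServe source.

```go
package main

import "fmt"

// Batcher mirrors the three documented fields. Pointers distinguish
// "unset" from zero; that choice is an assumption of this sketch.
type Batcher struct {
	MaxBatchSize *int `json:"maxBatchSize,omitempty"`
	MaxLatency   *int `json:"maxLatency,omitempty"`
	Timeout      *int `json:"timeout,omitempty"`
}

// shouldFlush reports whether a pending batch should be dispatched because
// it reached maxBatchSize or has waited at least maxLatency.
// waitedMillis assumes maxLatency is in milliseconds (not stated in the table).
func shouldFlush(b Batcher, pending, waitedMillis int) bool {
	if b.MaxBatchSize != nil && pending >= *b.MaxBatchSize {
		return true
	}
	if b.MaxLatency != nil && waitedMillis >= *b.MaxLatency {
		return true
	}
	return false
}

func main() {
	size, latency := 32, 500
	b := Batcher{MaxBatchSize: &size, MaxLatency: &latency}
	fmt.Println(shouldFlush(b, 32, 10)) // true: size limit reached
	fmt.Println(shouldFlush(b, 3, 600)) // true: latency limit reached
	fmt.Println(shouldFlush(b, 3, 10))  // false: neither limit reached
}
```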

Component

Component interface is implemented by all specs that contain component implementations, e.g. PredictorSpec, ExplainerSpec, TransformerSpec.

ComponentExtensionSpec

(Appears on:ExplainerSpec, PredictorSpec, TransformerSpec)

ComponentExtensionSpec defines the deployment configuration for a given InferenceService component

Field	Description
`minReplicas` int	(Optional) Minimum number of replicas, defaults to 1 but can be set to 0 to enable scale-to-zero.
`maxReplicas` int	(Optional) Maximum number of replicas for autoscaling.
`scaleTarget` int	(Optional) ScaleTarget specifies the integer target value of the metric type the autoscaler watches for. The concurrency and rps targets are supported by the Knative Pod Autoscaler (https://knative.dev/docs/serving/autoscaling/autoscaling-targets/).
`scaleMetric` ScaleMetric	(Optional) ScaleMetric defines the scaling metric type watched by the autoscaler. Possible values are concurrency, rps, cpu, and memory; concurrency and rps are supported via the Knative Pod Autoscaler (https://knative.dev/docs/serving/autoscaling/autoscaling-metrics).
`containerConcurrency` int64	(Optional) ContainerConcurrency specifies how many requests can be processed concurrently. This sets the hard limit of the container concurrency (https://knative.dev/docs/serving/autoscaling/concurrency).
`timeout` int64	(Optional) TimeoutSeconds specifies the number of seconds to wait before timing out a request to the component.
`canaryTrafficPercent` int64	(Optional) CanaryTrafficPercent defines the traffic split percentage between the candidate revision and the last ready revision
`logger` LoggerSpec	(Optional) Activate request/response logging and logger configurations
`batcher` Batcher	(Optional) Activate request batching and batching configurations
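
As a rough illustration of how the replica and metric fields interact, the Go sketch below applies the documented default of 1 for `minReplicas` and checks `scaleMetric` against the four documented values. The `scaling` struct and both helpers are hypothetical. Two rules are assumptions of this sketch rather than statements from the table: a `maxReplicas` of 0 is treated as "no upper bound", and a non-zero `maxReplicas` must not be below `minReplicas`.

```go
package main

import (
	"errors"
	"fmt"
)

// scaling holds a subset of the documented ComponentExtensionSpec fields.
type scaling struct {
	MinReplicas *int   // nil means "use the documented default of 1"
	MaxReplicas int    // 0 is treated as "no upper bound" in this sketch
	ScaleMetric string // "", "concurrency", "rps", "cpu" or "memory"
}

// effectiveMinReplicas applies the documented default of 1.
func effectiveMinReplicas(s scaling) int {
	if s.MinReplicas == nil {
		return 1
	}
	return *s.MinReplicas
}

// validateScaling checks the replica bounds and the metric name.
func validateScaling(s scaling) error {
	minR := effectiveMinReplicas(s)
	if minR < 0 {
		return errors.New("minReplicas must not be negative")
	}
	if s.MaxReplicas != 0 && s.MaxReplicas < minR {
		return errors.New("maxReplicas must not be less than minReplicas")
	}
	switch s.ScaleMetric {
	case "", "concurrency", "rps", "cpu", "memory":
		return nil
	}
	return fmt.Errorf("unsupported scaleMetric %q", s.ScaleMetric)
}

func main() {
	zero := 0
	fmt.Println(effectiveMinReplicas(scaling{}))                              // 1
	fmt.Println(validateScaling(scaling{MinReplicas: &zero, MaxReplicas: 3})) // <nil>: scale-to-zero
	fmt.Println(validateScaling(scaling{ScaleMetric: "latency"}))             // unsupported metric
}
```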

ComponentImplementation

ComponentImplementation interface is implemented by predictor, transformer, and explainer implementations

ComponentStatusSpec

(Appears on:InferenceServiceStatus)

ComponentStatusSpec describes the state of the component

Field	Description
`latestReadyRevision` string	(Optional) Latest revision name that is in ready state
`latestCreatedRevision` string	(Optional) Latest revision name that is created
`previousRolledoutRevision` string	(Optional) Previous revision name that is rolled out with 100 percent traffic
`latestRolledoutRevision` string	(Optional) Latest revision name that is rolled out with 100 percent traffic
`traffic` []knative.dev/serving/pkg/apis/serving/v1.TrafficTarget	(Optional) Traffic holds the configured traffic distribution for latest ready revision and previous rolled out revision.
`url` knative.dev/pkg/apis.URL	(Optional) URL holds the primary url that will distribute traffic over the provided traffic targets. This will be one of the REST or gRPC endpoints that are available. It generally has the form http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}
`restUrl` knative.dev/pkg/apis.URL	(Optional) REST endpoint of the component if available.
`grpcUrl` knative.dev/pkg/apis.URL	(Optional) gRPC endpoint of the component if available.
`address` knative.dev/pkg/apis/duck/v1.Addressable	(Optional) Addressable endpoint for the InferenceService

ComponentType (`string` alias)

ComponentType contains the different types of components of the service

Value	Description
"explainer"
"predictor"
"transformer"

CustomExplainer

CustomExplainer defines arguments for configuring a custom explainer.

Field	Description
`PodSpec` Kubernetes core/v1.PodSpec	(Members of `PodSpec` are embedded into this type.)

CustomPredictor

CustomPredictor defines arguments for configuring a custom server.

Field	Description
`PodSpec` Kubernetes core/v1.PodSpec	(Members of `PodSpec` are embedded into this type.)

CustomTransformer

CustomTransformer defines arguments for configuring a custom transformer.

Field	Description
`PodSpec` Kubernetes core/v1.PodSpec	(Members of `PodSpec` are embedded into this type.)

DeployConfig

Field	Description
`defaultDeploymentMode` string

ExplainerConfig

(Appears on:ExplainersConfig)

Field	Description
`image` string	explainer docker image name
`defaultImageVersion` string	default explainer docker image version

ExplainerExtensionSpec

(Appears on:AIXExplainerSpec, ARTExplainerSpec, AlibiExplainerSpec)

ExplainerExtensionSpec defines configuration shared across all explainer frameworks

Field	Description
`storageUri` string	The location of a trained explanation model
`runtimeVersion` string	Defaults to latest Explainer Version
`config` map[string]string	Inline custom parameter settings for explainer
`Container` Kubernetes core/v1.Container	(Members of `Container` are embedded into this type.) (Optional) Container enables overrides for the explainer. Each framework will have different defaults that are populated in the underlying container spec.
`storage` StorageSpec	(Optional) Storage Spec for model location

ExplainerSpec

(Appears on:InferenceServiceSpec)

ExplainerSpec defines the container spec for a model explanation server. The following fields follow a “1-of” semantic. Users must specify exactly one spec.

Field	Description
`alibi` AlibiExplainerSpec	Spec for alibi explainer
`aix` AIXExplainerSpec	Spec for AIX explainer
`art` ARTExplainerSpec	Spec for ART explainer
`PodSpec` PodSpec	(Members of `PodSpec` are embedded into this type.) This spec is dual purpose. 1) Users may choose to provide a full PodSpec for their custom explainer. The field PodSpec.Containers is mutually exclusive with other explainers (e.g. Alibi). 2) Users may choose to provide an explainer (e.g. Alibi) and specify PodSpec overrides in the PodSpec. They must not provide PodSpec.Containers in this case.
`ComponentExtensionSpec` ComponentExtensionSpec	(Members of `ComponentExtensionSpec` are embedded into this type.) Component extension defines the deployment configurations for explainer

ExplainersConfig

(Appears on:InferenceServicesConfig)

Field	Description
`alibi` ExplainerConfig
`aix` ExplainerConfig
`art` ExplainerConfig

FailureInfo

(Appears on:ModelStatus)

Field	Description
`location` string	(Optional) Name of component to which the failure relates (usually Pod name)
`reason` FailureReason	(Optional) High level class of failure
`message` string	(Optional) Detailed error message
`modelRevisionName` string	(Optional) Internal Revision/ID of model, tied to specific Spec contents
`time` Kubernetes meta/v1.Time	(Optional) Time failure occurred or was discovered

FailureReason (`string` alias)

(Appears on:FailureInfo)

FailureReason enum

Value	Description
"InvalidPredictorSpec"	The current Predictor Spec is invalid or unsupported
"ModelLoadFailed"	The model failed to load within a ServingRuntime container
"NoSupportingRuntime"	There are no ServingRuntime which support the specified model type
"RuntimeDisabled"	The ServingRuntime is disabled
"RuntimeNotRecognized"	There is no ServingRuntime defined with the specified runtime name
"RuntimeUnhealthy"	Corresponding ServingRuntime containers failed to start or are unhealthy

InferenceService

InferenceService is the Schema for the InferenceServices API

Field	Description
`metadata` Kubernetes meta/v1.ObjectMeta	Refer to the Kubernetes API documentation for the fields of the `metadata` field.
`spec` InferenceServiceSpec	Contains the `predictor`, `explainer`, and `transformer` fields described under InferenceServiceSpec.
`status` InferenceServiceStatus

InferenceServiceSpec

(Appears on:InferenceService)

InferenceServiceSpec is the top level type for this resource

Field	Description
`predictor` PredictorSpec	Predictor defines the model serving spec
`explainer` ExplainerSpec	(Optional) Explainer defines the model explanation service spec. The explainer service calls the predictor, or the transformer if one is specified.
`transformer` TransformerSpec	(Optional) Transformer defines the pre/post processing before and after the predictor call. The transformer service calls the predictor service.

InferenceServiceStatus

(Appears on:InferenceService)

InferenceServiceStatus defines the observed state of InferenceService

Field	Description
`Status` knative.dev/pkg/apis/duck/v1.Status	(Members of `Status` are embedded into this type.) Conditions for the InferenceService - PredictorReady: predictor readiness condition; - TransformerReady: transformer readiness condition; - ExplainerReady: explainer readiness condition; - RoutesReady: aggregated routing condition; - Ready: aggregated condition;
`address` knative.dev/pkg/apis/duck/v1.Addressable	(Optional) Addressable endpoint for the InferenceService
`url` knative.dev/pkg/apis.URL	(Optional) URL holds the url that will distribute traffic over the provided traffic targets. It generally has the form http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}
`components` map[kserve.io/v1beta1/pkg/apis/serving/v1beta1.ComponentType]kserve.io/v1beta1/pkg/apis/serving/v1beta1.ComponentStatusSpec	Statuses for the components of the InferenceService
`modelStatus` ModelStatus	Model related statuses
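
The general URL form quoted in the `url` row can be made concrete with a small Go sketch. The `componentURL` helper and every value passed to it are hypothetical; actual URLs depend on the cluster's ingress and domain configuration.

```go
package main

import "fmt"

// componentURL fills in the documented pattern
// http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}.
func componentURL(scheme, routeName, routeNamespace, clusterSuffix string) string {
	return fmt.Sprintf("%s://%s.%s.%s", scheme, routeName, routeNamespace, clusterSuffix)
}

func main() {
	// All four values are hypothetical.
	fmt.Println(componentURL("http", "sklearn-iris", "default", "example.com"))
	// prints "http://sklearn-iris.default.example.com"
}
```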

InferenceServicesConfig

Field	Description
`transformers` TransformersConfig	Transformer configurations
`predictors` PredictorsConfig	Predictor configurations
`explainers` ExplainersConfig	Explainer configurations

IngressConfig

Field	Description
`ingressGateway` string
`ingressService` string
`localGateway` string
`localGatewayService` string
`ingressDomain` string
`ingressClassName` string
`domainTemplate` string
`urlScheme` string

LightGBMSpec

(Appears on:PredictorSpec)

LightGBMSpec defines arguments for configuring LightGBM model serving.

Field	Description
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.) Contains fields shared across all predictors

LoggerSpec

(Appears on:ComponentExtensionSpec)

LoggerSpec specifies optional payload logging available for all components

Field	Description
`url` string	(Optional) URL to send logging events
`mode` LoggerType	(Optional) Specifies the scope of the loggers. Valid values are: - “all” (default): log both request and response; - “request”: log only request; - “response”: log only response
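
The three `mode` values and the documented default can be summarized in a few lines of Go. This is a minimal sketch: the `logScope` helper is hypothetical, and rejecting unknown modes with an error is an assumption rather than documented behavior.

```go
package main

import "fmt"

// logScope returns whether requests and responses are logged for a mode.
// An empty mode falls back to "all", the documented default.
func logScope(mode string) (request, response bool, err error) {
	switch mode {
	case "", "all":
		return true, true, nil
	case "request":
		return true, false, nil
	case "response":
		return false, true, nil
	}
	// Rejecting other values is an assumption of this sketch.
	return false, false, fmt.Errorf("unknown logger mode %q", mode)
}

func main() {
	for _, mode := range []string{"", "request", "response", "debug"} {
		req, resp, err := logScope(mode)
		fmt.Printf("mode=%q request=%v response=%v err=%v\n", mode, req, resp, err)
	}
}
```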

LoggerType (`string` alias)

(Appears on:LoggerSpec)

LoggerType controls the scope of log publishing

Value	Description
"all"	Logger mode to log both request and response
"request"	Logger mode to log only request
"response"	Logger mode to log only response

ModelCopies

(Appears on:ModelStatus)

Field	Description
`failedCopies` int	How many copies of this predictor’s models failed to load recently
`totalCopies` int	(Optional) Total number of copies of this predictor’s models that are currently loaded

ModelFormat

(Appears on:ModelSpec)

Field	Description
`name` string	Name of the model format.
`version` string	(Optional) Version of the model format. Used in validating that a predictor is supported by a runtime. Can be “major”, “major.minor” or “major.minor.patch”.
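
The three accepted shapes of `version` can be checked with a short pattern. The Go sketch below is illustrative only: `validVersion` is a hypothetical helper, and it assumes every component is a plain decimal number, which the table does not state.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionPattern accepts "major", "major.minor" or "major.minor.patch".
// Restricting each component to digits is an assumption of this sketch.
var versionPattern = regexp.MustCompile(`^\d+(\.\d+){0,2}$`)

// validVersion reports whether v has one of the three documented shapes.
func validVersion(v string) bool {
	return versionPattern.MatchString(v)
}

func main() {
	for _, v := range []string{"1", "1.15", "1.15.2", "1.15.2.1", "v1"} {
		fmt.Printf("%-8s %v\n", v, validVersion(v))
	}
}
```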

ModelRevisionStates

(Appears on:ModelStatus)

Field	Description
`activeModelState` ModelState	High level state string: Pending, Standby, Loading, Loaded, FailedToLoad
`targetModelState` ModelState

ModelSpec

(Appears on:PredictorSpec)

Field	Description
`modelFormat` ModelFormat	ModelFormat being served.
`runtime` string	(Optional) Specific ClusterServingRuntime/ServingRuntime name to use for deployment.
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.)

ModelState (`string` alias)

(Appears on:ModelRevisionStates)

ModelState enum

Value	Description
"FailedToLoad"	All copies of the model failed to load
"Loaded"	At least one copy of the model is loaded
"Loading"	Model is loading
"Pending"	Model is not yet registered
"Standby"	Model is available but not loaded (will load when used)

ModelStatus

(Appears on:InferenceServiceStatus)

Field	Description
`transitionStatus` TransitionStatus	Whether the available predictor endpoints reflect the current Spec or are in transition
`states` ModelRevisionStates	(Optional) State information of the predictor’s model.
`lastFailureInfo` FailureInfo	(Optional) Details of the last failure, when loading of the target model has failed or is blocked.
`copies` ModelCopies	(Optional) Model copy information of the predictor’s model.

ONNXRuntimeSpec

(Appears on:PredictorSpec)

ONNXRuntimeSpec defines arguments for configuring ONNX model serving.

Field	Description
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.) Contains fields shared across all predictors

PMMLSpec

(Appears on:PredictorSpec)

PMMLSpec defines arguments for configuring PMML model serving.

Field	Description
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.) Contains fields shared across all predictors

PaddleServerSpec

(Appears on:PredictorSpec)

Field	Description
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.)

PodSpec

(Appears on:ExplainerSpec, PredictorSpec, TransformerSpec)

PodSpec is a description of a pod.

Field	Description
`volumes` []Kubernetes core/v1.Volume	(Optional) List of volumes that can be mounted by containers belonging to the pod. More info: https://kubernetes.io/docs/concepts/storage/volumes
`initContainers` []Kubernetes core/v1.Container	List of initialization containers belonging to the pod. Init containers are executed in order prior to containers being started. If any init container fails, the pod is considered to have failed and is handled according to its restartPolicy. The name for an init container or normal container must be unique among all containers. Init containers may not have Lifecycle actions, Readiness probes, Liveness probes, or Startup probes. The resourceRequirements of an init container are taken into account during scheduling by finding the highest request/limit for each resource type, and then using the max of that value or the sum of the normal containers. Limits are applied to init containers in a similar fashion. Init containers cannot currently be added or removed. Cannot be updated. More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
`containers` []Kubernetes core/v1.Container	List of containers belonging to the pod. Containers cannot currently be added or removed. There must be at least one container in a Pod. Cannot be updated.
`ephemeralContainers` []Kubernetes core/v1.EphemeralContainer	(Optional) List of ephemeral containers run in this pod. Ephemeral containers may be run in an existing pod to perform user-initiated actions such as debugging. This list cannot be specified when creating a pod, and it cannot be modified by updating the pod spec. In order to add an ephemeral container to an existing pod, use the pod’s ephemeralcontainers subresource. This field is alpha-level and is only honored by servers that enable the EphemeralContainers feature.
`restartPolicy` Kubernetes core/v1.RestartPolicy	(Optional) Restart policy for all containers within the pod. One of Always, OnFailure, Never. Default to Always. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy
`terminationGracePeriodSeconds` int64	(Optional) Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request. Value must be non-negative integer. The value zero indicates delete immediately. If this value is nil, the default grace period will be used instead. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. Defaults to 30 seconds.
`activeDeadlineSeconds` int64	(Optional) Optional duration in seconds the pod may be active on the node relative to StartTime before the system will actively try to mark it failed and kill associated containers. Value must be a positive integer.
`dnsPolicy` Kubernetes core/v1.DNSPolicy	(Optional) Set DNS policy for the pod. Defaults to “ClusterFirst”. Valid values are ‘ClusterFirstWithHostNet’, ‘ClusterFirst’, ‘Default’ or ‘None’. DNS parameters given in DNSConfig will be merged with the policy selected with DNSPolicy. To have DNS options set along with hostNetwork, you have to specify DNS policy explicitly to ‘ClusterFirstWithHostNet’.
`nodeSelector` map[string]string	(Optional) NodeSelector is a selector which must be true for the pod to fit on a node. Selector which must match a node’s labels for the pod to be scheduled on that node. More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
`serviceAccountName` string	(Optional) ServiceAccountName is the name of the ServiceAccount to use to run this pod. More info: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
`serviceAccount` string	(Optional) DeprecatedServiceAccount is a deprecated alias for ServiceAccountName. Deprecated: Use serviceAccountName instead.
`automountServiceAccountToken` bool	(Optional) AutomountServiceAccountToken indicates whether a service account token should be automatically mounted.
`nodeName` string	(Optional) NodeName is a request to schedule this pod onto a specific node. If it is non-empty, the scheduler simply schedules this pod onto that node, assuming that it fits resource requirements.
`hostNetwork` bool	(Optional) Host networking requested for this pod. Use the host’s network namespace. If this option is set, the ports that will be used must be specified. Default to false.
`hostPID` bool	(Optional) Use the host’s pid namespace. Optional: Default to false.
`hostIPC` bool	(Optional) Use the host’s ipc namespace. Optional: Default to false.
`shareProcessNamespace` bool	(Optional) Share a single process namespace between all of the containers in a pod. When this is set containers will be able to view and signal processes from other containers in the same pod, and the first process in each container will not be assigned PID 1. HostPID and ShareProcessNamespace cannot both be set. Optional: Default to false.
`securityContext` Kubernetes core/v1.PodSecurityContext	(Optional) SecurityContext holds pod-level security attributes and common container settings. Optional: Defaults to empty. See type description for default values of each field.
`imagePullSecrets` []Kubernetes core/v1.LocalObjectReference	(Optional) ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec. If specified, these secrets will be passed to individual puller implementations for them to use. For example, in the case of docker, only DockerConfig type secrets are honored. More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod
`hostname` string	(Optional) Specifies the hostname of the Pod. If not specified, the pod’s hostname will be set to a system-defined value.
`subdomain` string	(Optional) If specified, the fully qualified Pod hostname will be “...svc.”. If not specified, the pod will not have a domainname at all.
`affinity` Kubernetes core/v1.Affinity	(Optional) If specified, the pod’s scheduling constraints
`schedulerName` string	(Optional) If specified, the pod will be dispatched by specified scheduler. If not specified, the pod will be dispatched by default scheduler.
`tolerations` []Kubernetes core/v1.Toleration	(Optional) If specified, the pod’s tolerations.
`hostAliases` []Kubernetes core/v1.HostAlias	(Optional) HostAliases is an optional list of hosts and IPs that will be injected into the pod’s hosts file if specified. This is only valid for non-hostNetwork pods.
`priorityClassName` string	(Optional) If specified, indicates the pod’s priority. “system-node-critical” and “system-cluster-critical” are two special keywords which indicate the highest priorities with the former being the highest priority. Any other name must be defined by creating a PriorityClass object with that name. If not specified, the pod priority will be default or zero if there is no default.
`priority` int32	(Optional) The priority value. Various system components use this field to find the priority of the pod. When Priority Admission Controller is enabled, it prevents users from setting this field. The admission controller populates this field from PriorityClassName. The higher the value, the higher the priority.
`dnsConfig` Kubernetes core/v1.PodDNSConfig	(Optional) Specifies the DNS parameters of a pod. Parameters specified here will be merged to the generated DNS configuration based on DNSPolicy.
`readinessGates` []Kubernetes core/v1.PodReadinessGate	(Optional) If specified, all readiness gates will be evaluated for pod readiness. A pod is ready when all its containers are ready AND all conditions specified in the readiness gates have status equal to “True” More info: https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/580-pod-readiness-gates
`runtimeClassName` string	(Optional) RuntimeClassName refers to a RuntimeClass object in the node.k8s.io group, which should be used to run this pod. If no RuntimeClass resource matches the named class, the pod will not be run. If unset or empty, the “legacy” RuntimeClass will be used, which is an implicit class with an empty definition that uses the default runtime handler. More info: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/585-runtime-class This is a beta feature as of Kubernetes v1.14.
`enableServiceLinks` bool	(Optional) EnableServiceLinks indicates whether information about services should be injected into pod’s environment variables, matching the syntax of Docker links. Optional: Defaults to true.
`preemptionPolicy` Kubernetes core/v1.PreemptionPolicy	(Optional) PreemptionPolicy is the Policy for preempting pods with lower priority. One of Never, PreemptLowerPriority. Defaults to PreemptLowerPriority if unset. This field is beta-level, gated by the NonPreemptingPriority feature-gate.
`overhead` Kubernetes core/v1.ResourceList	(Optional) Overhead represents the resource overhead associated with running a pod for a given RuntimeClass. This field will be autopopulated at admission time by the RuntimeClass admission controller. If the RuntimeClass admission controller is enabled, overhead must not be set in Pod create requests. The RuntimeClass admission controller will reject Pod create requests which have the overhead already set. If RuntimeClass is configured and selected in the PodSpec, Overhead will be set to the value defined in the corresponding RuntimeClass, otherwise it will remain unset and treated as zero. More info: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/688-pod-overhead This field is alpha-level as of Kubernetes v1.16, and is only honored by servers that enable the PodOverhead feature.
`topologySpreadConstraints` []Kubernetes core/v1.TopologySpreadConstraint	(Optional) TopologySpreadConstraints describes how a group of pods ought to spread across topology domains. Scheduler will schedule pods in a way which abides by the constraints. All topologySpreadConstraints are ANDed.
`setHostnameAsFQDN` bool	(Optional) If true the pod’s hostname will be configured as the pod’s FQDN, rather than the leaf name (the default). In Linux containers, this means setting the FQDN in the hostname field of the kernel (the nodename field of struct utsname). In Windows containers, this means setting the registry value of hostname for the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters to FQDN. If a pod does not have FQDN, this has no effect. Default to false.

PredictorConfig

(Appears on:PredictorProtocols, PredictorsConfig)

Field	Description
`image` string	predictor docker image name
`defaultImageVersion` string	default predictor docker image version on cpu
`defaultGpuImageVersion` string	default predictor docker image version on gpu
`defaultTimeout` int64	Default timeout of predictor for serving a request, in seconds
`multiModelServer` bool	Flag to determine if multi-model serving is supported
`supportedFrameworks` []string	frameworks the model agent is able to run

PredictorExtensionSpec

(Appears on:LightGBMSpec, ModelSpec, ONNXRuntimeSpec, PMMLSpec, PaddleServerSpec, SKLearnSpec, TFServingSpec, TorchServeSpec, TritonSpec, XGBoostSpec)

PredictorExtensionSpec defines configuration shared across all predictor frameworks

Field	Description
`storageUri` string	(Optional) This field points to the location of the trained model which is mounted onto the pod.
`runtimeVersion` string	(Optional) Runtime version of the predictor docker image
`protocolVersion` github.com/kserve/kserve/pkg/constants.InferenceServiceProtocol	(Optional) Protocol version to use by the predictor (i.e. v1 or v2 or grpc-v1 or grpc-v2)
`Container` Kubernetes core/v1.Container	(Members of `Container` are embedded into this type.) (Optional) Container enables overrides for the predictor. Each framework will have different defaults that are populated in the underlying container spec.
`storage` StorageSpec	(Optional) Storage Spec for model location

PredictorImplementation

PredictorImplementation defines common functions for all predictors, e.g. TensorFlow, Triton, etc.

PredictorProtocols

(Appears on:PredictorsConfig)

Field	Description
`v1` PredictorConfig
`v2` PredictorConfig

PredictorSpec

(Appears on:InferenceServiceSpec)

PredictorSpec defines the configuration for a predictor. The following fields follow a “1-of” semantic. Users must specify exactly one spec.

Field	Description
`sklearn` SKLearnSpec	Spec for SKLearn model server
`xgboost` XGBoostSpec	Spec for XGBoost model server
`tensorflow` TFServingSpec	Spec for TFServing (https://github.com/tensorflow/serving)
`pytorch` TorchServeSpec	Spec for TorchServe (https://pytorch.org/serve)
`triton` TritonSpec	Spec for Triton Inference Server (https://github.com/triton-inference-server/server)
`onnx` ONNXRuntimeSpec	Spec for ONNX runtime (https://github.com/microsoft/onnxruntime)
`pmml` PMMLSpec	Spec for PMML (http://dmg.org/pmml/v4-1/GeneralStructure.html)
`lightgbm` LightGBMSpec	Spec for LightGBM model server
`paddle` PaddleServerSpec	Spec for Paddle model server (https://github.com/PaddlePaddle/Serving)
`model` ModelSpec	Model spec for any arbitrary framework.
`PodSpec` PodSpec	(Members of `PodSpec` are embedded into this type.) This spec is dual purpose. 1) Provide a full PodSpec for a custom predictor. The field PodSpec.Containers is mutually exclusive with other predictors (e.g. TFServing). 2) Provide a predictor (e.g. TFServing) and specify PodSpec overrides; you must not provide PodSpec.Containers in this case.
`ComponentExtensionSpec` ComponentExtensionSpec	(Members of `ComponentExtensionSpec` are embedded into this type.) Component extension defines the deployment configurations for a predictor
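
The “1-of” semantic amounts to counting how many implementations are set and requiring that count to be exactly one. The Go sketch below shows the idea on a trimmed-down stand-in. The `predictor` and `frameworkSpec` types and both helpers are hypothetical, only three framework fields are modeled, and a non-empty `Containers` slice stands in for a custom predictor supplied through `PodSpec.Containers`.

```go
package main

import "fmt"

// frameworkSpec stands in for SKLearnSpec, XGBoostSpec, and so on.
type frameworkSpec struct{ StorageURI string }

// predictor is a trimmed-down stand-in for PredictorSpec.
type predictor struct {
	SKLearn    *frameworkSpec
	XGBoost    *frameworkSpec
	Tensorflow *frameworkSpec
	Containers []string // stands in for PodSpec.Containers (custom predictor)
}

// implementations counts how many predictor implementations are set.
func implementations(p predictor) int {
	n := 0
	for _, f := range []*frameworkSpec{p.SKLearn, p.XGBoost, p.Tensorflow} {
		if f != nil {
			n++
		}
	}
	if len(p.Containers) > 0 {
		n++
	}
	return n
}

// validatePredictor enforces the "exactly one" rule.
func validatePredictor(p predictor) error {
	if n := implementations(p); n != 1 {
		return fmt.Errorf("exactly one predictor must be specified, got %d", n)
	}
	return nil
}

func main() {
	sk := &frameworkSpec{StorageURI: "gs://example-bucket/model"} // hypothetical URI
	fmt.Println(validatePredictor(predictor{SKLearn: sk}))        // <nil>
	fmt.Println(validatePredictor(predictor{}))                   // got 0
	fmt.Println(validatePredictor(predictor{SKLearn: sk, Containers: []string{"custom"}}))
}
```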

PredictorsConfig

(Appears on:InferenceServicesConfig)

Field	Description
`tensorflow` PredictorConfig
`triton` PredictorConfig
`xgboost` PredictorProtocols
`sklearn` PredictorProtocols
`pytorch` PredictorConfig
`onnx` PredictorConfig
`pmml` PredictorConfig
`lightgbm` PredictorConfig
`paddle` PredictorConfig

SKLearnSpec

(Appears on:PredictorSpec)

SKLearnSpec defines arguments for configuring SKLearn model serving.

Field	Description
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.) Contains fields shared across all predictors

ScaleMetric (`string` alias)

(Appears on:ComponentExtensionSpec)

ScaleMetric enum

Value	Description
"cpu"
"concurrency"
"memory"
"rps"

StorageSpec

(Appears on:ExplainerExtensionSpec, PredictorExtensionSpec)

Field	Description
`path` string	(Optional) The path to the model object in the storage. It cannot co-exist with the storageURI.
`schemaPath` string	(Optional) The path to the model schema file in the storage.
`parameters` map[string]string	(Optional) Parameters to override the default storage credentials and config.
`key` string	(Optional) The Storage Key in the secret for this model.

TFServingSpec

(Appears on:PredictorSpec)

TFServingSpec defines arguments for configuring TensorFlow model serving.

Field	Description
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.) Contains fields shared across all predictors

TorchServeSpec

(Appears on:PredictorSpec)

TorchServeSpec defines arguments for configuring PyTorch model serving.

Field	Description
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.) Contains fields shared across all predictors

TransformerConfig

(Appears on:TransformersConfig)

Field	Description
`image` string	transformer docker image name
`defaultImageVersion` string	default transformer docker image version

TransformerSpec

(Appears on:InferenceServiceSpec)

TransformerSpec defines transformer service for pre/post processing

Field	Description
`PodSpec` PodSpec	(Members of `PodSpec` are embedded into this type.) This spec is dual purpose. 1) Provide a full PodSpec for a custom transformer. The field PodSpec.Containers is mutually exclusive with other transformers. 2) Provide a transformer and specify PodSpec overrides; you must not provide PodSpec.Containers in this case.
`ComponentExtensionSpec` ComponentExtensionSpec	(Members of `ComponentExtensionSpec` are embedded into this type.) Component extension defines the deployment configurations for a transformer

TransformersConfig

(Appears on:InferenceServicesConfig)

Field	Description
`feast` TransformerConfig

TransitionStatus (`string` alias)

(Appears on:ModelStatus)

TransitionStatus enum

Value	Description
"BlockedByFailedLoad"	Target model failed to load
"InProgress"	Waiting for target model to reach state of active model
"InvalidSpec"	Target predictor spec failed validation
"UpToDate"	Predictor is up-to-date (reflects current spec)

TritonSpec

(Appears on:PredictorSpec)

TritonSpec defines arguments for configuring Triton model serving.

Field	Description
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.) Contains fields shared across all predictors

XGBoostSpec

(Appears on:PredictorSpec)

XGBoostSpec defines arguments for configuring XGBoost model serving.

Field	Description
`PredictorExtensionSpec` PredictorExtensionSpec	(Members of `PredictorExtensionSpec` are embedded into this type.) Contains fields shared across all predictors

Generated with gen-crd-api-reference-docs on git commit 133ecebb.
