Storage Containers
KServe downloads models using a storage initializer, which runs as an init container. For example, this is the default storage initializer implementation. KServe 0.11 introduced the ClusterStorageContainer CRD, which allows users to specify a custom container spec for a list of supported URI formats.
A ClusterStorageContainer defines the container spec for one or more storage URI formats. Here is an example of a ClusterStorageContainer that corresponds to the default storage initializer. Note that this is included in the Helm chart.
apiVersion: "serving.kserve.io/v1alpha1"
kind: ClusterStorageContainer
metadata:
  name: default
spec:
  container:
    name: storage-initializer
    image: kserve/storage-initializer:latest
    resources:
      requests:
        memory: 100Mi
        cpu: 100m
      limits:
        memory: 1Gi
        cpu: "1"
  supportedUriFormats:
    - prefix: gs://
    - prefix: s3://
    - prefix: hdfs://
    - prefix: webhdfs://
    - regex: "https://(.+?).blob.core.windows.net/(.+)"
    - regex: "https://(.+?).file.core.windows.net/(.+)"
    - regex: "https?://(.+)/(.+)"
    - regex: "hf://"
In a ClusterStorageContainer spec, you can specify credentials for cloud storage, container resource requests and limits, and the list of URI formats that the image supports. KServe can match a storage URI against each format by either prefix or regex.
The spec.container.name field must be storage-initializer; otherwise KServe cannot recognize the init container, which can cause duplicate value errors.
If a storage URI is supported by two or more ClusterStorageContainer CRs, there is no guarantee which one will be used. Please make sure that the URI format is only supported by one ClusterStorageContainer CR.
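The matching rule and the overlap warning above can be illustrated with a small Python sketch. This is not KServe's implementation: the CONTAINERS dictionary is a hypothetical in-memory copy of a few formats from this page, and anchoring the regex at the start of the URI is an assumption made for the illustration.

```python
import re

# Hypothetical in-memory copy of supportedUriFormats from two CRs on this page.
CONTAINERS = {
    "default": [
        {"prefix": "gs://"},
        {"prefix": "s3://"},
        {"regex": "https?://(.+)/(.+)"},
    ],
    "abc": [
        {"prefix": "model-registry://"},
    ],
}


def format_matches(fmt, storage_uri):
    """Return True if a single supportedUriFormats entry matches the URI."""
    if "prefix" in fmt:
        return storage_uri.startswith(fmt["prefix"])
    if "regex" in fmt:
        # Assumption: the pattern is anchored at the start of the URI.
        return re.match(fmt["regex"], storage_uri) is not None
    return False


def matching_containers(storage_uri, containers=CONTAINERS):
    """List the names of every container with at least one matching format."""
    return [
        name
        for name, formats in containers.items()
        if any(format_matches(fmt, storage_uri) for fmt in formats)
    ]
```

If matching_containers returns more than one name for a URI you plan to use, the formats overlap, which is the situation the warning above asks you to avoid.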
Custom Protocol Example
If you would like to use a custom protocol, for example model-registry://, you can create a custom image and add a new ClusterStorageContainer CR to make it available to KServe.
Create the Custom Storage Initializer Image
The first step is to create a custom container image that will be injected into the KServe deployment as an init container and that will be in charge of downloading the model.
The only requirement is that the Entrypoint of this container image takes (and properly handles) two positional arguments:
- Source URI: identifies the storageUri set in the InferenceService
- Destination Path: the location where the model should be stored, e.g., /mnt/models
The KServe controller takes care of injecting your container image and invoking it with these arguments.
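As a rough illustration of this two-argument contract, here is a minimal Python sketch of an entrypoint. The function names and the SOURCE_URI.txt marker file are hypothetical, and download_model is only a placeholder; a real initializer would fetch the model artifacts from the location that the source URI resolves to.

```python
from pathlib import Path


def parse_args(argv):
    """Extract (source_uri, destination_path) from the positional arguments."""
    if len(argv) != 3:
        raise SystemExit(f"usage: {argv[0]} <source-uri> <destination-path>")
    return argv[1], Path(argv[2])


def download_model(source_uri, destination):
    """Hypothetical placeholder: record the URI instead of fetching a model."""
    destination.mkdir(parents=True, exist_ok=True)
    marker = destination / "SOURCE_URI.txt"
    marker.write_text(source_uri + "\n")
    return marker


def main(argv):
    """Run the initializer; argv[1] is the source URI, argv[2] the destination."""
    source_uri, destination = parse_args(argv)
    download_model(source_uri, destination)
    return 0
```

An image built around this sketch would end the script with a call such as sys.exit(main(sys.argv)), so that a missing or extra argument makes the init container fail with a usage message.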
A more concrete example can be found in the Kubeflow model registry, where the storage initializer queries an existing model registry service to retrieve the original location of the model that the user requested to deploy.
Create the ClusterStorageContainer CR
Once the custom storage initializer image is ready, create a new ClusterStorageContainer CR to make it available in the cluster. You need to provide two essential pieces of information:
- The container spec definition, which depends entirely on your custom storage initializer image.
- The supported URI formats for which your custom storage initializer should be injected, in this case just model-registry://.
apiVersion: "serving.kserve.io/v1alpha1"
kind: ClusterStorageContainer
metadata:
  name: abc
spec:
  container:
    name: storage-initializer
    image: kubeflow/model-registry-storage-initializer:latest
    env:
      - name: MODEL_REGISTRY_BASE_URL
        value: "$MODEL_REGISTRY_SERVICE.model-registry.svc.cluster.local:$MODEL_REGISTRY_REST_PORT"
      - name: MODEL_REGISTRY_SCHEME
        value: "http"
    resources:
      requests:
        memory: 100Mi
        cpu: 100m
      limits:
        memory: 1Gi
        cpu: "1"
  supportedUriFormats:
    - prefix: model-registry://
Deploy the Model with InferenceService
Create the InferenceService with the model-registry specific URI format.
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "iris-model"
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "model-registry://iris/v1"
The only assumption here is that the ML model you are going to deploy has already been registered in the Model Registry. More information can be found in the kubeflow/model-registry repository.
In this specific example, model-registry://iris/v1 refers to a registered model pointing to gs://kfserving-examples/models/sklearn/1.0/model. The crucial point is that this location needs to be provided only once, during registration, whereas for every deployment you just need to provide the model-registry specific URI that identifies the model (in this case model-registry://${MODEL_NAME}/${MODEL_VERSION}).
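To make the URI format concrete, the following Python sketch splits a model-registry URI into its model name and version. The function name is hypothetical and the parsing only covers the name/version form shown on this page; the actual Kubeflow initializer may accept other forms, and the lookup of the registered location is not shown.

```python
PREFIX = "model-registry://"


def parse_model_registry_uri(storage_uri):
    """Split model-registry://<name>/<version> into (name, version)."""
    if not storage_uri.startswith(PREFIX):
        raise ValueError(f"unsupported storage URI: {storage_uri!r}")
    # Everything before the first "/" is the model name, the rest the version.
    name, _, version = storage_uri[len(PREFIX):].partition("/")
    if not name or not version:
        raise ValueError(
            f"expected model-registry://<name>/<version>, got {storage_uri!r}"
        )
    return name, version
```

With the URI from the InferenceService above, the sketch yields the name iris and the version v1, which a registry client could then use to look up the stored gs:// location.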
Providing Credentials
If you need to provide secrets for cloud storage, you can specify them in the ClusterStorageContainer spec. For example, if you are using Hugging Face Hub, you can provide the necessary credentials as follows:
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterStorageContainer
metadata:
  name: hf-hub
spec:
  container:
    name: storage-initializer
    image: 'kserve/storage-initializer:latest'
    env:
      - name: HF_TOKEN
        valueFrom:
          secretKeyRef:
            name: hf-secret
            key: HF_TOKEN
            optional: false
    resources:
      requests:
        memory: 2Gi
        cpu: '1'
      limits:
        memory: 4Gi
        cpu: '1'
  supportedUriFormats:
    - prefix: 'hf://'
The respective secret should be created in the same namespace as the InferenceService that is using this ClusterStorageContainer:
kubectl create secret generic hf-secret --from-literal=HF_TOKEN=<your_huggingface_token>
Spec Attributes
Spec attributes are documented in the API Reference.