Multiple Storage URIs

This guide shows how to configure InferenceServices with multiple storage URIs, allowing you to fetch model artifacts from different storage backends and mount them at specific locations within the container.

Overview

The multiple storage URIs feature enables you to:

  • Fetch artifacts from multiple storage locations in a single InferenceService
  • Specify custom mount paths for each storage URI
  • Support complex model architectures like base models with adapters
  • Access models and preprocessing data from different sources

Use Cases

Base Models with Adapters

Store Large Language Models (LLMs) and LoRA adapters in separate locations for better versioning and reuse:

storageUris:
  - uri: hf://microsoft/DialoGPT-medium
    path: /mnt/models/base
  - uri: s3://my-bucket/lora-adapters/customer-service
    path: /mnt/models/adapters

Multiple Preprocessing Artifacts

Access models and preprocessing data from different sources:

storageUris:
  - uri: s3://bucket/trained-model
    path: /mnt/models/model
  - uri: s3://bucket/preprocessor
    path: /mnt/models/preprocessing

Before you begin

  1. Your ~/.kube/config should point to a cluster with KServe installed.
  2. For Knative deployments, the Knative init container feature flag must be enabled.
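In Knative Serving, init containers are typically enabled through the config-features ConfigMap in the knative-serving namespace. A sketch (verify the flag name against your Knative version):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-features
  namespace: knative-serving
data:
  # Allows pod specs submitted through Knative to include init containers
  kubernetes.podspec-init-containers: "enabled"
```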

Basic Usage

Using Multiple URIs

Create an InferenceService resource that specifies multiple storage URIs:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: multi-storage-example
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      storageUris:
        - uri: s3://bucket/base-model
          path: /mnt/models/base
        - uri: s3://bucket/adapters
          # Downloads to /mnt/models (default)

Apply the YAML configuration to create the InferenceService:

kubectl apply -f multi-storage.yaml

Wait for the InferenceService to reach the Ready state:

kubectl get isvc multi-storage-example
Expected Output

You should see output similar to:

NAME                    URL                                                READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION                             AGE
multi-storage-example   http://multi-storage-example.default.example.com   True           100                              multi-storage-example-predictor-default-xxxxx   1m15s

Path Configuration

Default Behavior

If no path is specified, models download to /mnt/models:

storageUris:
  - uri: s3://bucket/model
    # Downloads to /mnt/models (default)

Custom Paths

Specify explicit mount paths for downloaded artifacts:

storageUris:
  - uri: s3://bucket/base-model
    path: /mnt/models/base
  - uri: s3://bucket/preprocessing
    path: /mnt/models/preprocessing

Path Requirements

Common Root Directory

All custom paths must share a common root directory (excluding filesystem root):

# Common root: /mnt/models
storageUris:
  - uri: s3://bucket/model-a
    path: /mnt/models/model-a
  - uri: s3://bucket/model-b
    path: /mnt/models/model-b
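The common-root rule can be sketched in a few lines of Python. This is an illustration of the constraint, not KServe's actual validation code:

```python
import os.path

def share_common_root(paths):
    """Return True if all paths are absolute and share a common root
    directory other than the filesystem root '/'."""
    if not all(os.path.isabs(p) for p in paths):
        return False
    # commonpath returns the longest shared prefix path
    return os.path.commonpath(paths) != "/"

# /mnt/models/model-a and /mnt/models/model-b share /mnt/models -> True
print(share_common_root(["/mnt/models/model-a", "/mnt/models/model-b"]))
# /mnt/models and /data/models share only '/' -> False
print(share_common_root(["/mnt/models", "/data/models"]))
```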

Absolute Paths

All custom paths must be absolute paths:

# ✅ Valid
path: /mnt/models/custom

# ❌ Invalid
path: models/custom

Compatibility and Migration

Mutual Exclusivity

The storageUri and storageUris properties are mutually exclusive:

# Use either storageUri
predictor:
  model:
    storageUri: s3://bucket/model

# OR storageUris
predictor:
  model:
    storageUris:
      - uri: s3://bucket/model

Equivalent Configurations

These configurations are functionally equivalent:

storageUri: s3://bucket/model

storageUris:
  - uri: s3://bucket/model

Supported Storage Types

Multiple storage URIs work with all storage providers that KServe supports.
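For example, a single InferenceService could mix providers; the bucket names, model ID, and PVC name below are placeholders:

```yaml
storageUris:
  - uri: hf://microsoft/DialoGPT-medium   # Hugging Face Hub
    path: /mnt/models/base
  - uri: s3://my-bucket/adapters          # S3-compatible object storage
    path: /mnt/models/adapters
  - uri: pvc://model-store/preprocessing  # PersistentVolumeClaim
    path: /mnt/models/preprocessing
```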

File Conflicts and Resolution

Avoiding Conflicts

When multiple URIs download to the same path, files with the same name overwrite each other non-deterministically. KServe does not detect such conflicts; avoid them by specifying a distinct path for each URI. The following configuration demonstrates the problem:

# Both contain model.pt - one will overwrite the other
storageUris:
  - uri: s3://bucket/model-a   # Contains model.pt
    path: /mnt/models
  - uri: s3://bucket/model-b   # Contains model.pt
    path: /mnt/models

Result:

/mnt/models
└── model.pt # Undefined which URI this came from

Directory Merging

When downloading to the same path without filename conflicts, directories merge successfully:

storageUris:
  - uri: s3://bucket/model-a        # Contains model.pt
  - uri: s3://bucket/preprocessing  # Contains preprocessing.csv

Result:

/mnt/models
├── model.pt
└── preprocessing.csv

Component Support

Multiple storage URIs are available on predictor, transformer, and explainer components:

spec:
  predictor:
    storageUris:
      - uri: s3://bucket/predictor-model
        path: /mnt/models/predictor

Run a Prediction

Once your InferenceService is deployed, you can run predictions as usual. The model containers will have access to all configured storage artifacts at their specified mount paths.

Determine the ingress IP and port, and set the INGRESS_HOST and INGRESS_PORT environment variables, by following the KServe ingress instructions.

SERVICE_HOSTNAME=$(kubectl get inferenceservice multi-storage-example -o jsonpath='{.status.url}' | cut -d "/" -f 3)

MODEL_NAME=multi-storage-example
curl -v -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict \
  -d @input.json
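The payload in input.json depends on the model being served. For the v1 predict endpoint used above, a minimal body follows the instances convention; the prompt text here is a placeholder:

```json
{
  "instances": [
    "Hello, how can I help you today?"
  ]
}
```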

Limitations

Environment Variable Support

Specifying multiple storage URIs through the STORAGE_URI environment variable is not supported. The STORAGE_URI environment variable exists for legacy transformer compatibility but is no longer needed with the storageUris property.

Path Validation

The system validates paths to prevent directory traversal attacks and ensure all paths share a common root for security and operational consistency.
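In practice this means paths containing traversal segments, or path sets with no shared root, are rejected. Illustrative examples, not an exhaustive list:

```yaml
# ❌ Rejected: directory traversal
path: /mnt/models/../../etc

# ❌ Rejected: no common root other than /
storageUris:
  - uri: s3://bucket/model-a
    path: /mnt/models/model-a
  - uri: s3://bucket/model-b
    path: /data/model-b
```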