Multiple Storage URIs

This guide shows how to configure InferenceServices with multiple storage URIs, allowing you to fetch model artifacts from different storage backends and mount them at specific locations within the container.

Overview

The multiple storage URIs feature enables you to:

  • Fetch artifacts from multiple storage locations in a single InferenceService
  • Specify custom mount paths for each storage URI
  • Support complex model architectures like base models with adapters
  • Access models and preprocessing data from different sources

Use Cases

Base Models with Adapters

Store Large Language Models (LLMs) and LoRA adapters in separate locations for better versioning and reuse:

storageUris:
  - uri: hf://microsoft/DialoGPT-medium
    path: /mnt/models/base
  - uri: s3://my-bucket/lora-adapters/customer-service
    path: /mnt/models/adapters

Multiple Preprocessing Artifacts

Access models and preprocessing data from different sources:

storageUris:
  - uri: s3://bucket/trained-model
    path: /mnt/models/model
  - uri: s3://bucket/preprocessor
    path: /mnt/models/preprocessing

Before you begin

  1. Your ~/.kube/config should point to a cluster with KServe installed.
  2. For Knative deployments, the Knative init container feature flag must be enabled.
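In Knative Serving, init containers are typically enabled through the config-features ConfigMap in the knative-serving namespace. A sketch (verify the flag name against your Knative version):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-features
  namespace: knative-serving
data:
  # Allows pod specs submitted through Knative to include init containers
  kubernetes.podspec-init-containers: "enabled"
```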

Basic Usage

Using Multiple URIs

Create an InferenceService resource that specifies multiple storage URIs:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: multi-storage-example
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      storageUris:
        - uri: s3://bucket/base-model
          path: /mnt/models/base
        - uri: s3://bucket/adapters
          # Downloads to /mnt/models (default)

Apply the YAML configuration to create the InferenceService:

kubectl apply -f multi-storage.yaml

Wait for the InferenceService to reach the Ready state:

kubectl get isvc multi-storage-example
Expected Output

You should see output similar to:

NAME                    URL                                                READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION                             AGE
multi-storage-example   http://multi-storage-example.default.example.com   True           100                              multi-storage-example-predictor-default-xxxxx   1m15s

Path Configuration

Default Behavior

If no path is specified, models download to /mnt/models:

storageUris:
  - uri: s3://bucket/model
    # Downloads to /mnt/models (default)

Custom Paths

Specify explicit mount paths for downloaded artifacts:

storageUris:
  - uri: s3://bucket/base-model
    path: /mnt/models/base
  - uri: s3://bucket/preprocessing
    path: /mnt/models/preprocessing

Path Requirements

Common Root Directory

All custom paths must share a common root directory (excluding filesystem root):

# Common root: /mnt/models
storageUris:
  - uri: s3://bucket/model-a
    path: /mnt/models/model-a
  - uri: s3://bucket/model-b
    path: /mnt/models/model-b
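The common-root rule can be sketched in a few lines of Python. This is an illustration of the constraint, not KServe's actual validation code:

```python
import os.path

def share_common_root(paths):
    """Return True if all paths are absolute and share a common root
    directory other than the filesystem root '/'."""
    if not all(os.path.isabs(p) for p in paths):
        return False
    # commonpath returns the longest shared prefix path
    return os.path.commonpath(paths) != "/"

# /mnt/models/model-a and /mnt/models/model-b share /mnt/models -> True
print(share_common_root(["/mnt/models/model-a", "/mnt/models/model-b"]))
# /mnt/models and /data/models share only '/' -> False
print(share_common_root(["/mnt/models", "/data/models"]))
```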

Absolute Paths

All custom paths must be absolute paths:

# ✅ Valid
path: /mnt/models/custom

# ❌ Invalid
path: models/custom

Compatibility and Migration

Mutual Exclusivity

The storageUri and storageUris properties are mutually exclusive:

# Use either storageUri
predictor:
  model:
    storageUri: s3://bucket/model

# OR storageUris
predictor:
  model:
    storageUris:
      - uri: s3://bucket/model

Equivalent Configurations

These configurations are functionally equivalent:

storageUri: s3://bucket/model

storageUris:
  - uri: s3://bucket/model

Supported Storage Types

Multiple storage URIs work with all storage providers that KServe supports.
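For example, a single InferenceService could mix providers; the bucket names, model ID, and PVC name below are placeholders:

```yaml
storageUris:
  - uri: hf://microsoft/DialoGPT-medium   # Hugging Face Hub
    path: /mnt/models/base
  - uri: s3://my-bucket/adapters          # S3-compatible object storage
    path: /mnt/models/adapters
  - uri: pvc://model-store/preprocessing  # PersistentVolumeClaim
    path: /mnt/models/preprocessing
```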

File Conflicts and Resolution

Avoiding Conflicts

When multiple URIs download to the same path, files with the same name overwrite each other non-deterministically. KServe does not detect such conflicts; avoid them by specifying a distinct path for each URI. The following configuration demonstrates the problem:

# Both contain model.pt - one will overwrite the other
storageUris:
  - uri: s3://bucket/model-a   # Contains model.pt
    path: /mnt/models
  - uri: s3://bucket/model-b   # Contains model.pt
    path: /mnt/models

Result:

/mnt/models
└── model.pt # Undefined which URI this came from

Directory Merging

When downloading to the same path without filename conflicts, directories merge successfully:

storageUris:
  - uri: s3://bucket/model-a        # Contains model.pt
  - uri: s3://bucket/preprocessing  # Contains preprocessing.csv

Result:

/mnt/models
├── model.pt
└── preprocessing.csv

Component Support

Multiple storage URIs are available on predictor, transformer, and explainer components:

spec:
  predictor:
    storageUris:
      - uri: s3://bucket/predictor-model
        path: /mnt/models/predictor

Run a Prediction

Once your InferenceService is deployed, you can run predictions as usual. The model containers will have access to all configured storage artifacts at their specified mount paths.

Determine the ingress IP and port, and set the INGRESS_HOST and INGRESS_PORT environment variables, by following the KServe ingress instructions.

SERVICE_HOSTNAME=$(kubectl get inferenceservice multi-storage-example -o jsonpath='{.status.url}' | cut -d "/" -f 3)

MODEL_NAME=multi-storage-example
curl -v -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict \
  -d @input.json
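The payload in input.json depends on the model being served. For the v1 predict endpoint used above, a minimal body follows the instances convention; the prompt text here is a placeholder:

```json
{
  "instances": [
    "Hello, how can I help you today?"
  ]
}
```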

Limitations

Environment Variable Support

Specifying multiple storage URIs through the STORAGE_URI environment variable is not supported. The STORAGE_URI environment variable exists for legacy transformer compatibility but is no longer needed with the storageUris property.

Path Validation

The system validates paths to prevent directory traversal attacks and ensure all paths share a common root for security and operational consistency.
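In practice this means paths containing traversal segments, or path sets with no shared root, are rejected. Illustrative examples, not an exhaustive list:

```yaml
# ❌ Rejected: directory traversal
path: /mnt/models/../../etc

# ❌ Rejected: no common root other than /
storageUris:
  - uri: s3://bucket/model-a
    path: /mnt/models/model-a
  - uri: s3://bucket/model-b
    path: /data/model-b
```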