KServe Python Serving Runtime SDK
KServe's Python runtime API implements a standardized Python model server API that follows the Open Inference Protocol, the V1 protocol, and the OpenAI protocol, using FastAPI.
It encapsulates data plane API definitions and storage retrieval for models.
It provides the following functionality, among others:
- Implements the data plane API following the Open Inference Protocol.
- Provides an extensible model server and model API.
- Allows customizing pre-processing, prediction, and post-processing handlers.
- Provides readiness and liveness handlers.
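The customizable handler chain in the list above can be pictured as three functions applied in order. The sketch below is plain Python and deliberately does not import kserve: `SketchModel` and its method bodies are hypothetical and only illustrate the flow from pre-processing through prediction to post-processing, using the `instances` and `predictions` keys of the V1 protocol. It is not the SDK's implementation.

```python
from typing import Any, Dict, List


class SketchModel:
    """Hypothetical stand-in for a model with three overridable handlers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ready = False  # a readiness handler would report this flag

    def load(self) -> None:
        # A real model would read its weights from disk here.
        self.ready = True

    def preprocess(self, payload: Dict[str, Any]) -> List[List[float]]:
        # Turn the request body into model input.
        return [[float(x) for x in row] for row in payload["instances"]]

    def predict(self, inputs: List[List[float]]) -> List[float]:
        # Toy "model": sum each input row.
        return [sum(row) for row in inputs]

    def postprocess(self, outputs: List[float]) -> Dict[str, Any]:
        # Wrap raw outputs in a response body.
        return {"predictions": outputs}

    def __call__(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        if not self.ready:
            raise RuntimeError(f"model {self.name} is not ready")
        return self.postprocess(self.predict(self.preprocess(payload)))
```

Overriding only one of the three methods changes that stage and leaves the others untouched, which is the point of keeping the handlers separate.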
Installation
The KServe Python SDK can be installed with pip or uv.
pip
pip install kserve
uv
Check out the KServe GitHub repository and install via uv.
cd kserve/python/kserve
uv sync
API Reference
For detailed API reference, please refer to the Python Runtime SDK API Reference.
Storage API
kserve-storage
A Python module for handling model storage and retrieval for KServe. This package provides a unified API to download models from various storage backends including cloud providers, file systems, and model hubs.
This module is mainly used by the KServe Storage Initializer, which supports the storage backends listed below.
Features
- Support for multiple storage backends:
- Local file system
- Google Cloud Storage (GCS)
- Amazon S3
- Azure Blob Storage
- Azure File Share
- HTTP/HTTPS URLs
- HDFS/WebHDFS
- Hugging Face Hub
- Automatic extraction of compressed files (zip, tar.gz, tgz)
- Configuration via environment variables
- Logging and error handling
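To make the automatic extraction feature concrete, here is a minimal sketch of unpacking a downloaded file based on its extension. The helper name `extract_if_archive` is hypothetical, and the logic is an assumption about how such a step could work using only the standard library; it is not the package's actual code.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_if_archive(path: str, dest: str) -> bool:
    """Unpack `path` into `dest` if it is a zip, tar.gz, or tgz file.

    Returns True when something was extracted, False for any other file.
    """
    name = Path(path).name.lower()
    if name.endswith(".zip"):
        with zipfile.ZipFile(path) as zf:
            zf.extractall(dest)
        return True
    if name.endswith((".tar.gz", ".tgz")):
        with tarfile.open(path, "r:gz") as tf:
            tf.extractall(dest)
        return True
    # Not a recognized archive: leave the file as it is.
    return False
```

Archives from untrusted sources can contain paths that escape the destination directory, so a production version would validate member names before extracting.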
Installation
pip install kserve-storage
Or with uv:
uv add kserve-storage
Usage
The main entry point is the Storage class, which provides a download method:
from kserve_storage import Storage
# Download from GCS to a temporary directory
model_dir = Storage.download("gs://your-bucket/model")
# Download from S3 to a specific directory
model_dir = Storage.download("s3://your-bucket/model", "/path/to/destination")
Supported Storage Providers
Local File System
model_dir = Storage.download("file:///path/to/model")
# or using direct path
model_dir = Storage.download("/path/to/model")
Google Cloud Storage
model_dir = Storage.download("gs://bucket-name/model-path")
Amazon S3
model_dir = Storage.download("s3://bucket-name/model-path")
Azure Blob Storage
model_dir = Storage.download("https://account-name.blob.core.windows.net/container-name/model-path")
Azure File Share
model_dir = Storage.download("https://account-name.file.core.windows.net/share-name/model-path")
HTTP/HTTPS URLs
model_dir = Storage.download("https://example.com/path/to/model.zip")
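Azure Blob Storage, Azure File Share, and plain HTTP/HTTPS downloads all use `https://` URLs, so the host name is what tells them apart. The sketch below shows one way to make that distinction with the standard library. The function name `classify_https_url` and the returned labels are hypothetical, and the package may detect backends differently.

```python
from urllib.parse import urlparse


def classify_https_url(url: str) -> str:
    """Return "azure-blob", "azure-file", or "http" for an http(s) URL."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"not an HTTP(S) URL: {url}")
    host = (parsed.hostname or "").lower()
    if host.endswith(".blob.core.windows.net"):
        return "azure-blob"
    if host.endswith(".file.core.windows.net"):
        return "azure-file"
    # Any other host is treated as a generic HTTP/HTTPS download.
    return "http"
```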
HDFS
model_dir = Storage.download("hdfs://path/to/model")
# or WebHDFS
model_dir = Storage.download("webhdfs://path/to/model")
Hugging Face Hub
model_dir = Storage.download("hf://org-name/model-name")
# With specific revision
model_dir = Storage.download("hf://org-name/model-name:revision")
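The `hf://` form above packs a repository ID and an optional revision into one string. As an illustration of that layout, the following sketch splits such a URI into its parts. `HfRef` and `parse_hf_uri` are hypothetical names introduced here, and the parsing rule (split on the first colon after the prefix) is an assumption based only on the two examples shown.

```python
from typing import NamedTuple, Optional


class HfRef(NamedTuple):
    repo_id: str
    revision: Optional[str]


def parse_hf_uri(uri: str) -> HfRef:
    """Split "hf://org-name/model-name[:revision]" into repo ID and revision."""
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// URI: {uri}")
    rest = uri[len(prefix):]
    repo_id, _, revision = rest.partition(":")
    if not repo_id:
        raise ValueError(f"missing repository ID in: {uri}")
    # An absent or empty revision is reported as None.
    return HfRef(repo_id, revision or None)
```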
Environment Variables
Hugging Face Hub Configuration
These are all handled by the huggingface_hub package; see its documentation for all available environment variables.
AWS/S3 Configuration / Environment variables
- AWS_ENDPOINT_URL: Custom endpoint URL for S3-compatible storage
- AWS_ACCESS_KEY_ID: Access key for S3
- AWS_SECRET_ACCESS_KEY: Secret access key for S3
- AWS_DEFAULT_REGION: AWS region
- AWS_CA_BUNDLE: Path to custom CA bundle
- S3_VERIFY_SSL: Enable/disable SSL verification
- S3_USER_VIRTUAL_BUCKET: Use virtual hosted-style URLs
- S3_USE_ACCELERATE: Use transfer acceleration
- awsAnonymousCredential: Use unsigned requests for public access
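As an example of using these variables, the sketch below sets credentials for an S3-compatible endpoint only for the duration of one download and restores the previous environment afterwards. The `temporary_env` helper, the endpoint URL, and the credential values are all hypothetical. The download call is left as a comment because it needs a reachable bucket, and the sketch assumes the variables are read when the download runs.

```python
import os
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def temporary_env(values: Dict[str, str]) -> Iterator[None]:
    """Set environment variables inside a `with` block, then restore them."""
    saved = {key: os.environ.get(key) for key in values}
    os.environ.update(values)
    try:
        yield
    finally:
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old


# Placeholder values for a local S3-compatible service (assumption).
s3_env = {
    "AWS_ENDPOINT_URL": "http://localhost:9000",
    "AWS_ACCESS_KEY_ID": "example-access-key",
    "AWS_SECRET_ACCESS_KEY": "example-secret-key",
    "AWS_DEFAULT_REGION": "us-east-1",
}

with temporary_env(s3_env):
    # from kserve_storage import Storage
    # model_dir = Storage.download("s3://your-bucket/model")
    endpoint = os.environ["AWS_ENDPOINT_URL"]
```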
Azure Configuration
- AZURE_STORAGE_ACCESS_KEY: Storage account access key
- AZ_TENANT_ID / AZURE_TENANT_ID: Azure AD tenant ID
- AZ_CLIENT_ID / AZURE_CLIENT_ID: Azure AD client ID
- AZ_CLIENT_SECRET / AZURE_CLIENT_SECRET: Azure AD client secret
HDFS Configuration
- HDFS_SECRET_DIR: Directory containing HDFS configuration files
- HDFS_NAMENODE: HDFS namenode address
- USER_PROXY: User proxy for HDFS
- HDFS_ROOTPATH: Root path in HDFS
- KERBEROS_PRINCIPAL: Kerberos principal for authentication
- KERBEROS_KEYTAB: Path to Kerberos keytab file
- TLS_CERT, TLS_KEY, TLS_CA: TLS configuration files
- TLS_SKIP_VERIFY: Skip TLS verification
- N_THREADS: Number of download threads
Storage Configuration
Storage configuration can be provided through environment variables:
- STORAGE_CONFIG: JSON string containing storage configuration
- STORAGE_OVERRIDE_CONFIG: JSON string to override storage configuration
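The two variables suggest a base configuration plus overrides. The sketch below shows one plausible way to combine them: parse both JSON strings and let override keys replace base keys. The function name `effective_storage_config`, the shallow-merge rule, and the keys in the usage example (`type`, `bucket`) are assumptions for illustration; the package's actual schema and merge behavior may differ.

```python
import json
import os
from typing import Any, Dict, Mapping


def effective_storage_config(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Merge STORAGE_CONFIG with STORAGE_OVERRIDE_CONFIG (override wins)."""
    # Unset or empty variables are treated as an empty JSON object.
    base = json.loads(env.get("STORAGE_CONFIG") or "{}")
    override = json.loads(env.get("STORAGE_OVERRIDE_CONFIG") or "{}")
    merged = dict(base)
    merged.update(override)  # shallow merge: top-level keys only
    return merged
```

For example, with `STORAGE_CONFIG` set to `{"type": "s3", "bucket": "models"}` and `STORAGE_OVERRIDE_CONFIG` set to `{"bucket": "staging-models"}`, this function returns a configuration whose `type` is `s3` and whose `bucket` is `staging-models`.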
License
Apache License 2.0.