# LLMInferenceService Installation
This guide covers installation of the LLMInferenceService controller for generative AI model serving.
## Prerequisites
Before installing LLMInferenceService, ensure dependencies are installed:
- LLMIsvc Dependencies: See LLMIsvc Mode Dependencies
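Before running any of the commands in this guide, you can sanity-check that the client-side tools are on your PATH. A minimal sketch (the `check_tools` helper is illustrative, not part of KServe):

```shell
# Report which of the required CLI tools are available on PATH.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
    fi
  done
}

# Tools used by the installation methods below.
check_tools git kubectl helm
```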
Required infrastructure:
- cert-manager
- Gateway API CRDs and Extension CRDs
- Envoy Gateway
- Envoy AI Gateway
- Gateway API resources (GatewayClass, Gateway)
- LWS Operator (LeaderWorkerSet)
- External Load Balancer (for local clusters)
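If the Gateway API resources do not exist yet, they can look roughly like the sketch below for Envoy Gateway (resource names and namespace are illustrative; the `controllerName` must match your gateway implementation):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: envoy
spec:
  # Controller name used by Envoy Gateway; adjust for other implementations.
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: kserve-ingress-gateway   # illustrative name
  namespace: kserve
spec:
  gatewayClassName: envoy
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: All
```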
## Overview

LLMInferenceService provides optimized serving for generative AI models, with features such as gateway-based request routing via Envoy AI Gateway, multi-node serving via LeaderWorkerSet, and pre-configured templates for common LLM frameworks.
## Installation Methods

### Method 1: Kustomize
```shell
# Clone the KServe repository
git clone https://github.com/kserve/kserve.git
cd kserve

# Install LLMInferenceService standalone
kubectl apply -k config/overlays/standalone/llmisvc
```
For addon installation (when KServe is already installed):

```shell
# Install only the LLMInferenceService component (no base resources; reuses the existing namespace)
kubectl apply -k config/overlays/addons/llmisvc
```

To install the LLMInferenceServiceConfigs:

```shell
kubectl apply -k config/llmisvcconfig
```
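If you need to pin a version or layer your own changes on top, the standalone overlay can also be consumed as a remote base from your own kustomization (a sketch; the `ref` value is illustrative):

```yaml
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- https://github.com/kserve/kserve/config/overlays/standalone/llmisvc?ref=v0.17.0-rc1
```

Apply it with `kubectl apply -k .` from the directory containing this file.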
### Method 2: Helm

#### Install CRDs

```shell
# Using the OCI registry (recommended)
helm install llmisvc-crd oci://ghcr.io/kserve/charts/kserve-llmisvc-crd \
  --version v0.17.0-rc1 \
  --namespace kserve \
  --create-namespace

# Or using local charts
helm install llmisvc-crd ./charts/kserve-llmisvc-crd \
  --namespace kserve \
  --create-namespace
```
#### Install LLMInferenceService Resources

```shell
# Using the OCI registry (recommended)
helm install llmisvc-resources oci://ghcr.io/kserve/charts/kserve-llmisvc-resources \
  --version v0.17.0-rc1 \
  --namespace kserve

# Or using local charts
helm install llmisvc-resources ./charts/kserve-llmisvc-resources \
  --namespace kserve
```
Addon installation (when KServe is already installed):

```shell
helm install llmisvc-resources oci://ghcr.io/kserve/charts/kserve-llmisvc-resources \
  --version v0.17.0-rc1 \
  --namespace kserve \
  --set kserve.createSharedResources=false
```
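Equivalently, the override can be kept in a values file (the file name is illustrative; the key mirrors the `--set` flag above):

```yaml
# values-addon.yaml
kserve:
  createSharedResources: false
```

Pass it via `-f values-addon.yaml` instead of the `--set` flag.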
#### Install LLMInferenceServiceConfigs

Install pre-configured templates for common LLM frameworks:

```shell
helm install kserve-runtime-configs oci://ghcr.io/kserve/charts/kserve-runtime-configs \
  --version v0.17.0-rc1 \
  --namespace kserve \
  --set kserve.llmisvcConfigs.enabled=true
```
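Once installed, an LLMInferenceService can compose one of these templates. The sketch below assumes a `baseRefs` field and a template name, both of which are assumptions here; check the LLMInferenceService Configuration Guide for the actual schema:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: LLMInferenceService
metadata:
  name: my-llm   # illustrative name
spec:
  # Hypothetical reference to a pre-configured template installed above.
  baseRefs:
  - name: kserve-config-llm-template   # assumed template name
```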
### Method 3: Installation Scripts

#### Quick Install (All-in-One)

```shell
cd kserve

# Install dependencies + LLMInferenceService
./hack/setup/quick-install/llmisvc-full-install-helm.sh

# Or use the with-manifest version (no clone needed; includes embedded manifests)
./hack/setup/quick-install/llmisvc-full-install-helm-with-manifest.sh
```
## Configuration Options
For detailed configuration options including gateway settings, resource limits, config templates, and multi-node configurations, see the kserve-llmisvc-resources Helm Chart README.
## Test Installation

To test your LLMInferenceService installation with a sample deployment, see Getting Started with LLMInferenceService.
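As a quick smoke test, a minimal resource can look like the sketch below (the model URI and field layout are assumptions based on common KServe conventions; the Getting Started guide has the authoritative example):

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: LLMInferenceService
metadata:
  name: llm-demo   # illustrative name
  namespace: default
spec:
  model:
    uri: hf://facebook/opt-125m   # assumed model reference format
    name: facebook/opt-125m
  replicas: 1
```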
## Uninstallation

### Helm

```shell
# Remove resources
helm uninstall kserve-runtime-configs -n kserve
helm uninstall llmisvc-resources -n kserve
helm uninstall llmisvc-crd -n kserve

# Remove the namespace (if not shared)
kubectl delete namespace kserve
```
### Kustomize

```shell
# Remove LLMInferenceService
kubectl delete -k config/overlays/standalone/llmisvc
```
### Scripts

```shell
# Uninstall using the script
./hack/setup/quick-install/llmisvc-full-install-helm.sh --uninstall
```
## Next Steps
- Install LocalModel Controller - For model caching
- Getting Started with LLMInferenceService - Deploy your first generative AI model
- LLMInferenceService Configuration Guide - Advanced configuration