KServeClient¶
KServeClient(config_file=None, context=None, client_configuration=None, persist_config=True)
KServeClient loads authentication and cluster information from the kube-config file and stores them in kubernetes.client.Configuration. Parameters are as follows:
| Parameter | Description |
|---|---|
| config_file | Name of the kube-config file. Defaults to ~/.kube/config. Note that if the SDK is running in a cluster and you want to operate KServe in another remote cluster, you must set config_file to load the kube-config file explicitly, e.g. KServeClient(config_file="~/.kube/config"). |
| context | Set the active context. If set to None, current_context from the config file will be used. |
| client_configuration | The kubernetes.client.Configuration to set configs to. |
| persist_config | If True, the config file will be updated when changed (e.g. GCP token refresh). |
The APIs for KServeClient are as follows:
| Class | Method | Description | 
|---|---|---|
| KServeClient | set_credentials | Set Credentials | 
| KServeClient | create | Create InferenceService | 
| KServeClient | get | Get or watch the specified InferenceService or all InferenceServices in the namespace | 
| KServeClient | patch | Patch the specified InferenceService | 
| KServeClient | replace | Replace the specified InferenceService | 
| KServeClient | delete | Delete the specified InferenceService | 
| KServeClient | wait_isvc_ready | Wait for the InferenceService to be ready | 
| KServeClient | is_isvc_ready | Check if the InferenceService is ready | 
set_credentials¶
set_credentials(storage_type, namespace=None, credentials_file=None, service_account='kfserving-service-credentials', **kwargs)
Create or update a Secret and Service Account for GCS and S3 for the provided credentials. Once the Service Account is applied, it may be used in the Service Account field of an InferenceService's V1beta1ModelSpec.
Example¶
Example for creating GCP credentials.
from kserve import KServeClient
kserve = KServeClient()
kserve.set_credentials(storage_type='GCS',
                          namespace='kubeflow',
                          credentials_file='/tmp/gcp.json',
                          service_account='user_specified_sa_name')
The API supports specifying a Service Account via service_account, or using the default Service Account kfserving-service-credentials. If the Service Account does not exist, the API creates it and attaches the created Secret to it; if it already exists, the API only patches it to attach the created Secret.
Example for creating S3 credentials.
from kserve import KServeClient
kserve = KServeClient()
kserve.set_credentials(storage_type='S3',
                          namespace='kubeflow',
                          credentials_file='/tmp/awscredentials',
                          s3_profile='default',
                          s3_endpoint='s3.us-west-2.amazonaws.com',
                          s3_region='us-west-2',
                          s3_use_https='1',
                          s3_verify_ssl='0')
Example for creating Azure credentials.
from kserve import KServeClient
kserve = KServeClient()
kserve.set_credentials(storage_type='Azure',
                          namespace='kubeflow',
                          credentials_file='/path/azure_credentials.json')
The created or patched Secret and Service Account will be shown as follows:
INFO:kfserving.api.set_credentials:Created Secret: kfserving-secret-6tv6l in namespace kubeflow
INFO:kfserving.api.set_credentials:Created (or Patched) Service account: kfserving-service-credentials in namespace kubeflow
Parameters¶
| Name | Type | Storage Type | Description |
|---|---|---|---|
| storage_type | str | All | Required. Valid values: GCS, S3 or Azure. |
| namespace | str | All | Optional. The Kubernetes namespace. Defaults to current or default namespace. |
| credentials_file | str | All | Optional. The path of the credentials file. The default for GCS is ~/.config/gcloud/application_default_credentials.json; see the instructions on creating the GCS credentials file. For S3 it is ~/.aws/credentials; see the instructions on creating the S3 credentials file. For Azure it is ~/.azure/azure_credentials.json; see the instructions on creating the Azure credentials file. |
| service_account | str | All | Optional. The name of the Service Account. Supports specifying the service_account, or using the default Service Account kfserving-service-credentials. If the Service Account does not exist, the API creates it and attaches the created Secret to it; if it already exists, the API only patches it to attach the created Secret. |
| s3_endpoint | str | S3 only | Optional. The S3 endpoint. |
| s3_region | str | S3 only | Optional. The S3 region. By default, the regional endpoint is used for S3. |
| s3_use_https | str | S3 only | Optional. HTTPS is used to access S3 by default, unless s3_use_https=0. |
| s3_verify_ssl | str | S3 only | Optional. If HTTPS is used, SSL verification can be disabled with s3_verify_ssl=0. |
create¶
create(inferenceservice, namespace=None, watch=False, timeout_seconds=600)
Create the provided InferenceService in the specified namespace
Example¶
from kubernetes import client
from kserve import KServeClient
from kserve import constants
from kserve import V1beta1PredictorSpec
from kserve import V1beta1TFServingSpec
from kserve import V1beta1InferenceServiceSpec
from kserve import V1beta1InferenceService
default_model_spec = V1beta1InferenceServiceSpec(predictor=V1beta1PredictorSpec(tensorflow=V1beta1TFServingSpec(
    storage_uri='gs://kfserving-examples/models/tensorflow/flowers')))
isvc = V1beta1InferenceService(api_version=constants.KSERVE_V1BETA1,
                          kind=constants.KSERVE_KIND,
                          metadata=client.V1ObjectMeta(name='flower-sample', namespace='kserve-models'),
                          spec=default_model_spec)
kserve = KServeClient()
kserve.create(isvc)
# The API also supports watching the created InferenceService status till it's READY.
# kserve.create(isvc, watch=True)
Parameters¶
| Name | Type | Description | Notes |
|---|---|---|---|
| inferenceservice | V1beta1InferenceService | InferenceService definition | Required |
| namespace | str | Namespace to deploy the InferenceService to. If not defined, aligns with the InferenceService definition, or uses the current or default namespace if the namespace is not specified in the InferenceService definition. | Optional |
| watch | bool | Watch the created InferenceService if True, otherwise return the created InferenceService object. Watching stops once the InferenceService reaches the optional specified timeout_seconds or once its overall status READY is True. | Optional |
| timeout_seconds | int | Timeout seconds for watching. Defaults to 600. | Optional |
Return type¶
object
get¶
get(name=None, namespace=None, watch=False, timeout_seconds=600)
Get the created InferenceService in the specified namespace
Example¶
from kserve import KServeClient
kserve = KServeClient()
kserve.get('flower-sample', namespace='kubeflow')
from kserve import KServeClient
kserve = KServeClient()
kserve.get('flower-sample', namespace='kubeflow', watch=True, timeout_seconds=120)
Watching stops once the specified timeout_seconds is reached or once the InferenceService overall status READY is True:
            NAME                 READY      DEFAULT_TRAFFIC CANARY_TRAFFIC  URL                                               
flower-sample        Unknown                                    http://flower-sample.kubeflow.example.com         
flower-sample        Unknown    90               10             http://flower-sample.kubeflow.example.com         
flower-sample        True       90               10             http://flower-sample.kubeflow.example.com         
Parameters¶
| Name | Type | Description | Notes |
|---|---|---|---|
| name | str | InferenceService name. If the name is not specified, it will get or watch all InferenceServices in the namespace. | Optional |
| namespace | str | The InferenceService's namespace. Defaults to current or default namespace. | Optional |
| watch | bool | Watch the specified InferenceService or all InferenceServices in the namespace if True, otherwise return the object for the specified InferenceService or all InferenceServices in the namespace. Watching stops once the optional specified timeout_seconds is reached, or once the specified InferenceService's overall status READY is True (only if a name is specified). | Optional |
| timeout_seconds | int | Timeout seconds for watching. Defaults to 600. | Optional |
Return type¶
object
patch¶
patch(name, inferenceservice, namespace=None, watch=False, timeout_seconds=600)
Patch the created InferenceService in the specified namespace.
Note that if you want to set a field from an existing value to None, the patch API may not work; use the replace API to remove the field value.
Example¶
from kubernetes import client
from kserve import constants
from kserve import V1beta1PredictorSpec
from kserve import V1beta1TFServingSpec
from kserve import V1beta1InferenceServiceSpec
from kserve import V1beta1InferenceService
from kserve import KServeClient
service_name = 'flower-sample'
kserve = KServeClient()
default_model_spec = V1beta1InferenceServiceSpec(predictor=V1beta1PredictorSpec(tensorflow=V1beta1TFServingSpec(
    storage_uri='gs://kfserving-examples/models/tensorflow/flowers')))
isvc = V1beta1InferenceService(api_version=constants.KSERVE_V1BETA1,
                                   kind=constants.KSERVE_KIND,
                                   metadata=client.V1ObjectMeta(
                                        name=service_name, namespace='kserve-models'),
                                   spec=default_model_spec)
kserve.create(isvc)
kserve.wait_isvc_ready(service_name, namespace='kserve-models')
canary_model_spec = V1beta1InferenceServiceSpec(predictor=V1beta1PredictorSpec(canary_traffic_percent=10,
    tensorflow=V1beta1TFServingSpec(
    storage_uri='gs://kfserving-examples/models/tensorflow/flowers-2')))
isvc = V1beta1InferenceService(api_version=constants.KSERVE_V1BETA1,
                          kind=constants.KSERVE_KIND,
                          metadata=client.V1ObjectMeta(name='flower-sample', namespace='kserve-models'),
                          spec=canary_model_spec)
kserve.patch(service_name, isvc)
# The API also supports watching the patched InferenceService status till it's READY.
# kserve.patch('flower-sample', isvc, watch=True)
Parameters¶
| Name | Type | Description | Notes |
|---|---|---|---|
| name | str | The InferenceService name. | Required |
| inferenceservice | V1beta1InferenceService | InferenceService definition | Required |
| namespace | str | The InferenceService's namespace for patching. If not defined, aligns with the InferenceService definition, or uses the current or default namespace if the namespace is not specified in the InferenceService definition. | Optional |
| watch | bool | Watch the patched InferenceService if True, otherwise return the patched InferenceService object. Watching stops once the InferenceService reaches the optional specified timeout_seconds or once its overall status READY is True. | Optional |
| timeout_seconds | int | Timeout seconds for watching. Defaults to 600. | Optional |
Return type¶
object
replace¶
replace(name, inferenceservice, namespace=None, watch=False, timeout_seconds=600)
Replace the created InferenceService in the specified namespace. Generally, use the replace API to update the whole InferenceService, or to remove a field such as the canary or other components of the InferenceService.
Example¶
from kubernetes import client
from kserve import constants
from kserve import V1beta1PredictorSpec
from kserve import V1beta1TFServingSpec
from kserve import V1beta1InferenceServiceSpec
from kserve import V1beta1InferenceService
from kserve import KServeClient
service_name = 'flower-sample'
kserve = KServeClient()
default_model_spec = V1beta1InferenceServiceSpec(predictor=V1beta1PredictorSpec(tensorflow=V1beta1TFServingSpec(
    storage_uri='gs://kfserving-examples/models/tensorflow/flowers')))
isvc = V1beta1InferenceService(api_version=constants.KSERVE_V1BETA1,
                                   kind=constants.KSERVE_KIND,
                                   metadata=client.V1ObjectMeta(
                                        name=service_name, namespace='kserve-models'),
                                   spec=default_model_spec)
kserve.create(isvc)
kserve.wait_isvc_ready(service_name, namespace='kserve-models')
canary_model_spec = V1beta1InferenceServiceSpec(predictor=V1beta1PredictorSpec(canary_traffic_percent=0,
    tensorflow=V1beta1TFServingSpec(
    storage_uri='gs://kfserving-examples/models/tensorflow/flowers-2')))
isvc = V1beta1InferenceService(api_version=constants.KSERVE_V1BETA1,
                          kind=constants.KSERVE_KIND,
                          metadata=client.V1ObjectMeta(name=service_name, namespace='kserve-models'),
                          spec=canary_model_spec)
kserve.replace(service_name, isvc)
# The API also supports watching the replaced InferenceService status till it's READY.
# kserve.replace('flower-sample', isvc, watch=True)
Parameters¶
| Name | Type | Description | Notes |
|---|---|---|---|
| name | str | The InferenceService name. | Required |
| inferenceservice | V1beta1InferenceService | InferenceService definition | Required |
| namespace | str | The InferenceService's namespace. If not defined, aligns with the InferenceService definition, or uses the current or default namespace if the namespace is not specified in the InferenceService definition. | Optional |
| watch | bool | Watch the replaced InferenceService if True, otherwise return the replaced InferenceService object. Watching stops once the InferenceService reaches the optional specified timeout_seconds or once its overall status READY is True. | Optional |
| timeout_seconds | int | Timeout seconds for watching. Defaults to 600. | Optional |
Return type¶
object
delete¶
delete(name, namespace=None)
Delete the created InferenceService in the specified namespace
Example¶
from kserve import KServeClient
kserve = KServeClient()
kserve.delete('flower-sample', namespace='kubeflow')
Parameters¶
| Name | Type | Description | Notes |
|---|---|---|---|
| name | str | InferenceService name | Required |
| namespace | str | The InferenceService's namespace. Defaults to current or default namespace. | Optional |
Return type¶
object
wait_isvc_ready¶
wait_isvc_ready(name, namespace=None, watch=False, timeout_seconds=600, polling_interval=10)
Wait for the InferenceService to be ready.
Example¶
from kserve import KServeClient
kserve = KServeClient()
kserve.wait_isvc_ready('flower-sample', namespace='kubeflow')
Parameters¶
| Name | Type | Description | Notes |
|---|---|---|---|
| name | str | The InferenceService name. | Required |
| namespace | str | The InferenceService namespace. Defaults to current or default namespace. | Optional |
| watch | bool | Watch the specified InferenceService if True. | Optional |
| timeout_seconds | int | How long to wait for the InferenceService. Defaults to 600 seconds. | Optional |
| polling_interval | int | How often to poll for the status of the InferenceService. Defaults to 10 seconds. | Optional |
Return type¶
object
is_isvc_ready¶
is_isvc_ready(name, namespace=None)
Returns True if the InferenceService is ready; False otherwise.
Example¶
from kserve import KServeClient
kserve = KServeClient()
kserve.is_isvc_ready('flower-sample', namespace='kubeflow')
Parameters¶
| Name | Type | Description | Notes |
|---|---|---|---|
| name | str | The InferenceService name. | Required |
| namespace | str | The InferenceService namespace. Defaults to current or default namespace. | Optional |
Return type¶
bool