V1beta1ComponentExtensionSpec
ComponentExtensionSpec defines the deployment configuration for a given InferenceService component
Properties
| Name | Type | Description | Notes |
|---|---|---|---|
| batcher | V1beta1Batcher | | [optional] |
| canary_traffic_percent | int | CanaryTrafficPercent defines the traffic split percentage between the candidate revision and the last ready revision | [optional] |
| container_concurrency | int | ContainerConcurrency specifies how many requests can be processed concurrently. This sets the hard limit of the container concurrency (https://knative.dev/docs/serving/autoscaling/concurrency). | [optional] |
| logger | V1beta1LoggerSpec | | [optional] |
| max_replicas | int | Maximum number of replicas for autoscaling. | [optional] |
| min_replicas | int | Minimum number of replicas. Defaults to 1, but can be set to 0 to enable scale-to-zero. | [optional] |
| timeout | int | TimeoutSeconds specifies the number of seconds to wait before timing out a request to the component. | [optional] |
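
To illustrate how the optional scalar fields in the table fit together, here is a minimal sketch that collects them into a plain dictionary using only the standard library. The helper name `component_extension_fields` is hypothetical and not part of the SDK, and the 0 to 100 range check on `canary_traffic_percent` is an assumption based on the field being a percentage. The `batcher` and `logger` fields are left out because their types are documented separately.

```python
from typing import Any, Dict, Optional


def component_extension_fields(
    canary_traffic_percent: Optional[int] = None,
    container_concurrency: Optional[int] = None,
    max_replicas: Optional[int] = None,
    min_replicas: Optional[int] = None,
    timeout: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: gather the optional int fields, dropping unset ones."""
    # Assumption: a traffic percentage is only meaningful between 0 and 100.
    if canary_traffic_percent is not None and not 0 <= canary_traffic_percent <= 100:
        raise ValueError("canary_traffic_percent must be between 0 and 100")

    fields = {
        "canary_traffic_percent": canary_traffic_percent,
        "container_concurrency": container_concurrency,
        "max_replicas": max_replicas,
        "min_replicas": min_replicas,
        "timeout": timeout,
    }
    # Every property is optional, so omit anything the caller did not set.
    # Compare against None (not truthiness) so min_replicas=0 is preserved.
    return {name: value for name, value in fields.items() if value is not None}


# Scale-to-zero component with a 60 second request timeout and a 10% canary.
spec_fields = component_extension_fields(
    min_replicas=0,
    max_replicas=3,
    timeout=60,
    canary_traffic_percent=10,
)
print(spec_fields)
```

Comparing against `None` rather than relying on truthiness matters here: `min_replicas=0` is a meaningful value (scale-to-zero) and must not be dropped as if it were unset.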