20251126 sync upstream #993
base: master
Conversation
- …4663) Signed-off-by: Pierangelo Di Pilato <[email protected]>
- …serve#4664) Signed-off-by: Pierangelo Di Pilato <[email protected]>
- Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
- Signed-off-by: Bartosz Majsak <[email protected]> Co-authored-by: Jooho Lee <[email protected]>
- Signed-off-by: Spolti <[email protected]>
- chore: Update the e2e tests to not deploy Serverless when running graph tests. Signed-off-by: Spolti <[email protected]>
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: mholder6. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Walkthrough

This PR refactors the storage module into a standalone package, updates CRD schemas to accept flexible adapter configurations, refreshes LLM container images with enhanced security contexts, separates E2E tests for the explainer component, and adds storage module installation to multiple Dockerfiles.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Consumer as Python Server
    participant KServeOld as kserve[storage]
    participant KServeNew as kserve (main)
    participant StorageNew as kserve-storage
    Consumer->>KServeOld: import Storage (old flow)
    KServeOld->>KServeOld: from kserve.storage
    Note over Consumer,StorageNew: After refactoring:
    Consumer->>KServeNew: import kserve
    Consumer->>StorageNew: from kserve_storage import Storage
    StorageNew->>StorageNew: Standalone package
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:
Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
config/llmisvc/config-llm-worker-data-parallel.yaml (1)
18-183: The repository clone failed, but the issues can still be verified from the code snippet quoted in the review comment and general principles of shell scripting and Go templating.

Analysis of the three issues:

1. Broken quoting around `--served-model-name`. The review shows code like:

```shell
eval "vllm serve \
  /mnt/models \
  --served-model-name "{{ .Spec.Model.Name }}" \
```

This is indeed a quoting error: inside a double-quoted string, the inner `"` prematurely terminates the outer quote, breaking the shell syntax.

2. `$(LWS_LEADER_ADDRESS)` vs `$LWS_LEADER_ADDRESS`. The code uses `$(LWS_LEADER_ADDRESS)`, which in shell syntax means "execute a command named LWS_LEADER_ADDRESS". This is universally true in POSIX shells; command substitution syntax is standard.

3. Pointer-valued `.Spec.Parallelism.*` fields. The review claims these are Go `*int32` types. The code snippet shows usage like `{{ .Spec.Parallelism.Tensor }}`, which would render the pointer value itself rather than the dereferenced integer without a helper function. This is a standard limitation of Go templates.

These are fundamental shell and template syntax issues that don't require repository-specific context to verify. The issues identified are conceptually sound based on universal shell and Go templating mechanics.
Shell template has quoting and expansion bugs that will likely break the worker pods.

There are three correctness issues in the new `vllm serve` shell snippets that are serious enough to prevent the containers from starting correctly:

1. Broken quoting around `--served-model-name` (template + worker)

In both `template` and `worker` sections, lines like:

```shell
eval "vllm serve \
  /mnt/models \
  --served-model-name "{{ .Spec.Model.Name }}" \
```

inject `"{{ .Spec.Model.Name }}"` inside an already double-quoted string, which terminates the outer quote early and produces invalid shell syntax once the template is rendered. This is a hard runtime error. Remove the inner quotes:

```diff
-  --served-model-name "{{ .Spec.Model.Name }}" \
+  --served-model-name {{ .Spec.Model.Name }} \
```

Apply the same change in the `worker` block. If you need to support model names with spaces, escape the inner quotes instead: `--served-model-name \"{{ .Spec.Model.Name }}\"`.
2. `$(LWS_LEADER_ADDRESS)` is command substitution, not env var expansion

Lines like:

```shell
--data-parallel-address $(LWS_LEADER_ADDRESS) \
```

use POSIX command substitution, which attempts to execute a command named `LWS_LEADER_ADDRESS`. To use the environment variable, change to:

```diff
-  --data-parallel-address $(LWS_LEADER_ADDRESS) \
+  --data-parallel-address $LWS_LEADER_ADDRESS \
```

Apply the same change in the `worker` block.
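To make these expansion pitfalls concrete, here is a small, hypothetical Python check (not part of the PR) that runs the variable-expansion patterns through `/bin/sh` and tokenizes the nested-quote form with `shlex`; the variable value `leader-0` is illustrative only:

```python
import shlex
import subprocess

# Hedged sketch, not code from the PR.
# 1) $(NAME) is command substitution: sh looks for a *command* named
#    LWS_LEADER_ADDRESS, so the substitution yields nothing.
bad = subprocess.run(
    ["sh", "-c", "LWS_LEADER_ADDRESS=leader-0; echo addr=$(LWS_LEADER_ADDRESS)"],
    capture_output=True, text=True,
)
print(bad.stdout.strip())   # addr=

# 2) $NAME expands the environment variable as intended.
good = subprocess.run(
    ["sh", "-c", "LWS_LEADER_ADDRESS=leader-0; echo addr=$LWS_LEADER_ADDRESS"],
    capture_output=True, text=True,
)
print(good.stdout.strip())  # addr=leader-0

# 3) Nested double quotes: once rendered inside an already double-quoted
#    string, the inner quotes end the outer string early, so the model name
#    is no longer a single argument.
print(shlex.split('--served-model-name "my model"'))
# ['--served-model-name', 'my model']
print(shlex.split('"--served-model-name "my model""'))
# ['--served-model-name my', 'model']
```

The last two lines show why escaping the inner quotes (`\"…\"`) is required when the model name can contain spaces.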
3. Pointer-valued `.Spec.Parallelism.*` fields in templates

Using Go pointer types directly in templates like:

```
{{- if .Spec.Parallelism.Tensor -}}--tensor-parallel-size {{ .Spec.Parallelism.Tensor }}{{- end }}
--data-parallel-size-local {{ or .Spec.Parallelism.DataLocal 1 }} \
```

will render pointer values (e.g., memory addresses) rather than integers unless dereferenced. Either add template helper functions to dereference these pointers or change the underlying API fields to non-pointer numeric types.

pkg/controller/v1alpha1/llmisvc/config_presets_test.go (1)
431-434: Remove duplicate `yaml.Unmarshal` call.

The same `yaml.Unmarshal(data, config)` is called twice consecutively with identical error handling. This appears to be an accidental duplication.

```diff
 func loadConfig(t *testing.T, data []byte, filePath string) *v1alpha1.LLMInferenceServiceConfig {
 	config := &v1alpha1.LLMInferenceServiceConfig{}
 	if err := yaml.Unmarshal(data, config); err != nil {
 		t.Errorf("Failed to unmarshal YAML from %s: %v", filePath, err)
 		return nil
 	}
-	if err := yaml.Unmarshal(data, config); err != nil {
-		t.Errorf("Failed to unmarshal YAML from %s: %v", filePath, err)
-		return nil
-	}
 	expectedGroupVersion := v1alpha1.LLMInferenceServiceConfigGVK.GroupVersion().String()
```

charts/llmisvc-resources/templates/config-llm-worker-data-parallel.yaml (1)
6-103: Consolidate duplicated template and worker configurations to reduce maintenance burden.

The `template` section (lines 6–103) and `worker` section (lines 104–183) contain nearly identical container configurations, with only minor differences:

- Line 25 (template only): `--api-server-count ${VLLM_API_SERVER_COUNT:-8}`
- Line 118 vs. 20: `START_RANK` calculation differs (worker uses `${LWS_WORKER_INDEX:-0}`)
- Line 134 (worker only): `--headless` flag

This duplicated structure creates a maintenance risk. If you need to refactor these configurations, consider using YAML anchors (`&` and `*`), shared templates, or a template engine feature to DRY up the code.

Example using YAML anchors (if supported by your tooling):

```yaml
_container_config: &container_config
  imagePullPolicy: IfNotPresent
  name: main
  ports:
    - containerPort: 8000
      protocol: TCP
  # ... shared fields ...
template:
  containers:
    - <<: *container_config
      # template-specific overrides
worker:
  containers:
    - <<: *container_config
      # worker-specific overrides
```

Also applies to: 104-183
♻️ Duplicate comments (5)
config/llmisvc/config-llm-template.yaml (2)
8-8: [Duplicate concern from charts template] Verify dev image and model path breaking change.

Same concerns as `charts/llmisvc-resources/templates/config-llm-template.yaml`: dev image validity, hardcoded `/mnt/models` vs. templated model name in args, and ephemeral model-cache storage.

Also applies to: 17-17

38-44: [Duplicate concern] Security: `readOnlyRootFilesystem: false` combined with `runAsNonRoot: true`.

Same concern as charts template: consider enabling `readOnlyRootFilesystem: true` with explicit writable volume mounts for improved security posture.

charts/llmisvc-resources/templates/config-llm-prefill-template.yaml (2)
9-9: [Consistent with main template] Verify dev image and model path breaking change.

Same pattern as main template (File 1): dev image `v0.2.2-dev`, hardcoded `/mnt/models` command path, and templated model name in args (line 21). Requires the same verification: dev image validity, intentionality of the hardcoded path, and model-cache persistence strategy.

Also applies to: 18-18

39-45: [Consistent with main template] Security: `readOnlyRootFilesystem: false` combined with `runAsNonRoot: true`.

Same security posture consideration as main template: evaluate enabling `readOnlyRootFilesystem: true` with explicit writable volume mounts.

config/llmisvc/config-llm-prefill-template.yaml (1)

9-9: [Duplicate] Same concerns as charts/llmisvc-resources/templates/config-llm-prefill-template.yaml.

This config file mirrors the charts template. Refer to earlier comments on dev image validity, model path breaking change, and security context configuration.

Also applies to: 18-18, 39-45
🧹 Nitpick comments (17)
python/lgb.Dockerfile (1)
28-33: Align storage build pattern with kserve and lgbserver.

The storage build uses `uv pip install . --no-cache` (line 33) while kserve (line 26) and lgbserver (line 40) both use `uv sync --active --no-cache` after copying their source directories. This inconsistency may indicate a refactoring oversight or an intentional difference that should be clarified.

For consistency with the established pattern in this Dockerfile, consider updating line 33 to use `uv sync` instead:

```diff
 COPY storage storage
-RUN cd storage && uv pip install . --no-cache
+RUN cd storage && uv sync --active --no-cache
```

If `uv pip install .` is intentional (e.g., to avoid re-locking dependencies), document the rationale.

pkg/testing/httproute_matchers.go (2)
17-29: Package name `testing` under `pkg/testing` can be confusing with stdlib `testing`.

Using `package testing` for helpers that live in non-`_test.go` code is easy to confuse with the standard library `testing` package when importing both. Consider renaming the package (e.g. `testutil` or `ksrvtesting`) or consistently aliasing this import in tests to avoid ambiguity.

31-51: `extractHTTPRoute` logic looks solid; consider clarifying the error text.

The type-switch coverage and nil-pointer handling look correct for `HTTPRoute`/`HTTPRouteSpec` (by value and pointer). The only nit is the default error string:

```go
return nil, fmt.Errorf("expected gatewayapi.HTTPRoute gatewayapi.HTTPRouteSpec, but got %T", actual)
```

You may want to tweak it to read more clearly, e.g.:

```diff
- return nil, fmt.Errorf("expected gatewayapi.HTTPRoute gatewayapi.HTTPRouteSpec, but got %T", actual)
+ return nil, fmt.Errorf("expected gatewayapi.HTTPRoute or gatewayapi.HTTPRouteSpec, but got %T", actual)
```

Purely cosmetic, but it makes debugging mis-typed matcher inputs easier.
.github/workflows/python-test.yml (1)
77-81: Verify storage test environment and coverage target.

Good to see storage tests wired into CI. A couple of things to double-check:

- Reusing the `python/kserve/.venv` venv is fine as long as it actually has the `kserve_storage` package and its dependencies installed (either from the in-repo `python/storage` project or from the pinned `kserve-storage` extra).
- Given the implementation lives under `kserve_storage`, confirm that `--cov=storage` is the intended coverage target; you may want `--cov=kserve_storage` (or similar) so coverage is reported against the real package.

python/storage/pyproject.toml (1)
15-16: Confirm huggingface-hub floor and kserve-storage version alignment.

Raising the `huggingface-hub[hf-transfer]` minimum and setting `kserve-storage` to `0.15.2` look reasonable, but please verify:

- That all consumers (especially HuggingFace server images) are compatible with `huggingface-hub>=0.32.0`.
- How this `0.15.2` version relates to the `kserve-storage==2` pin in `python/kserve/pyproject.toml`; if they are meant to represent the same package lineage, aligning or clearly differentiating these versions will avoid confusion.

Also applies to: 18-27
python/kserve/pyproject.toml (1)
58-66: Revisit kserve-storage pin and temporary uv index configuration.

The `storage` extra now pins `kserve-storage==2`, while the in-repo storage project is versioned `0.15.2`. If these are intended to represent the same package, consider aligning versions or using a clearer constraint so it's obvious which source (local vs published) is authoritative.

The new `[tool.uv]` block pointing to `https://test.pypi.org/simple/` with a TODO is fine for short-term testing, but please ensure this doesn't leak into production/official release workflows, and plan to switch back to the standard PyPI index (or a controlled internal index) once the temporary flow is no longer needed.

Also applies to: 97-101
.github/workflows/e2e-test.yml (1)
418-423: Good test isolation for explainer component.Separating explainer E2E tests into a dedicated step improves test organization and makes failures easier to diagnose. The lower parallelism count (1 vs 6) for explainer tests may be intentional if there are fewer test cases.
Consider verifying whether the explainer test suite would benefit from higher parallelism (count > 1) to reduce overall test execution time, especially as the test suite grows.
python/storage/kserve_storage/kserve_storage.py (1)
788-796: Archive extraction improvements with proper resource management.The use of context managers for both tar and zip extraction is good practice. The tar extraction includes a data filter for security.
Static analysis flagged the use of `tarfile.extractall()` (S202). While the `filter="data"` parameter mitigates path traversal risks in Python 3.12+, consider verifying that the minimum Python version requirement (3.9 per line 6) supports this filter. For older Python versions, you may need additional safeguards or path validation.

Apply this verification script to check Python version compatibility:

```shell
#!/bin/bash
# Check Python version requirements across storage-related files
rg -n "requires-python" python/storage/pyproject.toml python/*/pyproject.toml
```

python/huggingface_server.Dockerfile (1)
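Independent of that version check, the guarded extraction pattern discussed in the tarfile comment above can be sketched as follows. This is a hypothetical, self-contained example, not code from the PR: the in-memory archive and the `out/` directory are stand-ins, and the fallback member check is one possible safeguard for interpreters without filter support.

```python
import io
import tarfile

# Hedged sketch, not code from the PR: prefer the "data" extraction filter
# when the running interpreter provides it (added for Python 3.12 and
# backported to several older patch releases), otherwise validate member
# names by hand before extracting.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    payload = b"hello"
    info = tarfile.TarInfo(name="model/config.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
buf.seek(0)

with tarfile.open(fileobj=buf, mode="r") as tar:
    if hasattr(tarfile, "data_filter"):
        # Rejects absolute paths, parent-directory escapes, and special files.
        tar.extractall(path="out", filter="data")
    else:
        # Older interpreters: reject members that could escape the target dir.
        for member in tar.getmembers():
            if member.name.startswith("/") or ".." in member.name.split("/"):
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(path="out")

with open("out/model/config.txt") as f:
    print(f.read())  # hello
```

The `hasattr(tarfile, "data_filter")` probe is the documented way to detect whether the stdlib supports extraction filters at runtime.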
155-155: Verify ownership of copied storage directory.

Line 154 copies the kserve directory with `--chown=kserve:kserve`, but line 155 copies the storage directory without explicit ownership. Since the final image runs as user 1000 (kserve), consider adding `--chown=kserve:kserve` to ensure consistent ownership.

Apply this diff to ensure consistent ownership:

```diff
 COPY --from=build ${WORKSPACE_DIR}/kserve kserve
-COPY --from=build ${WORKSPACE_DIR}/storage storage
+COPY --from=build --chown=kserve:kserve ${WORKSPACE_DIR}/storage storage
 COPY --from=build ${WORKSPACE_DIR}/huggingfaceserver huggingfaceserver
```

config/llmisvc/config-llm-scheduler.yaml (1)
50-66: Validate new scheduler metrics flags and TLS verification posture.

The added args `--modelServerMetricsScheme`, `"https"`, and `--modelServerMetricsHttpsInsecureSkipVerify` look reasonable for enabling HTTPS metrics, but:

- Ensure the `llm-d-inference-scheduler:v0.2.0` binary actually supports these flags (otherwise the pod will fail to start).
- Double-check that disabling TLS verification for model-server metrics is acceptable in your threat model; if not, you may want a variant without `...HttpsInsecureSkipVerify` or make it configurable instead of hard-coded.

config/llmisvc/config-llm-decode-template.yaml (1)
75-91: Init routing-sidecar securityContext: consider adding a seccomp profile for parity.

You've locked down the init container with `allowPrivilegeEscalation: false`, `runAsNonRoot: true`, and `drop: [ALL]`, which is good. For consistency with the main container and the broader hardening effort, you may also want to add `seccompProfile: { type: RuntimeDefault }` here unless there's a specific reason not to.

charts/llmisvc-resources/templates/config-llm-template.yaml (1)
38-44: Security: `readOnlyRootFilesystem: false` combined with `runAsNonRoot: true`.

The container permits filesystem writes despite restricting privilege escalation. If the intent is to protect the root filesystem, consider enabling `readOnlyRootFilesystem: true` and explicitly mounting only writable volumes (e.g., emptyDir, configMap) at the specific paths required by the application.

pkg/controller/v1alpha1/llmisvc/lifecycle_crud.go (1)
157-160: Consider using `strings.TrimPrefix` for clarity.

`strings.Replace` with a count of 1 works, but `strings.TrimPrefix` better expresses the intent of removing a leading character.

```diff
 func logLineForObject(obj client.Object) string {
 	// Note: don't use `obj.GetObjectKind()` as it's always empty.
-	return strings.Replace(reflect.TypeOf(obj).String(), "*", "", 1)
+	return strings.TrimPrefix(reflect.TypeOf(obj).String(), "*")
 }
```

charts/llmisvc-resources/templates/config-llm-worker-data-parallel.yaml (3)
6-103: Add resource requests and limits for predictable cluster scheduling.

Neither the `template` nor `worker` containers define `resources.requests` or `resources.limits`. This omission can cause unpredictable cluster behavior, poor bin-packing, and potential node pressure. LLM inference workloads typically require substantial memory and GPU/compute resources.

Add appropriate resource specifications, for example:

```yaml
securityContext:
  # ... existing security context ...
resources:
  requests:
    memory: "8Gi"
    cpu: "2"
    nvidia.com/gpu: "1"  # if GPU is needed
  limits:
    memory: "16Gi"
    cpu: "4"
    nvidia.com/gpu: "1"
```

Adjust the values to match your actual workload requirements.

Also applies to: 104-183
49-60: Justify adding privileged capabilities despite hardening efforts.

The security context drops all capabilities (`drop: ALL`) but selectively adds `IPC_LOCK` and `SYS_RAWIO` (lines 54–56, 152–154). While this hardened approach is sound, these are privileged capabilities that deserve documentation. Add an inline comment explaining why they are required (e.g., "IPC_LOCK for memory pinning in vLLM inference, SYS_RAWIO for hardware acceleration").

Also applies to: 147-158
51-51: Clarify the `readOnlyRootFilesystem: false` setting.

Setting `readOnlyRootFilesystem: false` (lines 51, 149) somewhat undermines the security hardening achieved by dropping capabilities and enforcing `runAsNonRoot: true`. If the writable filesystem is necessary for:

- `/home` (emptyDir for temporary files),
- `/dev/shm` (shared memory),
- `/models` (model caching),

document this trade-off with an inline comment. Alternatively, verify that all necessary directories are indeed mounted as writable volumes so the root filesystem itself can remain read-only.

Also applies to: 149-149
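For illustration only, the inverse of that trade-off could look like the fragment below: a hypothetical sketch (not from the PR; the volume names and mount paths are assumptions based on the directories listed above) that keeps the root filesystem read-only and mounts just the writable paths.

```yaml
# Container-level security context (illustrative fragment).
securityContext:
  runAsNonRoot: true
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]
volumeMounts:
  - name: home
    mountPath: /home
  - name: shm
    mountPath: /dev/shm
  - name: models
    mountPath: /models
# Pod-level volumes backing the writable mounts (illustrative fragment).
volumes:
  - name: home
    emptyDir: {}
  - name: shm
    emptyDir:
      medium: Memory   # tmpfs-backed shared memory
  - name: models
    emptyDir: {}
```

With this shape, an accidental write anywhere outside the three mounted paths fails fast instead of silently mutating the container image layer.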
config/llmisvc/config-llm-prefill-worker-data-parallel.yaml (1)
1-184: Significant duplication with Helm chart template.

This file is nearly identical to `charts/llmisvc-resources/templates/config-llm-prefill-worker-data-parallel.yaml`. Having two separate copies of the same configuration increases maintenance burden and risk of configuration drift.

Consider consolidating these configurations or generating one from the other to maintain a single source of truth.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (15)

- `python/aiffairness/uv.lock` is excluded by `!**/*.lock`
- `python/artexplainer/uv.lock` is excluded by `!**/*.lock`
- `python/custom_model/uv.lock` is excluded by `!**/*.lock`
- `python/custom_tokenizer/uv.lock` is excluded by `!**/*.lock`
- `python/custom_transformer/uv.lock` is excluded by `!**/*.lock`
- `python/huggingfaceserver/uv.lock` is excluded by `!**/*.lock`
- `python/kserve/uv.lock` is excluded by `!**/*.lock`
- `python/lgbserver/uv.lock` is excluded by `!**/*.lock`
- `python/paddleserver/uv.lock` is excluded by `!**/*.lock`
- `python/pmmlserver/uv.lock` is excluded by `!**/*.lock`
- `python/sklearnserver/uv.lock` is excluded by `!**/*.lock`
- `python/storage/uv.lock` is excluded by `!**/*.lock`
- `python/test_resources/graph/error_404_isvc/uv.lock` is excluded by `!**/*.lock`
- `python/test_resources/graph/success_200_isvc/uv.lock` is excluded by `!**/*.lock`
- `python/xgbserver/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (74)

- `.flake8` (1 hunks)
- `.github/workflows/e2e-test.yml` (2 hunks)
- `.github/workflows/python-test.yml` (1 hunks)
- `.github/workflows/scheduled-image-scan.yml` (4 hunks)
- `charts/llmisvc-crd/templates/serving.kserve.io_llminferenceserviceconfigs.yaml` (1 hunks)
- `charts/llmisvc-crd/templates/serving.kserve.io_llminferenceservices.yaml` (1 hunks)
- `charts/llmisvc-resources/templates/config-llm-decode-template.yaml` (6 hunks)
- `charts/llmisvc-resources/templates/config-llm-decode-worker-data-parallel.yaml` (7 hunks)
- `charts/llmisvc-resources/templates/config-llm-prefill-template.yaml` (4 hunks)
- `charts/llmisvc-resources/templates/config-llm-prefill-worker-data-parallel.yaml` (4 hunks)
- `charts/llmisvc-resources/templates/config-llm-router-route.yaml` (1 hunks)
- `charts/llmisvc-resources/templates/config-llm-scheduler.yaml` (2 hunks)
- `charts/llmisvc-resources/templates/config-llm-template.yaml` (4 hunks)
- `charts/llmisvc-resources/templates/config-llm-worker-data-parallel.yaml` (4 hunks)
- `config/crd/full/serving.kserve.io_llminferenceserviceconfigs.yaml` (1 hunks)
- `config/crd/full/serving.kserve.io_llminferenceservices.yaml` (1 hunks)
- `config/llmisvc/config-llm-decode-template.yaml` (6 hunks)
- `config/llmisvc/config-llm-decode-worker-data-parallel.yaml` (7 hunks)
- `config/llmisvc/config-llm-prefill-template.yaml` (4 hunks)
- `config/llmisvc/config-llm-prefill-worker-data-parallel.yaml` (4 hunks)
- `config/llmisvc/config-llm-router-route.yaml` (1 hunks)
- `config/llmisvc/config-llm-scheduler.yaml` (2 hunks)
- `config/llmisvc/config-llm-template.yaml` (4 hunks)
- `config/llmisvc/config-llm-worker-data-parallel.yaml` (4 hunks)
- `config/llmisvc/kustomization.yaml` (1 hunks)
- `docs/samples/explanation/alibi/alibiexplainer/alibiexplainer/__main__.py` (1 hunks)
- `docs/samples/explanation/alibi/alibiexplainer/tests/test_anchor_images.py` (1 hunks)
- `docs/samples/explanation/alibi/alibiexplainer/tests/test_anchor_tabular.py` (1 hunks)
- `pkg/apis/serving/v1alpha1/llm_inference_service_defaults.go` (1 hunks)
- `pkg/apis/serving/v1alpha1/llm_inference_service_types.go` (1 hunks)
- `pkg/apis/serving/v1alpha1/zz_generated.deepcopy.go` (1 hunks)
- `pkg/controller/v1alpha1/llmisvc/config_merge.go` (11 hunks)
- `pkg/controller/v1alpha1/llmisvc/config_merge_test.go` (15 hunks)
- `pkg/controller/v1alpha1/llmisvc/config_presets_test.go` (8 hunks)
- `pkg/controller/v1alpha1/llmisvc/lifecycle_crud.go` (5 hunks)
- `pkg/openapi/openapi_generated.go` (0 hunks)
- `pkg/testing/httproute_matchers.go` (1 hunks)
- `pkg/testing/log.go` (1 hunks)
- `python/huggingface_server.Dockerfile` (2 hunks)
- `python/huggingface_server_cpu.Dockerfile` (2 hunks)
- `python/huggingfaceserver/huggingfaceserver/__main__.py` (1 hunks)
- `python/huggingfaceserver/pyproject.toml` (1 hunks)
- `python/kserve/Makefile` (1 hunks)
- `python/kserve/kserve/storage/__init__.py` (0 hunks)
- `python/kserve/kserve/storage/storage.py` (0 hunks)
- `python/kserve/kserve/storage/test/test_azure_storage.py` (0 hunks)
- `python/kserve/kserve/storage/test/test_gcs_storage.py` (0 hunks)
- `python/kserve/kserve/storage/test/test_hf_storage.py` (0 hunks)
- `python/kserve/kserve/storage/test/test_s3_storage.py` (0 hunks)
- `python/kserve/kserve/storage/test/test_storage.py` (0 hunks)
- `python/kserve/pyproject.toml` (2 hunks)
- `python/lgb.Dockerfile` (2 hunks)
- `python/lgbserver/lgbserver/model.py` (1 hunks)
- `python/lgbserver/pyproject.toml` (1 hunks)
- `python/paddle.Dockerfile` (2 hunks)
- `python/paddleserver/paddleserver/model.py` (1 hunks)
- `python/paddleserver/pyproject.toml` (1 hunks)
- `python/pmml.Dockerfile` (2 hunks)
- `python/pmmlserver/pmmlserver/model.py` (1 hunks)
- `python/pmmlserver/pyproject.toml` (1 hunks)
- `python/sklearn.Dockerfile` (2 hunks)
- `python/sklearnserver/pyproject.toml` (1 hunks)
- `python/sklearnserver/sklearnserver/model.py` (1 hunks)
- `python/storage-initializer.Dockerfile` (2 hunks)
- `python/storage-initializer/scripts/initializer-entrypoint` (1 hunks)
- `python/storage/kserve_storage/kserve_storage.py` (4 hunks)
- `python/storage/pyproject.toml` (2 hunks)
- `python/storage/test/test_gcs_storage.py` (2 hunks)
- `python/xgb.Dockerfile` (2 hunks)
- `python/xgbserver/pyproject.toml` (1 hunks)
- `python/xgbserver/xgbserver/model.py` (1 hunks)
- `test/crds/serving.kserve.io_all_crds.yaml` (2 hunks)
- `test/e2e/explainer/test_art_explainer.py` (1 hunks)
- `test/e2e/graph/test_inference_graph.py` (1 hunks)
💤 Files with no reviewable changes (8)
- pkg/openapi/openapi_generated.go
- python/kserve/kserve/storage/test/test_hf_storage.py
- python/kserve/kserve/storage/test/test_storage.py
- python/kserve/kserve/storage/storage.py
- python/kserve/kserve/storage/test/test_azure_storage.py
- python/kserve/kserve/storage/__init__.py
- python/kserve/kserve/storage/test/test_gcs_storage.py
- python/kserve/kserve/storage/test/test_s3_storage.py
🧰 Additional context used
🧬 Code graph analysis (14)
docs/samples/explanation/alibi/alibiexplainer/tests/test_anchor_images.py (1)
python/storage/kserve_storage/kserve_storage.py (1)
Storage(59-798)
python/sklearnserver/sklearnserver/model.py (1)
python/storage/kserve_storage/kserve_storage.py (1)
Storage(59-798)
python/xgbserver/xgbserver/model.py (1)
python/storage/kserve_storage/kserve_storage.py (1)
Storage(59-798)
pkg/apis/serving/v1alpha1/zz_generated.deepcopy.go (1)
pkg/apis/serving/v1alpha1/llm_inference_service_types.go (1)
LLMModelSpec(118-137)
python/huggingfaceserver/huggingfaceserver/__main__.py (1)
python/storage/kserve_storage/kserve_storage.py (1)
Storage(59-798)
pkg/apis/serving/v1alpha1/llm_inference_service_defaults.go (1)
pkg/apis/serving/v1alpha1/llm_inference_service_types.go (3)
LLMInferenceService(40-46)LLMInferenceServiceSpec(60-89)WorkloadSpec(92-115)
pkg/testing/httproute_matchers.go (2)
pkg/controller/v1alpha1/llmisvc/fixture/gwapi_builders.go (2)
HTTPRoute(99-118)BackendRefs(308-310)pkg/apis/serving/v1alpha1/llm_inference_service_types.go (1)
HTTPRouteSpec(184-194)
docs/samples/explanation/alibi/alibiexplainer/alibiexplainer/__main__.py (1)
python/storage/kserve_storage/kserve_storage.py (1)
Storage(59-798)
python/paddleserver/paddleserver/model.py (1)
python/storage/kserve_storage/kserve_storage.py (1)
Storage(59-798)
python/pmmlserver/pmmlserver/model.py (1)
python/storage/kserve_storage/kserve_storage.py (1)
Storage(59-798)
docs/samples/explanation/alibi/alibiexplainer/tests/test_anchor_tabular.py (1)
python/storage/kserve_storage/kserve_storage.py (1)
Storage(59-798)
pkg/controller/v1alpha1/llmisvc/config_merge.go (2)
pkg/apis/serving/v1alpha1/llm_inference_service_types.go (3)
LLMInferenceServiceSpec(60-89)LLMInferenceService(40-46)SchedulerSpec(222-231)pkg/controller/v1alpha1/llmisvc/fixture/gwapi_builders.go (1)
BackendRefs(308-310)
pkg/controller/v1alpha1/llmisvc/config_presets_test.go (1)
pkg/apis/serving/v1beta1/podspec.go (1)
PodSpec(28-339)
pkg/controller/v1alpha1/llmisvc/config_merge_test.go (5)
pkg/apis/serving/v1alpha1/llm_inference_service_types.go (6)
LLMModelSpec(118-137)LLMInferenceServiceSpec(60-89)WorkloadSpec(92-115)LoRASpec(140-148)LLMInferenceServiceConfig(52-57)LLMInferenceService(40-46)pkg/apis/serving/v1beta1/podspec.go (1)
PodSpec(28-339)pkg/testing/log.go (1)
NewTestLogger(37-39)pkg/controller/v1alpha1/llmisvc/config_merge.go (2)
MergeSpecs(283-298)ReplaceVariables(234-262)pkg/testing/httproute_matchers.go (1)
HaveGatewayRefs(54-58)
🪛 Ruff (0.14.5)
python/storage/kserve_storage/kserve_storage.py
792-792: Uses of tarfile.extractall()
(S202)
794-796: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: precommit-check
- GitHub Check: Test
- GitHub Check: build (3.11)
- GitHub Check: build (3.10)
- GitHub Check: build (3.12)
- GitHub Check: build (3.9)
- GitHub Check: test
```yaml
- storageUri
type: object
type: array
x-kubernetes-preserve-unknown-fields: true
```
🧩 Analysis chain
Script executed:

```shell
# First, let's examine the CRD file around line 49 to see the full context
cat -n charts/llmisvc-crd/templates/serving.kserve.io_llminferenceserviceconfigs.yaml | sed -n '40,60p'
```
<shell_result>
Repository: opendatahub-io/kserve
Script execution failed with unrecoverable error: [CAPABILITY:RUN_SHELL_SCRIPT] Failed to clone repository opendatahub-io/kserve: Clone operation failed
</shell_result>
kserve llminferenceservice preserveUnknownFields adapters schema
<web_search_result>
Short answer: set the CRD schema to preserve unknown fields using the OpenAPI vendor extension (x-kubernetes-preserve-unknown-fields: true) on the appropriate openAPIV3Schema node that covers your adapters/LoRA/adapter-like subobject. The legacy CRD field preserveUnknownFields is deprecated — prefer x-kubernetes-preserve-unknown-fields in spec.versions[*].schema.openAPIV3Schema.
Why/how (concise):
- KServe exposes LLMInferenceService as a CRD (so schema/pruning is controlled by the CRD’s openAPIV3Schema). [1]
- Kubernetes supports x-kubernetes-preserve-unknown-fields in openAPIV3Schema to prevent pruning of arbitrary/unknown fields (and notes preserveUnknownFields is deprecated in favor of that extension). Use the extension at the root or the nested property that must remain schemaless. [2][3]
Example (conceptual): inside the CRD’s spec.versions[].schema.openAPIV3Schema place:
type: object
x-kubernetes-preserve-unknown-fields: true
(or add the extension under the specific property for adapters)
References:
- KServe LLMInferenceService CRD docs. [1]
- Kubernetes blog/docs on structural schemas and x-kubernetes-preserve-unknown-fields. [2]
- Kubernetes API/CRD field deprecation note (preserveUnknownFields → x-kubernetes-preserve-unknown-fields). [3]
</web_search_result>
opendatahub kserve CRD serving.kserve.io llminferenceserviceconfigs
<web_search_result>
Quick summary — what it is and where to find it
-
LLMInferenceServiceConfig is a KServe CRD (API group serving.kserve.io, v1alpha1) that provides reusable/template configuration for LLMInferenceService resources (i.e., base settings a concrete LLMInferenceService can inherit). It contains fields such as model spec (URI, LoRA adapters, storage), replicas, parallelism (tensor/pipeline), pod templates (template/worker) for single- or multi-node/disaggregated deployments. [1]
-
OpenDataHub bundles KServe for single-model serving (vLLM/etc.) and uses these KServe CRDs (ServingRuntime, InferenceService, and the newer LLMInferenceService/LLMInferenceServiceConfig) as part of its model-serving platform and documentation. See the OpenDataHub serving and deployment docs for examples and how KServe is installed/used within OpenDataHub. [2][3]
Useful links (reference)
- KServe CRD API / Control Plane docs (LLMInferenceService and LLMInferenceServiceConfig schema). [1]
- OpenDataHub: Serving models (KServe integration, runtimes, InferenceService examples). [2]
- OpenDataHub: Deploying / configuring model-serving platform (installation and runtime notes). [3]
If you want, I can:
- show an example YAML for an LLMInferenceServiceConfig (based on KServe docs), or
- show how OpenDataHub deploys a vLLM ServingRuntime and references such configs.
</web_search_result>
kserve LLMInferenceService adapters framework memory storageUri documentation
<web_search_result>
Summary of relevant KServe docs and notes about LLMInferenceService adapters, memory, and storageUri
- LLMInferenceService docs (overview + tutorial) — describes the LLMInferenceService CRD, how it integrates llm-d, templates/adapters via spec.template (containers, resources including memory), and router/gateway/scheduler configuration. An example shows memory in container resource requests/limits. [1][2]
- storageUri / URI handling — KServe supports model artifacts via storage URIs (http/https, s3, gcs, hf://, etc.) and documents using storageUri with secrets for headers; the storage initializer uses this to download model artifacts. Example and usage notes are in the URI docs. [3]
- Multiple storageUri / adapters (LoRA) — KServe currently supports a single storageUri per model; there is an open feature request to support multiple storageUri values (for use cases like a base model + LoRA adapters coming from different buckets). That issue is tracked on GitHub (Support multiple StorageUri in Inference Service kserve/kserve#3413). [4]
Practical implications
- To set memory for LLM pods, specify resource.requests/limits.memory under spec.template.containers in LLMInferenceService (as shown in the LLM examples). [1][2]
- For adapter/LoRA workflows where base model and adapters live separately, you’ll need a workaround (e.g., build a single artifact combining base+adapter or use init containers/custom storage init logic) until multi-storageUri support is implemented; follow or contribute to the GitHub issue for progress. [3][4]
Links (docs referenced)
- LLMInferenceService overview & examples. [1][2]
- storageUri / URI docs (storage initializer usage). [3]
- GitHub issue tracking multiple storageUri feature. [4]
If you want, I can:
- show an example LLMInferenceService YAML highlighting memory and storageUri fields, or
- suggest a concrete workaround (initContainer, sidecar, or fused artifact) for using separate adapter files with a base model.
</web_search_result>
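One shape the initContainer workaround mentioned above could take is sketched below. This is a sketch only: the bucket paths, resource name, volume name, and the aws-cli image/arguments are illustrative assumptions, and the fragment has not been validated against the current InferenceService CRD.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-with-adapter              # hypothetical name
spec:
  predictor:
    initContainers:
      - name: fetch-adapter           # pulls LoRA weights from a second bucket
        image: amazon/aws-cli:2.15.30 # assumed image
        args: ["s3", "sync", "s3://adapters-bucket/my-lora", "/mnt/adapters/my-lora"]
        volumeMounts:
          - name: adapter-store
            mountPath: /mnt/adapters
    volumes:
      - name: adapter-store
        emptyDir: {}
    model:
      modelFormat:
        name: vllm
      # Single storageUri (base model), handled by the storage initializer.
      storageUri: s3://models-bucket/base-llm
```

The serving container would additionally need to mount `adapter-store` and be told where to find the adapter; how that is wired depends on the runtime in use.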
site:github.com opendatahub kserve llminferenceservice adapter schema
<web_search_result>
I searched GitHub for KServe / OpenDataHub references to LLMInferenceService and adapter/schema. Relevant results:
- KServe main repo (contains InferenceService / LLM work and CRD issues). [1]
- Issue proposing/supporting OpenAI schema for KServe LLM runtime (generate/OpenAI-compatible API). [2]
- KServe milestone and recent LLM-related issues (DistributedInferenceService / LLMInferenceService work). [3]
- OpenDataHub operator releases that reference bundled KServe/ModelMesh runtime-adapter versions. [4]
References
[1] https://github.com/kserve/kserve
[2] kserve#3419
[3] kserve#4433
[4] https://github.com/opendatahub-io/opendatahub-operator/releases
If you want, I can open specific files (CRD schemas, adapter code, or the LLMInferenceService API) and list the exact schema fields or file paths. Which would you like me to fetch?
</web_search_result>
"x-kubernetes-preserve-unknown-fields" CRD adapter schema benefits use cases
<web_search_result>
Summary — what x-kubernetes-preserve-unknown-fields does
- It's an OpenAPI vendor extension you place on an object-schema (or nested properties/additionalProperties) to tell the API server: do NOT prune/strip fields that are not described in the structural schema for that node. [1][2]
Benefits
- Allows CRDs to carry arbitrary or opaque JSON blobs (runtime.RawExtension–like data) inside a structural schema. [1]
- Enables backward/forward compatibility when controllers/operators may add new fields that not-yet-present clients or CRD schemas would otherwise prune. [1][3]
- Lets you combine structural validation for most of a resource while permitting “escape hatches” for extensible subtrees (plugins, vendor extensions, unstructured data). [1][2]
Common use cases
- Fields that must accept arbitrary JSON (e.g., embedded 3rd‑party resource specs, plugin configuration, policy blobs). [1]
- Migrating legacy non-structural CRDs to structural schemas: mark parts with the extension so existing stored unknown fields survive conversion. [1][3]
- Controllers that persist dynamic/unknown status or spec keys (adapters, operators that store discovered state). [2][3]
- Implementing an “embedded resource” pattern (apiVersion/kind/metadata plus opaque body): use together with x-kubernetes-embedded-resource or preserve unknowns inside that subtree. [1]
Caveats / trade-offs
- Using it weakens pruning/strict structural guarantees for that subtree: you lose automated pruning and some client/server validation assumptions. [1][2]
- Overuse can reintroduce the problems structural schemas were meant to solve (inconsistent shapes, harder API evolution/testing). Prefer narrow, well-documented escape-hatches. [1][3]
- For v1 CRDs you still must provide a structural schema overall; preserveUnknownFields at the CRD level is deprecated—prefer the vendor extension on specific schema nodes. [1][3]
References
- Kubernetes CRD structural schema and vendor extensions (x-kubernetes-preserve-unknown-fields). [1]
- Kubernetes API concepts: unknown-field pruning behavior. [2]
- CRD docs: pruning vs preserving unknown fields and migration guidance. [3]
</web_search_result>
Document the schema flexibility and scope of adapter configuration.
This change uses x-kubernetes-preserve-unknown-fields: true to support extensible adapter schemas—a standard Kubernetes pattern for plugin and custom configuration support. However, the PR lacks documentation of the intent and impact.
Before merging, add to the PR description or a code comment:
- Why adapter schema is being made flexible (e.g., support for custom/vendor adapter types)
- Whether `framework`, `memory`, and `storageUri` are still documented and validated elsewhere (tests, runtime, controller logic)
- Any migration guidance for users expecting strict schema validation
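Concretely, the narrow escape-hatch pattern this change relies on looks like the following in a CRD's structural schema. The field names here are illustrative, not the actual KServe adapter schema:

```yaml
# Fragment of a CustomResourceDefinition validation schema (illustrative).
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        replicas:
          type: integer               # strictly validated as usual
        adapters:
          type: object
          # Escape hatch: unknown keys under spec.adapters survive API-server
          # pruning, so custom/vendor adapter fields need no schema change.
          x-kubernetes-preserve-unknown-fields: true
```

Scoping the extension to the `adapters` subtree keeps strict validation for the rest of the spec while allowing the flexible adapter configuration.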
- --modelServerMetricsScheme
- "https"
- --modelServerMetricsHttpsInsecureSkipVerify
🧩 Analysis chain
Script executed:
#!/bin/bash
# First, locate and examine the file in question
find . -name "config-llm-scheduler.yaml" -type f | head -20
<shell_result>
Repository: opendatahub-io/kserve
Script execution failed with unrecoverable error: [CAPABILITY:RUN_SHELL_SCRIPT] Failed to clone repository opendatahub-io/kserve: Clone operation failed
</shell_result>
KServe modelServerMetricsHttpsInsecureSkipVerify TLS certificate verification
<web_search_result>
Short answer: That flag (modelServerMetricsHttpsInsecureSkipVerify) — when present in the inferenceservice-config/prometheus-related configuration — tells KServe to skip TLS certificate verification when contacting model servers’ /metrics endpoints over HTTPS. Enabling it disables hostname/CA checks for those metric scrapes (insecure), so the scrapers will accept self-signed/invalid certs. Use with caution; prefer providing a CA bundle (caBundle / caCertFile) instead or properly signed certs. See KServe docs on skipping TLS vs supplying CA bundles and the cluster-wide configmap settings. [1][2]
Sources
- KServe: Configure CA bundle / skip SSL verification for components (describes using CA bundles and the skip-verify option). [1]
- KServe: global inferenceservice-config documentation (shows cluster-wide config and examples for TLS-skip options such as tlsSkipVerify). [2]
</web_search_result>
InsecureSkipVerify production Kubernetes TLS certificate verification best practices security
<web_search_result>
Short answer: Never use InsecureSkipVerify (or equivalent “skip TLS verification” flags) in production. It disables server certificate validation and allows trivial man‑in‑the‑middle, impersonation, and audit‑bypass attacks. (kubernetes.io)
Recommended production best practices
- Use properly issued TLS certificates (public CA for externally reachable endpoints; internal PKI for private clusters). Distribute the issuing CA to clients instead of skipping verification. (kubernetes.io)
- Enforce mutual TLS (mTLS) where appropriate so both client and server verify identities. (kubernetes.io)
- Pin or validate expected server names and certificate chains (hostname verification + verify chain to a trusted CA). Don’t rely on raw certificate bytes without chain validation. (reddit.com)
- Use Kubernetes-native TLS/secret management (cert-manager, in‑cluster CA, or your PKI) and automate certificate issuance/rotation. Validate certs through CI/CD and policy checks. (kubernetes.io)
- Restrict network access to control‑plane endpoints (network policies, firewall rules, private API endpoints) so verified TLS is combined with network controls. (kubernetes.io)
- Audit, logging and alerting: enable API server auditing and monitor for clients using insecure flags or unexpected certificate errors. (kubernetes.io)
- Fail builds/helm charts or block deploys that include insecure flags (InsecureSkipVerify, skip-tls-verify, tls.insecureSkipVerify) via policy engines (e.g., OPA/Gatekeeper). (gateway.envoyproxy.io)
If you think you must temporarily skip verification (development/testing only)
- Limit scope: only in dev/test namespaces or environments isolated from production. (gateway.envoyproxy.io)
- Compensate with strict network isolation and short lifetimes for test credentials. Log and track any use. (hatchjs.com)
- Prefer implementing custom VerifyPeerCertificate that re‑implements validation correctly (advanced, risky — only if you fully understand certificate verification and supply required trust anchors). (reddit.com)
Operational checklist to eliminate InsecureSkipVerify from production
- Inventory code/configs for insecure flags (grep CI, images, helm charts, manifests).
- Replace with trusted CA bundles or mTLS configurations.
- Automate cert issuance/rotation (cert-manager or PKI).
- Add policy checks to CI/CD to block insecure settings.
- Harden network access and enable API auditing/alerting. (kubernetes.io)
If you want, I can:
- scan a repo/manifest set for insecure TLS flags, or
- give concrete kubectl/helm/cert-manager examples to replace InsecureSkipVerify.
Citations:
- 1: https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/?utm_source=openai
- 2: https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/?utm_source=openai
- 3: https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/?utm_source=openai
- 4: https://www.reddit.com/r/golang/comments/17c3uwx?utm_source=openai
- 5: https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/?utm_source=openai
- 6: https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/?utm_source=openai
- 7: https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/?utm_source=openai
- 8: https://gateway.envoyproxy.io/latest/tasks/security/backend-skip-tls-verification/?utm_source=openai
- 9: https://gateway.envoyproxy.io/latest/tasks/security/backend-skip-tls-verification/?utm_source=openai
- 10: https://hatchjs.com/kubectl-skip-tls-verify/?utm_source=openai
- 11: https://www.reddit.com/r/golang/comments/17c3uwx?utm_source=openai
- 12: https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/?utm_source=openai
</web_search_result>
Based on my verification, the security concern in the original review comment is legitimate and accurate. Industry best practices (Kubernetes.io, KServe documentation, and security standards) explicitly recommend against using InsecureSkipVerify-style flags in production, as they disable certificate validation and enable man-in-the-middle attacks.
Here is the rewritten review comment:
Remove or conditionally scope --modelServerMetricsHttpsInsecureSkipVerify for production.
The scheduler disables TLS certificate verification for model server metrics over HTTPS. This creates a security vulnerability in production by allowing trivial man-in-the-middle attacks. Kubernetes and KServe best practices explicitly recommend against this.
Recommended fixes:
- Provide a CA bundle via KServe's `caBundle`/`caCertFile` configuration instead, or
- Use properly issued TLS certificates and enable verification, or
- If this flag must remain (development/testing only), restrict it to dev/test namespaces via Helm values and document the exception.
- --modelServerMetricsScheme
- "https"
- --modelServerMetricsHttpsInsecureSkipVerify
Reference: KServe TLS configuration documentation.
🤖 Prompt for AI Agents
charts/llmisvc-resources/templates/config-llm-scheduler.yaml lines 62-64: the
template currently passes --modelServerMetricsHttpsInsecureSkipVerify which
disables TLS certificate verification; remove this flag from the production
template and instead (a) configure and mount a CA bundle or set the KServe
caCertFile/caBundle so metric scraping uses valid certs, or (b) require properly
issued TLS certificates on the model server so verification succeeds, and if you
must keep the flag for development only, add a Helm value (e.g.,
metrics.insecureSkipVerify=false by default) and conditionalize the flag in the
template so it is only rendered for non-production deployments with clear
documentation.
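A minimal sketch of the conditional rendering described in the prompt above — the `metrics.insecureSkipVerify` value name is an assumption for illustration, not an existing chart value:

```yaml
# charts/llmisvc-resources/templates/config-llm-scheduler.yaml (fragment, sketch)
args:
  - --modelServerMetricsScheme
  - "https"
  {{- if .Values.metrics.insecureSkipVerify }}
  # Dev/test only: disables TLS certificate verification for metrics scraping.
  - --modelServerMetricsHttpsInsecureSkipVerify
  {{- end }}
```

With `metrics.insecureSkipVerify: false` as the chart default, the insecure flag is never rendered unless a deployment explicitly opts in.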
  template:
    containers:
-     - image: ghcr.io/llm-d/llm-d:v0.2.0
+     - image: ghcr.io/llm-d/llm-d-dev:v0.2.2
🧩 Analysis chain
Script executed:
#!/bin/bash
# Find all image references in the repository to understand versioning pattern
rg "ghcr\.io/llm-d" -n -C 2
<shell_result>
Repository: opendatahub-io/kserve
Script execution failed with unrecoverable error: [CAPABILITY:RUN_SHELL_SCRIPT] Failed to clone repository opendatahub-io/kserve: Clone operation failed
</shell_result>
Script executed:
#!/bin/bash
# Check git history of this specific file to see previous image versions
cd "$(git rev-parse --show-toplevel)" 2>/dev/null || pwd
git log -p --follow -S "ghcr.io/llm-d" -- "charts/llmisvc-resources/templates/config-llm-worker-data-parallel.yaml" | head -200
<shell_result>
Repository: opendatahub-io/kserve
Script execution failed with unrecoverable error: [CAPABILITY:RUN_SHELL_SCRIPT] Failed to clone repository opendatahub-io/kserve: Clone operation failed
</shell_result>
Script executed:
#!/bin/bash
# Search for any documentation or configuration about image versions
rg -i "(image|version|dev|release)" charts/llmisvc-resources/ -A 2 -B 2 | head -100
<shell_result>
Repository: opendatahub-io/kserve
Script execution failed with unrecoverable error: [CAPABILITY:RUN_SHELL_SCRIPT] Failed to clone repository opendatahub-io/kserve: Clone operation failed
</shell_result>
Script executed:
#!/bin/bash
# Look for values.yaml or default configurations that might specify image versions
find charts/llmisvc-resources -name "values.yaml" -o -name "Chart.yaml" -o -name "README*" | xargs cat 2>/dev/null
<shell_result>
Repository: opendatahub-io/kserve
Script execution failed with unrecoverable error: [CAPABILITY:RUN_SHELL_SCRIPT] Failed to clone repository opendatahub-io/kserve: Clone operation failed
</shell_result>
ghcr.io llm-d llm-d-dev image versions releases
<web_search_result>
Summary — llm-d images and releases (GHCR)
- Official project release: llm-d v0.3.0 (release date Oct 10, 2025). Component images are published under ghcr.io/llm-d (examples: ghcr.io/llm-d/llm-d-cuda:v0.3.0, ghcr.io/llm-d/llm-d-gke:v0.3.0). [1]
- Older published image tag example: ghcr.io/llm-d/llm-d:v0.2.0 (image metadata / digest available). [2]
- Many subprojects use a "dev" image tag for local/dev builds (e.g., ghcr.io/llm-d/llm-d-inference-sim:dev), and documentation notes replacing :dev with the current release tag for stable usage. [3]
If you want, I can:
- list all GHCR image tags for a specific llm-d repository (requires querying the registry), or
- fetch the full release changelog and per-repo image tags. Which would you prefer?
Sources:
[1] llm-d GitHub Releases (v0.3.0).
[2] ghcr.io/llm-d/llm-d:v0.2.0 image metadata mirror.
[3] llm-d/llm-d-inference-sim README (notes about :dev and release tags).
</web_search_result>
Use a stable release tag instead of the -dev variant, or confirm this is intentional staging.
The web search confirms that the llm-d project documentation explicitly recommends replacing :dev image tags with current release tags for stable deployments. The -dev variants are designated for local/development builds.
Additionally, the project has published a v0.3.0 stable release (Oct 2025), which is newer than the v0.2.2 variant referenced here. Using ghcr.io/llm-d/llm-d-dev:v0.2.2 in a production Kubernetes template deviates from project best practices and introduces risk of using development-stage code.
Recommended action: Switch to a stable release tag (e.g., ghcr.io/llm-d/llm-d:v0.2.2 or consider upgrading to v0.3.0), or explicitly document this as a staging/testing deployment.
Also applies to: 106-106
{
    name: "merge pod spec, add container",
    cfgs: []v1alpha1.LLMInferenceServiceSpec{
Duplicate test case name.
Test case at line 1251 has the same name "merge pod spec, add container" as the test case at line 1190. This can cause confusion when debugging test failures and may mask issues if only one is failing.
{
- name: "merge pod spec, add container",
+ name: "merge pod spec, add container with empty override",
    cfgs: []v1alpha1.LLMInferenceServiceSpec{
🤖 Prompt for AI Agents
In pkg/controller/v1alpha1/llmisvc/config_merge_test.go around lines 1250-1252
(the test case starting at line 1251), there is a duplicate test case name
"merge pod spec, add container" which is identical to the one at line 1190;
rename this test case to a unique, descriptive name (e.g., "merge pod spec, add
container - new container" or "merge pod spec, add container (with
initContainer)") so t.Run outputs are distinct, and update any related
expectations or references that rely on the test name.
COPY --from=builder --chown=kserve:kserve third_party third_party
COPY --from=builder --chown=kserve:kserve $VIRTUAL_ENV $VIRTUAL_ENV
COPY --from=builder kserve kserve
COPY --from=builder storage storage
COPY --from=builder paddleserver paddleserver
Inconsistent --chown flags on copied directories.
Lines 67-68 apply --chown=kserve:kserve to third_party and $VIRTUAL_ENV, but lines 69-70 omit the flag for kserve and storage. Since the container runs as non-root user kserve (line 65), all directories should consistently use --chown=kserve:kserve to avoid permission issues at runtime.
Apply this diff to add missing --chown flags:
COPY --from=builder --chown=kserve:kserve third_party third_party
COPY --from=builder --chown=kserve:kserve $VIRTUAL_ENV $VIRTUAL_ENV
-COPY --from=builder kserve kserve
-COPY --from=builder storage storage
+COPY --from=builder --chown=kserve:kserve kserve kserve
+COPY --from=builder --chown=kserve:kserve storage storage
🤖 Prompt for AI Agents
In python/paddle.Dockerfile around lines 67 to 71, the COPY lines for kserve,
storage, and paddleserver omit the --chown=kserve:kserve flag while other copies
set it, causing inconsistent ownership for a container that runs as non-root
kserve; update the COPY commands for kserve, storage, and paddleserver to
include --chown=kserve:kserve so all copied directories/files are owned by
kserve:kserve and avoid runtime permission issues.
COPY --from=builder --chown=kserve:kserve third_party third_party
COPY --from=builder --chown=kserve:kserve $VIRTUAL_ENV $VIRTUAL_ENV
COPY --from=builder kserve kserve
COPY --from=builder storage storage
Inconsistent --chown flags on copied directories.
Line 78-79 applies --chown=kserve:kserve to third_party and $VIRTUAL_ENV, but lines 80-81 omit the flag for kserve and storage. Since the container runs as non-root user kserve (line 84), all directories should consistently use --chown=kserve:kserve to avoid permission issues at runtime.
Apply this diff to add missing --chown flags:
COPY --from=builder --chown=kserve:kserve third_party third_party
COPY --from=builder --chown=kserve:kserve $VIRTUAL_ENV $VIRTUAL_ENV
-COPY --from=builder kserve kserve
-COPY --from=builder storage storage
+COPY --from=builder --chown=kserve:kserve kserve kserve
+COPY --from=builder --chown=kserve:kserve storage storage
🤖 Prompt for AI Agents
In python/pmml.Dockerfile around lines 78 to 81, the COPY instructions are
inconsistent: third_party and $VIRTUAL_ENV use --chown=kserve:kserve but kserve
and storage do not, which can cause permission errors when running as non-root
user kserve; update the two COPY lines for kserve and storage to include
--chown=kserve:kserve so all copied directories are owned by kserve:kserve at
build time.
COPY --from=builder --chown=kserve:kserve third_party third_party
COPY --from=builder --chown=kserve:kserve $VIRTUAL_ENV $VIRTUAL_ENV
COPY --from=builder kserve kserve
COPY --from=builder storage storage
COPY --from=builder sklearnserver sklearnserver
Inconsistent --chown flags on copied directories.
Lines 62-63 apply --chown=kserve:kserve to third_party and $VIRTUAL_ENV, but lines 64-65 omit the flag for kserve and storage. Since the container runs as non-root user kserve (line 60), all directories should consistently use --chown=kserve:kserve to avoid permission issues at runtime.
Apply this diff to add missing --chown flags:
COPY --from=builder --chown=kserve:kserve third_party third_party
COPY --from=builder --chown=kserve:kserve $VIRTUAL_ENV $VIRTUAL_ENV
-COPY --from=builder kserve kserve
-COPY --from=builder storage storage
+COPY --from=builder --chown=kserve:kserve kserve kserve
+COPY --from=builder --chown=kserve:kserve storage storage
🤖 Prompt for AI Agents
In python/sklearn.Dockerfile around lines 62-66, several COPY lines lack the
--chown=kserve:kserve flag causing inconsistent ownership; update the COPY
commands that omit --chown (COPY --from=builder kserve kserve, COPY
--from=builder storage storage, and COPY --from=builder sklearnserver
sklearnserver) to include --chown=kserve:kserve so all copied directories are
owned by the non-root kserve user.
COPY --from=builder --chown=kserve:kserve third_party third_party
COPY --from=builder --chown=kserve:kserve $VIRTUAL_ENV $VIRTUAL_ENV
COPY --from=builder kserve kserve
COPY --from=builder storage storage
Add missing --chown flag for consistency.
Line 68 is missing the --chown=kserve:kserve flag that's applied to other copied artifacts (lines 66–67). Since the kserve user is set as the active user at line 77, the storage directory should have consistent ownership to avoid potential permission issues at runtime.
-COPY --from=builder storage storage
+COPY --from=builder --chown=kserve:kserve storage storage
🤖 Prompt for AI Agents
In python/storage-initializer.Dockerfile around line 68, the COPY of the storage
directory from the builder stage lacks the --chown=kserve:kserve flag used for
other artifacts; update that COPY instruction to include --chown=kserve:kserve
so the storage directory is owned by the kserve user (matching lines 66–67) to
avoid permission issues when USER kserve is set later.
COPY storage/pyproject.toml storage/uv.lock storage/
RUN cd storage && uv sync --active --no-cache

COPY storage storage
Remove trailing whitespace.
Line 31 has unnecessary trailing whitespace after the COPY instruction.
Apply this diff:
-COPY storage storage
+COPY storage storage
🤖 Prompt for AI Agents
In python/xgb.Dockerfile around line 31, the COPY instruction has trailing
whitespace; remove the extra space at the end of the line so it reads without
any trailing whitespace (ensure the line ends immediately after "COPY storage
storage").
COPY --from=builder --chown=kserve:kserve third_party third_party
COPY --from=builder --chown=kserve:kserve $VIRTUAL_ENV $VIRTUAL_ENV
COPY --from=builder kserve kserve
COPY --from=builder storage storage
COPY --from=builder xgbserver xgbserver
Inconsistent --chown flags on copied directories.
Lines 63-64 apply --chown=kserve:kserve to third_party and $VIRTUAL_ENV, but lines 65-66 omit the flag for kserve and storage. Since the container runs as non-root user kserve (line 61), all directories should consistently use --chown=kserve:kserve to avoid permission issues at runtime.
Apply this diff to add missing --chown flags:
COPY --from=builder --chown=kserve:kserve third_party third_party
COPY --from=builder --chown=kserve:kserve $VIRTUAL_ENV $VIRTUAL_ENV
-COPY --from=builder kserve kserve
-COPY --from=builder storage storage
+COPY --from=builder --chown=kserve:kserve kserve kserve
+COPY --from=builder --chown=kserve:kserve storage storage
🤖 Prompt for AI Agents
In python/xgb.Dockerfile around lines 63 to 67, the COPY lines are inconsistent:
third_party and $VIRTUAL_ENV use --chown=kserve:kserve but kserve, storage, and
xgbserver do not, which can cause permission issues when running as non-root
kserve; update the COPY instructions for kserve, storage, and xgbserver to
include --chown=kserve:kserve so all copied directories are owned by
kserve:kserve at build time and avoid runtime permission errors.
@mholder6: The following test failed, say `/retest` to rerun all failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):

Fixes #
Type of changes
Please delete options that are not relevant.
Feature/Issue validation/testing:
Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Test A
Test B
Logs
Special notes for your reviewer:
Checklist:
Release note:
Re-running failed tests
- `/rerun-all` - rerun all failed workflows.
- `/rerun-workflow <workflow name>` - rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.
New Features
Bug Fixes
Refactor
Storage functionality was extracted into a standalone `kserve_storage` module.
Chores
✏️ Tip: You can customize this high-level summary in your review settings.