Issue fetching external metric #642

Open
tusharInferQ opened this issue Feb 20, 2024 · 1 comment
Labels: kind/bug, triage/accepted

Comments

@tusharInferQ

What happened?:

I'm trying to implement HPA with a custom metric from my application. I can query the metric via curl, and I can also see it in the Prometheus UI. However, when I describe my HPA, I get the following error:

Warning FailedComputeMetricsReplicas 18m (x12 over 21m) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get active_connections_total external metric value: failed to get active_connections_total external metric: unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning FailedGetExternalMetric 94s (x81 over 21m) horizontal-pod-autoscaler unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning FailedGetExternalMetric <invalid> (x2 over <invalid>) horizontal-pod-autoscaler unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning FailedComputeMetricsReplicas <invalid> (x2 over <invalid>) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get active_connections_total external metric value: failed to get active_connections_total external metric: unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
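
A quick way to check what the adapter is actually registering on each metrics API group (assuming kubectl access to the cluster; the jq filter is only for readability):

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq '.resources[].name'

If the first call errors out or returns an empty resource list, that would line up with the "could not find the requested resource" error above.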

What did you expect to happen?:

I expected my HPA to be able to read the current value of active_connections_total (my external metric) and scale pods up or down accordingly.

Please provide the prometheus-adapter config:

apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: prometheus
data:
  config.yaml: |-
    rules:
    - seriesQuery: |
        active_connections_total
      resources:
        template: pod
      name:
        matches: "^(.*)_total"
        as: "$1"
      metricsQuery: |
        sum by (app) (
          active_connections_total{app="eamm"}
        )
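
One thing I'm not sure about: these rules sit under rules:, which as far as I understand only feeds the custom metrics API (custom.metrics.k8s.io). For the metric to be served under external.metrics.k8s.io I believe it would have to be declared under externalRules instead. A rough, untested sketch of what I assume that would look like (the namespace mapping assumes the series carries a namespace label):

externalRules:
- seriesQuery: 'active_connections_total{app="eamm"}'
  resources:
    overrides:
      # assumption: the scraped series has a "namespace" label
      namespace: {resource: "namespace"}
  name:
    # keeping the metric name unchanged here, since the HPA requests active_connections_total
    matches: "^(.*)$"
    as: "$1"
  metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>})'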

Please provide the HPA resource used for autoscaling:

HPA yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-connection-based
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: eamm-deployment-v1
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: External
    external:
      metric:
        name: active_connections_total
      target:
        type: Value
        averageValue: 1
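
The lookup that the HPA controller performs for this spec should be reproducible directly against the aggregated API (the default namespace and the metric name are taken from the error events above):

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/active_connections_total"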

Please provide the HPA status:

Warning FailedComputeMetricsReplicas 18m (x12 over 21m) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get active_connections_total external metric value: failed to get active_connections_total external metric: unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning FailedGetExternalMetric 94s (x81 over 21m) horizontal-pod-autoscaler unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning FailedGetExternalMetric <invalid> (x2 over <invalid>) horizontal-pod-autoscaler unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning FailedComputeMetricsReplicas <invalid> (x2 over <invalid>) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get active_connections_total external metric value: failed to get active_connections_total external metric: unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)

Please provide the prometheus-adapter logs with -v=6 around the time the issue happened:

prometheus-adapter logs

I0220 09:37:35.228067 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="570.664µs" userAgent="Go-http-client/2.0" audit-ID="8c6994a7-d7dc-48bc-9b96-247bb5d3afb3" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:35.228124 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="713.014µs" userAgent="Go-http-client/2.0" audit-ID="87e85bdb-0ede-46e6-8d70-bdd792ba15b4" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:35.228140 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="756.121µs" userAgent="Go-http-client/2.0" audit-ID="51d7804b-673d-44a4-bf8b-23c0753943fc" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:35.228140 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="732.822µs" userAgent="Go-http-client/2.0" audit-ID="b41a1e37-88d9-4295-9f60-7959c1ef15f2" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:35.228279 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="659.56µs" userAgent="Go-http-client/2.0" audit-ID="886ff402-e061-4477-adfa-21661f9d4108" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:38.178959 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="182.979µs" userAgent="kube-probe/1.29+" audit-ID="22264f9a-4fe7-497f-94aa-2bbc8f7a7610" srcIP="192.168.58.94:53526" resp=200
I0220 09:37:38.179392 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="164.012µs" userAgent="kube-probe/1.29+" audit-ID="b5e922a1-9851-4154-a479-992efd76b100" srcIP="192.168.58.94:53524" resp=200
I0220 09:37:40.503400 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="502.409µs" userAgent="Go-http-client/2.0" audit-ID="6caf0e14-77f1-4b5b-9020-9c52b59d55f0" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:40.503638 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="524.354µs" userAgent="Go-http-client/2.0" audit-ID="f69db820-2a6c-45d7-b6c3-f1fb9732cc67" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:40.503712 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="596.033µs" userAgent="Go-http-client/2.0" audit-ID="4b7b0cf3-7279-4642-a071-d26482c1fdd8" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:40.503980 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="603.228µs" userAgent="Go-http-client/2.0" audit-ID="701b8235-e921-405d-a2e6-e77cfc501435" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:40.504956 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="479.687µs" userAgent="Go-http-client/2.0" audit-ID="f00a427d-3742-4ad2-b3a4-d1a5e40f23f0" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:48.179353 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="183.054µs" userAgent="kube-probe/1.29+" audit-ID="91b0c6be-d1e0-4217-b146-67d558c7c963" srcIP="192.168.58.94:38734" resp=200
I0220 09:37:48.179771 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="144.857µs" userAgent="kube-probe/1.29+" audit-ID="274e8954-ba95-4b84-a52a-b4307c60b291" srcIP="192.168.58.94:38736" resp=200
I0220 09:37:49.783147 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/http_requests?labelSelector=app%3Dsample-app" latency="12.029261ms" userAgent="kube-controller-manager/v1.29.0 (linux/amd64) kubernetes/787475c/system:serviceaccount:kube-system:horizontal-pod-autoscaler" audit-ID="c56c906f-224f-4d0b-9e72-b3ce12d7e816" srcIP="192.168.149.219:50062" resp=404
I0220 09:37:58.178718 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="187.6µs" userAgent="kube-probe/1.29+" audit-ID="20f67fcb-8295-4d62-a8d8-42def158fc4c" srcIP="192.168.58.94:53796" resp=200
I0220 09:37:58.179198 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="225.774µs" userAgent="kube-probe/1.29+" audit-ID="7998a065-1836-4983-b792-557af3cbddf7" srcIP="192.168.58.94:53794" resp=200

Anything else we need to know?:

Environment:

  • prometheus-adapter version: 0.11.2

  • prometheus version: 2.30.3

  • Kubernetes version: Client Version: v1.28.6, Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3, Server Version: v1.29.0-eks-c417bb3

  • Cloud provider or hardware configuration: AWS EKS

  • Other info:

@tusharInferQ added the kind/bug label Feb 20, 2024
@k8s-ci-robot added the needs-triage label Feb 20, 2024
@dashpole

/assign @dgrisonnet
/triage accepted

@k8s-ci-robot added the triage/accepted label and removed the needs-triage label Feb 22, 2024