Add proposal for integrating Bridge with Metrics Reporter #143

Open
wants to merge 15 commits into
base: main
from
117 changes: 117 additions & 0 deletions 090-integrate-bridge-with-metrics-reporter.md
@@ -0,0 +1,117 @@
# Integrate Bridge with Metrics Reporter

In [SIP-064](https://github.com/strimzi/proposals/blob/main/064-prometheus-metrics-reporter.md) we introduced the [Strimzi Metrics Reporter](https://github.com/strimzi/metrics-reporter).
This is a Kafka MetricsReporter plugin that directly exposes metrics in Prometheus format via an HTTP endpoint.

## Current situation

The HTTP Bridge allows users to enable or disable the metrics endpoint using the `KAFKA_BRIDGE_METRICS_ENABLED` environment variable.
When this variable is set to `true`, the Bridge creates the JMX Exporter's JmxCollector class with a hardcoded configuration.
This configuration is a JMX metrics exporter YAML distributed as an embedded resource within the Bridge's JAR file.

When deploying the Bridge through the Cluster Operator, a similar setting is available in the KafkaBridge CRD:

```yaml
spec:
  enableMetrics: true
```

This is different from how the metrics endpoint is enabled in all the other major components deployed by the Cluster Operator.
The Cluster Operator exposes metrics through the [Prometheus JMX Exporter](https://github.com/prometheus/jmx_exporter), which can be configured using the shared `metricsConfig` schema.
This schema has a `type` property that only allows the `jmxPrometheusExporter` value, and a reference to a ConfigMap containing its configuration:

```yaml
metricsConfig:
  type: jmxPrometheusExporter
  valueFrom:
    configMapKeyRef:
      name: $COMPONENT_NAME-metrics
      key: metrics-config.yml
```

At runtime, the above configuration enables a Java agent which exposes Kafka JMX metrics in Prometheus format through an HTTP endpoint on port 9404.
Note that this agent depends on the [Kafka JMX Reporter](https://github.com/apache/kafka/blob/3.9.0/clients/src/main/java/org/apache/kafka/common/metrics/JmxReporter.java) plugin, which is enabled by default.

## Motivation

We want to support the Strimzi Metrics Reporter as an alternative way of configuring metrics across all components.
When deploying and configuring major components through the Cluster Operator, we want to provide a consistent user experience, so the Bridge should also support the `metricsConfig` schema.

## Proposal

The Bridge will support both JMX Prometheus Exporter and Strimzi Metrics Reporter as metrics configuration types.
It will also provide new configurations for the integration with the Cluster Operator.

### Metrics configuration types

The Bridge project will be updated to support multiple metrics configuration types.
A new `bridge.metrics` property will be available in the `application.properties` configuration file, and will only accept the values `jmxPrometheusExporter` and `strimziMetricsReporter`.
Any other value will cause the application to fail at startup with an appropriate error message.
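
The validation described above could be sketched as follows. This is only an illustration, not the Bridge's actual code: the `MetricsType` enum, its constant names, and the wording of the error message are hypothetical, and exact (case-sensitive) matching is an assumption.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

/** Hypothetical enum of the values accepted by the bridge.metrics property. */
enum MetricsType {
    JMX_EXPORTER("jmxPrometheusExporter"),
    STRIMZI_REPORTER("strimziMetricsReporter");

    private final String value;

    MetricsType(String value) {
        this.value = value;
    }

    String value() {
        return value;
    }

    /** Maps a configured value to a type, or throws so that startup fails fast. */
    static MetricsType fromValue(String configured) {
        for (MetricsType type : values()) {
            // Assumption: values are matched exactly, including case.
            if (type.value.equals(configured)) {
                return type;
            }
        }
        String allowed = Arrays.stream(values())
                .map(MetricsType::value)
                .collect(Collectors.joining(", "));
        throw new IllegalArgumentException(
                "Invalid bridge.metrics value '" + configured + "', allowed values: " + allowed);
    }
}
```

Calling `MetricsType.fromValue("strimziMetricsReporter")` returns `STRIMZI_REPORTER`, while any unknown value throws an `IllegalArgumentException` that lists the allowed values.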

When running in standalone mode with `strimziMetricsReporter`, the user will be able to configure any reporter property using the `kafka.` prefix.
The following example will be provided as a comment in the default application.properties file:

```properties
bridge.metrics=strimziMetricsReporter
kafka.metric.reporters=io.strimzi.kafka.metrics.KafkaPrometheusMetricsReporter
kafka.prometheus.metrics.reporter.listener.enable=false
kafka.prometheus.metrics.reporter.allowlist=.*
```

The `KAFKA_BRIDGE_METRICS_ENABLED` environment variable will be deprecated and removed in a future release.
When it is set, the user will get a warning suggesting that they use the `bridge.metrics` property instead.
If both are set, `bridge.metrics` will take precedence over `KAFKA_BRIDGE_METRICS_ENABLED`.
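
A minimal sketch of this precedence rule is shown below. The `MetricsSettings` class and its `resolve` method are hypothetical, and the sketch assumes that the legacy flag maps to `jmxPrometheusExporter`, since that is the only exporter the flag enables today.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper that resolves the effective metrics type. */
final class MetricsSettings {
    static final String PROPERTY = "bridge.metrics";
    static final String LEGACY_ENV = "KAFKA_BRIDGE_METRICS_ENABLED";

    private MetricsSettings() { }

    /**
     * Returns the metrics type to use, or empty when metrics are disabled.
     * Assumption: the legacy flag maps to jmxPrometheusExporter.
     */
    static Optional<String> resolve(Properties config, Map<String, String> env) {
        String legacy = env.get(LEGACY_ENV);
        if (legacy != null) {
            // Deprecation warning, printed whenever the legacy variable is set.
            System.err.println("WARN: " + LEGACY_ENV + " is deprecated, use " + PROPERTY + " instead");
        }
        String configured = config.getProperty(PROPERTY);
        if (configured != null) {
            return Optional.of(configured); // the property always wins
        }
        if (Boolean.parseBoolean(legacy)) {
            return Optional.of("jmxPrometheusExporter");
        }
        return Optional.empty();
    }
}
```

With both settings present, `resolve` returns the value of `bridge.metrics` and still prints the deprecation warning for the environment variable.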

The MetricsReporter class will be updated to also include a new StrimziCollectorRegistry that will work similarly to the JmxCollectorRegistry.
The StrimziCollectorRegistry will include a reference to PrometheusRegistry.defaultRegistry, which is the same instance used by the Strimzi Metrics Reporter to collect metrics.

The kafka_bridge_config_generator.sh script is used to generate the image configuration based on environment variables.
This script will also be updated to include `bridge.metrics` and related configurations (see the following section).

The Bridge will try to load the JMX Prometheus Exporter configuration file from the path specified by the `bridge.metrics.jmx.exporter.config.path` property.
If the property is not specified or the file is not found, the Bridge will fall back to the hardcoded configuration.
This feature is not strictly required to support the Strimzi Metrics Reporter, but will be used by the Cluster Operator.
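
The fallback behavior could look roughly like the sketch below. The `JmxExporterConfigLoader` class is hypothetical, and the embedded resource is represented by a plain string parameter to keep the example self-contained; the real Bridge would read it from its JAR.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical loader for the JMX Prometheus Exporter configuration. */
final class JmxExporterConfigLoader {
    private JmxExporterConfigLoader() { }

    /**
     * Returns the file content when configPath points to an existing regular file,
     * otherwise the supplied fallback (standing in for the embedded YAML resource).
     */
    static String load(String configPath, String embeddedFallback) throws IOException {
        if (configPath != null && !configPath.isBlank()) {
            Path path = Path.of(configPath);
            if (Files.isRegularFile(path)) {
                return Files.readString(path, StandardCharsets.UTF_8);
            }
        }
        // Property not set, or file missing: use the built-in configuration.
        return embeddedFallback;
    }
}
```

Here `configPath` would be the value of `bridge.metrics.jmx.exporter.config.path`, so an unset property and a missing file both lead to the same built-in configuration.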

### Cluster Operator integration

The KafkaBridge CRD will also support `metricsConfig`, with the addition of the `strimziMetricsReporter` type.
This is what the Strimzi Metrics Reporter configuration will look like:

```yaml
spec:
  metricsConfig:
    type: strimziMetricsReporter
    values:
      allowList:
        - "kafka_log.*"
        - "kafka_network.*"
```

Three new environment variables will be introduced to pass the metrics configuration to the Bridge's container:

- `STRIMZI_METRICS`: Contains the `metricsConfig` type to enable (either `jmxPrometheusExporter` or `strimziMetricsReporter`).
- `KAFKA_BRIDGE_METRICS_JMX_CONFIG`: Used with JMX Prometheus Exporter to pass the configuration file path.
- `KAFKA_BRIDGE_METRICS_SMR_CONFIG`: Used with Strimzi Metrics Reporter to pass the plugin configuration.

The `enableMetrics` property will be deprecated and removed in a future release.
When it is set, the user will get a warning suggesting that they use the `metricsConfig` configuration instead.
If both are set, `metricsConfig` will take precedence over `enableMetrics`.

The JMX Prometheus Exporter configuration file will be stored in a ConfigMap and mounted in the Bridge's container.
The full path of the configuration file passed to the Bridge's container will be `/opt/strimzi/custom-config/metrics-config.yml`.

Metrics will be exposed through the Bridge's HTTP server, so the Strimzi Metrics Reporter's listener will be disabled.
Other configurations will be locked down with the exception of the `prometheus.metrics.reporter.allowlist` property.
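
As an illustration of the locked-down configuration, the sketch below derives the reporter properties from the `allowList` values. The `ReporterConfigBuilder` class is hypothetical, and two details are assumptions rather than part of this proposal: list entries are joined with commas, and an empty list falls back to `.*`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical builder for the locked-down Strimzi Metrics Reporter configuration. */
final class ReporterConfigBuilder {
    private ReporterConfigBuilder() { }

    /** Only the allowlist is user-controlled; every other property is fixed. */
    static Map<String, String> build(List<String> allowList) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("metric.reporters", "io.strimzi.kafka.metrics.KafkaPrometheusMetricsReporter");
        // Metrics are served by the Bridge's HTTP server, so the reporter's own listener stays off.
        config.put("prometheus.metrics.reporter.listener.enable", "false");
        // Assumptions: comma-separated regexes, and ".*" when no allowList is given.
        String allowlist = (allowList == null || allowList.isEmpty())
                ? ".*"
                : String.join(",", allowList);
        config.put("prometheus.metrics.reporter.allowlist", allowlist);
        return config;
    }
}
```

For the KafkaBridge example above, `build(List.of("kafka_log.*", "kafka_network.*"))` would set the allowlist property to `kafka_log.*,kafka_network.*` and leave the listener disabled.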

## Affected/not affected projects

The affected projects are Cluster Operator and Kafka Bridge.

## Compatibility

All changes will be backwards compatible, but there will be some deprecations as detailed above.

## Rejected alternatives

Lock down and automate the Strimzi Metrics Reporter configurations for the standalone Bridge.
This was rejected because the Bridge allows users to customize Kafka clients and plugins using the `application.properties` file, which includes commented examples.