[bitnami/mongodb] Enabling TLS using a custom certificate runs into multiple issues. - user creation / current_primary detection / all probes #16719

@dtrts

Name and Version

bitnami/mongodb 13.9.4

What architecture are you using?

amd64

What steps will reproduce the bug?

  1. Create a TLS certificate with a single wildcard domain: *.mongodb.example.com
  2. Enable TLS in the chart with requireTLS.

a) The root users are unable to be created during the startup
b) New nodes are unable to register with the primary host
c) The probes fail to run commands
d) The metrics server is unable to connect to mongod

The root cause is that we are restricted to a TLS certificate which can only cover a single domain.

The certificate cannot have 127.0.0.1, localhost or any of the k8s domains on there.
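
To make the restriction concrete, here is a minimal shell sketch of the usual wildcard rule, where `*` stands for exactly one leftmost DNS label. The helper name `cert_covers_host` and the headless-service hostname in the loop are illustrative assumptions, not something taken from the chart.

```shell
#!/bin/sh
# Succeeds when host ($2) is covered by the certificate name ($1).
# A leading "*." is treated as a wildcard for exactly one DNS label.
cert_covers_host() {
  local pattern="$1" host="$2" suffix label
  case $pattern in
    '*.'*)
      suffix=${pattern#\*}              # e.g. ".mongodb.example.com"
      case $host in
        *"$suffix") label=${host%"$suffix"} ;;
        *) return 1 ;;
      esac
      # The wildcard must match one non-empty label without dots.
      case $label in
        ''|*.*) return 1 ;;
        *) return 0 ;;
      esac
      ;;
    *) [ "$host" = "$pattern" ] ;;
  esac
}

# Hypothetical hostnames a pod might be reached by.
for host in releasename-0.mongodb.example.com 127.0.0.1 localhost \
            releasename-0.releasename-headless.default.svc.cluster.local; do
  if cert_covers_host '*.mongodb.example.com' "$host"; then
    echo "covered:     $host"
  else
    echo "not covered: $host"
  fi
done
```

Only the first hostname passes, which is why every in-pod connection to 127.0.0.1 or the headless service name fails verification.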

Are you using any custom parameters or values?

tls:
  enabled: true
  standalone/replicaset/hidden/arbiter:
    existingSecret(s): [ "mongodb-example-com-cert" ]
externalAccess:
  enabled: true
  autoDiscovery:
    enabled: true
  annotations:
    external-dns.alpha.kubernetes.io/hostname: "{{ .targetPod }}.mongodb.example.com"
  hidden:
    enabled: true
    service:
      annotations:
        external-dns.alpha.kubernetes.io/hostname: "{{ .targetPod }}.mongodb.example.com"

SIDE NOTE: We have enabled autoDiscovery in order to mount the /shared volume and populate info.txt.
The DNS address we use does not match the value returned by the autoDiscovery script, since we are using external-dns on AWS.
Our FQDN is releasename-0.mongodb.example.com, while autoDiscovery returns the AWS load balancer address.
If we switch autoDiscovery off, MONGODB_ADVERTISED_HOSTNAME reverts to the load balancer IPs, which we do not want.


What is the expected behavior?

The mongodb database initializes correctly:

  • root user and additional users are created
  • the replica set is configured and nodes join existing replica sets.

The probes are able to connect to mongod using TLS.
The metrics server connects to mongod using TLS.

What do you see instead?

  • All mongosh commands time out since the TLS requirements are not met.
  • The script continues, configuring mongodb.conf and restarting the mongod server.
  • Mongod is then running, but without any users or a replica set configured.

Additional information

We have implemented significant workarounds to get the chart working. I think these are good candidates for being included in the chart (and additionally the container).

Below I describe the various points where the restrictions cause problems.


Current Primary / Mongodb Advertised Hostname / Mongodb Initial Primary Host.

Description

The MONGODB_ADVERTISED_HOSTNAME env var is populated through autoDiscovery.sh and takes a default from the container env vars. It is used as the hostname when configuring the replica set. It is also compared against the hostname returned from an existing replica set to verify whether the current node is the primary.

The MONGODB_INITIAL_PRIMARY_HOST env var is used as the host when configuring a new secondary node onto an existing replicaset.

The current_primary variable is used in the setup.sh script and populated through a mongosh function. The chart only uses the hostnames for the headless service.

The environment variables can be overridden through the chart's extraEnvVars:, but this is not sufficient.
MONGODB_ADVERTISED_HOSTNAME is overwritten in the setup.sh script, either through the autoDiscovery.sh init container or using other details for the load balancer.

Issues

All three of these variables cause issues with our TLS set-up.

The current_primary node cannot be found, so the replica set configuration cannot be set. Even if the connection succeeded, the hostname added to the replica set would also fail TLS validation.
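
The primary check described above can be pictured with a small shell sketch. This is not the chart's actual code: `is_current_primary` is a hypothetical helper, and it assumes the existing replica set reports its primary as `host:port`.

```shell
#!/bin/sh
# Succeeds when the primary reported by the replica set ($1, "host:port")
# has the same hostname as this node's advertised hostname ($2).
is_current_primary() {
  local primary="${1%:*}" advertised="$2"   # strip the trailing ":port"
  [ -n "$primary" ] && [ "$primary" = "$advertised" ]
}

# Example (hypothetical values):
#   is_current_primary "releasename-0.mongodb.example.com:27017" \
#                      "$MONGODB_ADVERTISED_HOSTNAME" && echo "this node is primary"
```

If the replica set was configured with the headless service names while the node advertises the custom domain (or the other way round), a string comparison like this one never matches, even when both names point at the same pod.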

Workaround

First we override the environment variables with our domain name:

extraEnvVars:
      - name: MONGODB_INITIAL_PRIMARY_HOST
        value: {{ printf "%s-0.%s" (include "mongodb.fullname" .) "mongodb.example.com" }}

      - name: MONGODB_ADVERTISED_HOSTNAME
        value: $(MY_POD_NAME).mongodb.example.com

This sets the defaults to use our valid domain name and ensures that the replica set can be configured.

We alter the args: of the container to run a preparatory script first. This script makes some small changes, writes an altered copy of /scripts/setup.sh, and then runs the altered script.

The preparatory script first inserts our valid FQDN for that pod into the response from the initContainer.

    echo -n "$MONGODB_ADVERTISED_HOSTNAME" > /shared/info.txt

Next it takes /scripts/setup.sh and replaces the connection string for the current_primary variable with our preferred host list.
NOTE: This requires us to mount an additional emptyDir volume, since we cannot alter the script in place.

    {{- $replicaCount := int .Values.replicaCount }}
    {{- $portNumber := int .Values.service.ports.mongodb }}
    {{- $fullname := include "mongodb.fullname" . }}
    {{- $releaseNamespace := include "mongodb.namespace" . }}
    {{- $clusterDomain := .Values.mongodb.clusterDomain }}
    {{- $loadBalancerIPListLength := len .Values.externalAccess.service.loadBalancerIPs }}
    {{- $mongoList := list }}
    {{- /* New line added: collects the TLS addresses for the connection string. */ -}}
    {{- $mongoListTLS := list }}
    {{- range $e, $i := until $replicaCount }}
    {{- $mongoList = append $mongoList (printf "%s-%d.%s-headless.%s.svc.%s:%d" $fullname $i $fullname $releaseNamespace $clusterDomain $portNumber) }}
    {{- /* New line added: builds each host from the domain covered by the certificate. */ -}}
    {{- $mongoListTLS = append $mongoListTLS (printf "%s-%d.%s:%d" $fullname $i "mongodb.example.com" $portNumber) }}
    {{- end }}

    CONNECTION_STRING_OLD="\"{{ join "," $mongoList }}\""
    CONNECTION_STRING_NEW="\"{{ join "," $mongoListTLS }}\""

    CONNECTION_STRING_NEW="$CONNECTION_STRING_NEW --tls --tlsCertificateKeyFile=/certs/mongodb.pem --tlsCAFile=/certs/mongodb-ca-cert"

    info "Old Connection String: $CONNECTION_STRING_OLD"
    info "New Connection String: $CONNECTION_STRING_NEW"

    for sourceFile in "/scripts/setup.sh" "/scripts/setup-hidden.sh"; do
      if [[ -f "$sourceFile" ]]; then
        destFile="/alteredScripts/$(basename -- "$sourceFile")"
        info "Amending $sourceFile and moving to $destFile..."

        # Escape regex metacharacters in the search string and sed replacement metacharacters in the new string.
        ESCAPED_CONNECTION_STRING_OLD=$(printf '%s\n' "$CONNECTION_STRING_OLD" | sed -e 's/[]\/$*.^[]/\\&/g')
        ESCAPED_CONNECTION_STRING_NEW=$(printf '%s\n' "$CONNECTION_STRING_NEW" | sed -e 's/[\/&]/\\&/g')

        sed "s/$ESCAPED_CONNECTION_STRING_OLD/$ESCAPED_CONNECTION_STRING_NEW/g" "$sourceFile" > "$destFile"

        chmod 0755 "$destFile"
      fi
    done

We can now run /alteredScripts/setup.sh or /alteredScripts/setup-hidden.sh as required.
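
The escaping used in the loop above can be tried in isolation. The following is a minimal sketch rather than the exact preparatory script: `replace_literal` and the sample strings in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Replace every literal occurrence of $1 with $2 in file $3, writing the
# result to file $4. Both strings are escaped so sed treats them as text.
replace_literal() {
  local old="$1" new="$2" src="$3" dest="$4"
  local escaped_old escaped_new
  # Escape characters that are special in a basic regular expression.
  escaped_old=$(printf '%s\n' "$old" | sed -e 's/[]\/$*.^[]/\\&/g')
  # Escape characters that are special on the replacement side of s///.
  escaped_new=$(printf '%s\n' "$new" | sed -e 's/[\/&]/\\&/g')
  sed "s/$escaped_old/$escaped_new/g" "$src" > "$dest"
}

# Example (hypothetical strings and paths):
#   replace_literal '"a.b:27017"' '"a.example.com:27017" --tls' \
#       /scripts/setup.sh /alteredScripts/setup.sh
```

Without the first escape, the dots in the old host list would match any character. Without the second, the slashes in the certificate paths would end the sed expression early.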

We also needed to add a wait to ensure the hostnames we want to reach are ready.

    for mongodb_host in "${MONGODB_INITIAL_PRIMARY_HOST}" "${MONGODB_ADVERTISED_HOSTNAME}"; do
      info "Testing host: $mongodb_host..."
      for i in {1..180}; do
        if getent ahosts "$mongodb_host"; then
          break
        fi
        sleep 1
        debug "Waiting for $mongodb_host to be ready... $i"
        if [ "$i" == "180" ]; then
          warn "Unable to connect to host: $mongodb_host"
          exit 1
        fi
      done
    done

This waits for up to 3 minutes. If the host still cannot be resolved, the pod stops and restarts.
We have increased the startupProbe delay by 2 minutes to avoid a CrashLoopBackOff slowing down the startup.
(Perhaps a better balance can be found between the wait in the pod and the risk of triggering a long CrashLoopBackOff?)
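
One way to make that balance tunable is to take the number of attempts and the interval as parameters, and to return a status instead of exiting inside the loop. This is only a sketch: `wait_for_host` and `resolve_host` are hypothetical names, and the values in the usage comment are arbitrary.

```shell
#!/bin/sh
# Resolve a hostname. Kept separate so the lookup method can be swapped out.
resolve_host() {
  getent ahosts "$1" > /dev/null
}

# Wait until host $1 resolves, trying up to $2 times (default 180) with
# $3 seconds between attempts (default 1). Returns 1 if it never resolves.
wait_for_host() {
  local host="$1" attempts="${2:-180}" interval="${3:-1}" i=1
  while [ "$i" -le "$attempts" ]; do
    if resolve_host "$host"; then
      return 0
    fi
    # No need to sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example: 60 attempts, 2 seconds apart, then let the pod restart.
#   wait_for_host "$MONGODB_ADVERTISED_HOSTNAME" 60 2 || exit 1
```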

BONUS FEATURE - Attach to existing replicaSet

In our prep script we have also added an option to completely overwrite the connection string for current_primary using a hardcoded string from the values.

    {{- if .Values.global.existingReplicaSetConnectionString }}
    CONNECTION_STRING_NEW="\"{{ .Values.global.existingReplicaSetConnectionString }}\""
    {{- end }}

As long as the replica set name, key and connection string match an existing replica set, all nodes are deployed as secondaries of that replica set.

We are using this for migration purposes.
Once the data has been synced we can move the primary to the cluster, update our connection strings, remove the additional value and destroy the old nodes.

Suggested Chart Changes

To implement something similar in the chart, some or all of these changes could be made:

  • Provide a method to manually set MONGODB_ADVERTISED_HOSTNAME even when autoDiscovery is disabled
    • OR
    • provide a way to customise autoDiscovery so that it can take/output our custom domain. This initContainer could also verify the DNS address is reachable.
  • Provide a method to change the structure of the hosts generated in the host list for populating current_primary. This way we can guarantee a TLS connection.
  • Bonus: Provide a way to override the host list when populating current_primary. This will enable deploying all nodes as part of an existing replica set.

The environment variable override is working well. It would be good to find some way of tying these changes together and avoiding bad combinations.

TLS Connection over 127.0.0.1

Description

The libmongodb.sh script runs commands against localhost (127.0.0.1) during initialization. These operations create all the users and, importantly, configure the connection to a replica set.

Issues

All these commands fail through TLS unless 127.0.0.1 is part of the certificate. (Uncommon for external certificates)

Workaround

During setup the mongod server is bound to the localhost IP.

There are two options for this workaround:

  1. Provide a way of connecting to 127.0.0.1 using a domain in the certificate.
  2. Use the external domain and alter mongodb.conf to bind all IPs before mongod is started for the first time, allowing connections from outside the pod.

We preferred option 1.

Workaround 1

First add a localhost.mongodb.example.com domain to our certificate and route that domain to 127.0.0.1

hostAliases:
- ip: 127.0.0.1
  hostnames:
  - "localhost.mongodb.example.com"

We add another step to the preparatory script described above to alter the libmongodb.sh file.

sed -i "s/127\.0\.0\.1/localhost.mongodb.example.com/g" /opt/bitnami/scripts/libmongodb.sh

This file can be altered in place and then you can continue as normal!
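
Before patching, it may be worth confirming that the hostAliases entry actually reached the pod's hosts file. Here is a minimal sketch, assuming the alias is written to /etc/hosts on a 127.0.0.1 line. `alias_is_loopback` is a hypothetical helper, not part of the container scripts.

```shell
#!/bin/sh
# Succeeds when hostname $1 appears on a 127.0.0.1 line of hosts file $2
# (default /etc/hosts). Anything after a "#" on a line is ignored.
alias_is_loopback() {
  local name="$1" hosts_file="${2:-/etc/hosts}"
  awk -v name="$name" '
    $1 == "127.0.0.1" {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break
        if ($i == name) found = 1
      }
    }
    END { exit(found ? 0 : 1) }
  ' "$hosts_file"
}

# Example:
#   alias_is_loopback localhost.mongodb.example.com || exit 1
```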

NOTE: I have been unable to find a way to add a pod specific domain to the hosts file. If this is possible we could remove the need for a localhost.* domain.

Workaround 2:

Similar to workaround 1, we replace the 127.0.0.1 in libmongodb.sh, this time with the advertised hostname.

Since this will resolve to an external IP, we also need to enable binding to all IPs before mongod starts.

Something like:

sed -i "s/127\.0\.0\.1/${MONGODB_ADVERTISED_HOSTNAME}/g" /opt/bitnami/scripts/libmongodb.sh


# Load libraries
. /opt/bitnami/scripts/libmongodb.sh
# Load environment
. /opt/bitnami/scripts/mongodb-env.sh
mongodb_set_listen_all_conf "$MONGODB_CONF_FILE"

There is a risk that a bad actor will come along and create a root user before you can since the server is now exposed, but it is minimal.

Suggested Chart Changes

The fix for this will require a change to the scripts in the container.

  • Use an environment variable in place of 127.0.0.1 in the libmongodb.sh script. (MONGODB_LOCALHOST_NAME?)
  • Change the chart to provide this environment variable.

A check could also be added to ensure that this host only resolves to 127.0.0.1, or the chart could set it as a default hostAlias.
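
As a sketch of the first suggestion, the container script could read the proposed variable and fall back to today's behaviour. MONGODB_LOCALHOST_NAME is only the name floated in this issue and does not exist in the container, and `mongodb_local_host` is a hypothetical helper.

```shell
#!/bin/sh
# Print the host to use for in-pod connections. Falls back to 127.0.0.1
# when the proposed MONGODB_LOCALHOST_NAME variable is unset or empty.
mongodb_local_host() {
  printf '%s' "${MONGODB_LOCALHOST_NAME:-127.0.0.1}"
}

# Example:
#   host="$(mongodb_local_host)"
```

Defaulting to 127.0.0.1 would keep existing deployments unchanged unless the variable is set.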

Probes don't work

Description

The probes connect with the TLS options but do not specify a hostname, so the host defaults to 127.0.0.1.

Workaround

This issue has forced me to provide custom probes, whose only change is to pass MONGODB_ADVERTISED_HOSTNAME as the hostname.

We could also use MONGODB_LOCALHOST_NAME / localhost.mongodb.example.com if it is available through the TLS certificate and configured in the hostAliases.

This has the benefit of routing traffic internally, so it is directly checking on the pod health and not implicitly checking network connectivity.
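
A probe along these lines might look like the sketch below. It is not the chart's probe script: `tls_ping` and the `MONGOSH_BIN` override are hypothetical, and the certificate paths are simply the ones used in the connection string earlier in this issue.

```shell
#!/bin/sh
# Ping mongod over TLS using a hostname that the certificate covers.
# MONGOSH_BIN is a hypothetical override, useful for dry runs.
tls_ping() {
  local host="${MONGODB_ADVERTISED_HOSTNAME:?must be set}"
  local port="${MONGODB_PORT_NUMBER:-27017}"
  "${MONGOSH_BIN:-mongosh}" \
    --host "$host" --port "$port" \
    --tls \
    --tlsCertificateKeyFile=/certs/mongodb.pem \
    --tlsCAFile=/certs/mongodb-ca-cert \
    --quiet \
    --eval "db.adminCommand('ping')"
}

# Example probe command:
#   tls_ping || exit 1
```

Swapping MONGODB_ADVERTISED_HOSTNAME for the localhost alias would keep the check inside the pod, as noted above.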

Metrics Can't Connect

Description

By default the metrics sidecar uses localhost as the host in its connection string.

As a result it is unable to connect to mongod, because it fails the TLS check.

Workaround

I have manually defined the args for the sidecar to use localhost.mongodb.example.com.
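
For illustration, the connection string could be assembled like this. It is a sketch under assumptions: `metrics_mongodb_uri` is a hypothetical helper, credentials are left out, and it assumes the certificate files are mounted in the sidecar at the same /certs paths used elsewhere in this issue.

```shell
#!/bin/sh
# Print a MongoDB connection URI for host $1 and port $2 (default 27017),
# using the standard tls, tlsCertificateKeyFile and tlsCAFile URI options.
metrics_mongodb_uri() {
  local host="${1:?host required}" port="${2:-27017}"
  printf 'mongodb://%s:%s/admin?tls=true&tlsCertificateKeyFile=%s&tlsCAFile=%s\n' \
    "$host" "$port" /certs/mongodb.pem /certs/mongodb-ca-cert
}

# Example:
#   metrics_mongodb_uri localhost.mongodb.example.com
```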

Suggested Chart Changes

Having access to MONGODB_ADVERTISED_HOSTNAME, /shared/info.txt or MONGODB_LOCALHOST_NAME would make a more DRY approach to correcting this connection string.

(There are other issues with metrics, but that is outside the scope of this issue)

Labels: mongodb, solved, tech-issues
