Container orchestrator ignores CONTAINER_ORCHESTRATOR_MEMORY_LIMIT and REPLICATION_ORCHESTRATOR_MEMORY_LIMIT - always uses 1Gi #72833

@clementmuffatjoly

Description

Helm Chart Version

2.0.19

At what step did the error happen?

During the Sync

Relevant information

Describe the bug

The container orchestrator in replication jobs ignores all memory configuration (Helm values, environment variables) and always runs with a 1Gi memory limit, causing OOM crashes on large CDC syncs.

Environment

  • Airbyte Version: 2.0.1
  • Helm Chart Version: 2.0.19
  • Kubernetes: GKE Standard 1.31
  • Deployment: Self-hosted (Kubernetes)

Configuration

We have configured orchestrator memory in multiple places (a sketch for verifying these settings follows this list):

1. Helm values (via Terraform):

global:
  workloads:
    resources:
      containerOrchestrator:
        memory:
          request: 512Mi
          limit: 2Gi

2. Environment variables in ConfigMap:

REPLICATION_ORCHESTRATOR_MEMORY_LIMIT: 4Gi
REPLICATION_ORCHESTRATOR_MEMORY_REQUEST: 1Gi

3. Environment variables in workload-launcher deployment:

CONTAINER_ORCHESTRATOR_MEMORY_REQUEST: 512Mi
CONTAINER_ORCHESTRATOR_MEMORY_LIMIT: 2Gi
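
To double-check that these values actually reach the chart and the running workload-launcher, commands along these lines can be used (a best-effort sketch; the release name airbyte, the namespace airbyte, and the deployment name airbyte-workload-launcher are assumptions about this particular install):

# Render the chart locally and look for the orchestrator memory values
# (release/chart/namespace names are assumptions):
helm template airbyte airbyte/airbyte -n airbyte -f values.yaml \
  | grep -iE 'ORCHESTRATOR_MEMORY|containerOrchestrator' || echo "not rendered"

# List the orchestrator-related env vars the running workload-launcher actually has
# (deployment name is an assumption):
kubectl -n airbyte exec deploy/airbyte-workload-launcher -- env \
  | grep 'ORCHESTRATOR_MEMORY'

If the variables do not show up here, the problem would be in the chart templating rather than in how the launcher consumes them.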

Actual Behavior

Despite all of these configurations, the orchestrator container in replication pods always has:

orchestrator    1Gi limit    1Gi request

Verified with:

kubectl -n airbyte get pod replication-job-XXX-attempt-0 \
  -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.limits.memory}{"\n"}{end}'

# Output:
orchestrator    1Gi      # Should be 2Gi or 4Gi
source          32Gi     # Correctly applied from connection resource_requirements
destination     32Gi     # Correctly applied from connection resource_requirements

Expected Behavior

The orchestrator container should respect the configured memory limits (2Gi or 4Gi).

Impact

This causes OOM crashes on CDC syncs with large binlog gaps:

java.lang.OutOfMemoryError: Java heap space
Terminating due to java.lang.OutOfMemoryError: Java heap space

The sync fails after processing only 4 records, even though source/destination have 32Gi available.
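
To tell whether the container itself is being OOM-killed at the 1Gi limit, or only the JVM heap inside it is exhausted, the orchestrator container's last terminated state can be inspected; the command below is a sketch reusing the pod naming from above:

kubectl -n airbyte get pod replication-job-XXX-attempt-0 \
  -o jsonpath='{.status.containerStatuses[?(@.name=="orchestrator")].lastState.terminated.reason}'
# "OOMKilled" would point at the 1Gi container limit itself; a plain Java heap
# error instead points at the JVM heap, which may also benefit from a larger
# limit if the heap is sized from the container memory.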

Workarounds Attempted

  1. ✅ connection.resource_requirements in database → Works for source/destination, NOT for orchestrator (see the sketch below)
  2. ❌ Helm values containerOrchestrator.memory → Ignored
  3. ❌ CONTAINER_ORCHESTRATOR_MEMORY_LIMIT env var → Ignored
  4. ❌ REPLICATION_ORCHESTRATOR_MEMORY_LIMIT env var → Ignored
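
For reference, the connection-level override from item 1 was applied directly against the configs database, roughly as sketched below (the <configs-db-pod> placeholder, database/user names, and the JSON key names are assumptions; as noted, this path does not affect the orchestrator container):

# Hypothetical sketch of the connection-level override (item 1 above);
# pod placeholder, database/user names and JSON keys are assumptions:
kubectl -n airbyte exec -it <configs-db-pod> -- psql -U airbyte -d db-airbyte -c \
  "UPDATE connection SET resource_requirements = '{\"memory_request\": \"8Gi\", \"memory_limit\": \"32Gi\"}' WHERE id = '<connection-uuid>';"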

Logs

INFO debezium-engine BaseSourceTask(logStatistics):354 4 records sent during previous 00:00:16.555,
last recorded offset of {server=gestcom} partition is {ts_sec=1770182352, file=mysql-bin.004401, pos=272435933}

# java.lang.OutOfMemoryError: Java heap space
Terminating due to java.lang.OutOfMemoryError: Java heap space

Request

Please provide a way to configure the orchestrator container memory, by one of the following:

  1. Fix the existing configuration options so they are respected
  2. Add orchestrator support to connection.resource_requirements
  3. Document the correct way to configure orchestrator resources

This is blocking production CDC pipelines for critical business data.

Internal Tracking: https://github.com/airbytehq/oncall/issues/11158
