Description
Helm Chart Version
2.0.19
What step the error happened?
During the Sync
Relevant information
Describe the bug
The container orchestrator in replication jobs ignores all memory configuration (Helm values, environment variables) and always runs with a 1Gi memory limit and request, causing OOM crashes on large CDC syncs.
Environment
- Airbyte Version: 2.0.1
- Helm Chart Version: 2.0.19
- Kubernetes: GKE Standard 1.31
- Deployment: Self-hosted (Kubernetes)
Configuration
We have configured orchestrator memory in multiple places:
1. Helm values (via Terraform):
global:
  workloads:
    resources:
      containerOrchestrator:
        memory:
          request: 512Mi
          limit: 2Gi
2. Environment variables in ConfigMap:
REPLICATION_ORCHESTRATOR_MEMORY_LIMIT: 4Gi
REPLICATION_ORCHESTRATOR_MEMORY_REQUEST: 1Gi
3. Environment variables in workload-launcher deployment:
CONTAINER_ORCHESTRATOR_MEMORY_REQUEST: 512Mi
CONTAINER_ORCHESTRATOR_MEMORY_LIMIT: 2Gi
Actual Behavior
Despite all configurations, the orchestrator container in replication pods always has:
orchestrator 1Gi limit 1Gi request
Verified with:
kubectl -n airbyte get pod replication-job-XXX-attempt-0 \
-o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.limits.memory}{"\n"}{end}'
# Output:
orchestrator 1Gi # Should be 2Gi or 4Gi
source 32Gi # Correctly applied from connection resource_requirements
destination 32Gi # Correctly applied from connection resource_requirements
Expected Behavior
The orchestrator container should respect the configured memory limits (2Gi or 4Gi).
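The manual check above can be scripted so that it fails loudly when a container ends up below its intended limit. The following is a minimal sketch, not part of Airbyte: it assumes the pod spec has been exported with `kubectl get pod ... -o json` and loaded into a dict, the helper names (`parse_memory`, `memory_limits`, `undersized`) are hypothetical, and only the common binary and decimal suffixes of Kubernetes memory quantities are handled. The sample pod mirrors the limits reported in this issue.

```python
import re

# Subset of Kubernetes quantity suffixes (assumption: enough for memory values here).
_UNITS = {
    "": 1,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '2Gi' or '512Mi' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([A-Za-z]*)", quantity.strip())
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def memory_limits(pod: dict) -> dict:
    """Map each container name to its memory limit string (None if unset)."""
    return {
        c["name"]: c.get("resources", {}).get("limits", {}).get("memory")
        for c in pod["spec"]["containers"]
    }


def undersized(pod: dict, expected: dict) -> dict:
    """Return {name: (actual, expected)} for containers below the expected limit."""
    actual = memory_limits(pod)
    problems = {}
    for name, want in expected.items():
        got = actual.get(name)
        if got is None or parse_memory(got) < parse_memory(want):
            problems[name] = (got, want)
    return problems


# Sample pod spec mirroring the limits reported in this issue.
sample_pod = {
    "spec": {
        "containers": [
            {"name": "orchestrator", "resources": {"limits": {"memory": "1Gi"}}},
            {"name": "source", "resources": {"limits": {"memory": "32Gi"}}},
            {"name": "destination", "resources": {"limits": {"memory": "32Gi"}}},
        ]
    }
}

if __name__ == "__main__":
    # Flags the orchestrator, which has 1Gi where 2Gi was configured.
    print(undersized(sample_pod, {"orchestrator": "2Gi", "source": "32Gi"}))
```

With real data, `sample_pod` would be replaced by `json.load(...)` over the exported pod spec, and the expected values by whatever the Helm values are supposed to produce.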
Impact
This causes OOM crashes on CDC syncs with large binlog gaps:
java.lang.OutOfMemoryError: Java heap space
Terminating due to java.lang.OutOfMemoryError: Java heap space
The sync fails after processing only 4 records, even though source/destination have 32Gi available.
Workarounds Attempted
- ✅ connection.resource_requirements in database → Works for source/destination, NOT for orchestrator
- ❌ Helm values containerOrchestrator.memory → Ignored
- ❌ CONTAINER_ORCHESTRATOR_MEMORY_LIMIT env var → Ignored
- ❌ REPLICATION_ORCHESTRATOR_MEMORY_LIMIT env var → Ignored
Related Issues
- #68162 - workload resource requests and limits not applied for replication jobs
- #42921 - Sync jobs do not appear to respect CPU / memory requests and limits
Logs
INFO debezium-engine BaseSourceTask(logStatistics):354 4 records sent during previous 00:00:16.555,
last recorded offset of {server=gestcom} partition is {ts_sec=1770182352, file=mysql-bin.004401, pos=272435933}
# java.lang.OutOfMemoryError: Java heap space
Terminating due to java.lang.OutOfMemoryError: Java heap space
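To confirm from logs alone how little progress a sync made before the crash, the Debezium statistics line can be parsed for its record count and interval. This is a rough sketch that assumes the line format shown above ("N records sent during previous HH:MM:SS.mmm"); the function name `parse_stats` is hypothetical and the pattern may need adjusting for other connector versions.

```python
import re
from datetime import timedelta

# Matches e.g. "4 records sent during previous 00:00:16.555"
STATS_RE = re.compile(
    r"(\d+) records sent during previous (\d+):(\d{2}):(\d{2})\.(\d{3})"
)


def parse_stats(line: str):
    """Return (record_count, interval) from a statistics line, or None if absent."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    count, hours, minutes, seconds, millis = (int(g) for g in match.groups())
    interval = timedelta(
        hours=hours, minutes=minutes, seconds=seconds, milliseconds=millis
    )
    return count, interval


if __name__ == "__main__":
    line = (
        "INFO debezium-engine BaseSourceTask(logStatistics):354 "
        "4 records sent during previous 00:00:16.555,"
    )
    print(parse_stats(line))
```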
Request
Please provide a way to configure the orchestrator container memory, either:
- Fix the existing configuration options to be respected
- Add support in connection.resource_requirements for orchestrator
- Document the correct way to configure orchestrator resources
This is blocking production CDC pipelines for critical business data.
Relevant log output
Internal Tracking: https://github.com/airbytehq/oncall/issues/11158