[BUG] Ensure AutoTuner generates functional configuration files #1489

Open
parthosa opened this issue Jan 7, 2025 · 0 comments
Labels: bug (Something isn't working), core_tools (Scope the core module (scala))

parthosa commented Jan 7, 2025

Describe the bug
The AutoTuner currently generates a configuration file (rapids_4_spark_qualification_output/tuning/app_xxx.conf) that combines existing application execution configurations with Spark RAPIDS configurations recommended by the AutoTuner. This file is intended to be directly usable for subsequent job runs. However, several issues in the generated configuration file prevent it from being used as-is.

Some of the issues identified (a sketch showing how such entries could be filtered out follows the list):

  1. Configurations with redacted values that are not functional:
--conf spark.databricks.cloudfetch.requestDownloadUrlsWithHeaders=*********(redacted)
--conf spark.databricks.cloudfetch.requesterClassName=*********(redacted)
  2. Configurations specific to the original execution, which are unnecessary or invalid for future runs:
--conf spark.app.startTime=1666840921589
--conf spark.driver.appUIAddress=<dynamic_ip_address>:<dynamic_port>
--conf spark.driver.host=<dynamic_ip_address>
--conf spark.driver.port=<dynamic_port>
  3. If the Spark RAPIDS JAR is unavailable, the AutoTuner adds a comment about its absence but does not provide a valid JAR path in the generated configuration:
--conf spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin
  4. Similarly, the AutoTuner comments on missing configurations but does not add them to the generated file with the recommended values:
- 'spark.rapids.memory.pinnedPool.size' should be set to 2048m.
- Cannot recommend RAPIDS Shuffle Manager for unsupported Spark version: '3.1.3'.
  To enable RAPIDS Shuffle Manager, use a supported Spark version (e.g., '3.5.1')
  and set: '--conf spark.shuffle.manager=com.nvidia.spark.rapids.spark351.RapidsShuffleManager'.
  See supported versions: https://docs.nvidia.com/spark-rapids/user-guide/latest/additional-functionality/rapids-shuffle.html#rapids-shuffle-manager.
  5. Some configurations are set by the CSP environment without the user's involvement:
--conf spark.dataproc.sql.joinConditionReorder.enabled=true
--conf spark.dataproc.sql.local.rank.pushdown.enabled=true
--conf spark.executorEnv.PYTHONPATH={{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.10.9-src.zip
--conf spark.metrics.namespace=app_name:${spark.app.name}.app_id:${spark.app.id}
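
As a rough illustration of the fix, the sketch below (Scala, matching the core module) filters out redacted and non-portable entries before the tuning file is written. The object name, the prefix list, and the sample input are hypothetical and only cover the examples above; they are not the actual AutoTuner code.

object TuningConfFilter {

  // Execution-specific or CSP-managed keys that should not be carried over
  // to a new run (assumed, non-exhaustive list based on the examples above).
  private val nonPortablePrefixes = Seq(
    "spark.app.startTime",
    "spark.driver.appUIAddress",
    "spark.driver.host",
    "spark.driver.port",
    "spark.dataproc.",
    "spark.executorEnv.PYTHONPATH",
    "spark.metrics.namespace"
  )

  // Redacted values are rendered with a "(redacted)" suffix in the event log.
  private def isRedacted(value: String): Boolean = value.endsWith("(redacted)")

  def isPortable(key: String, value: String): Boolean =
    !isRedacted(value) && !nonPortablePrefixes.exists(key.startsWith)

  def main(args: Array[String]): Unit = {
    // Hypothetical sample of combined app + recommended configurations.
    val combinedConfs = Seq(
      "spark.databricks.cloudfetch.requesterClassName" -> "*********(redacted)",
      "spark.app.startTime" -> "1666840921589",
      "spark.rapids.memory.pinnedPool.size" -> "2048m"
    )
    // Only spark.rapids.memory.pinnedPool.size survives the filter here.
    combinedConfs.collect {
      case (k, v) if isPortable(k, v) => s"--conf $k=$v"
    }.foreach(println)
  }
}

A prefix-based deny list keeps the filter conservative; a real fix would likely also promote recommended-but-commented settings (issue 4 above) into actual entries.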

Expected behavior
The AutoTuner should generate a clean, functional configuration file that is ready for direct use with the Spark RAPIDS plugin: redacted and execution-specific entries should be dropped, and recommended values should be emitted as real --conf entries rather than comments.
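
For illustration, a functional file for the examples above might contain only portable settings such as the following (the JAR path is a placeholder; the pinned pool size and shuffle manager values come from the recommendations quoted above):

--conf spark.jars=/path/to/rapids-4-spark_2.12.jar
--conf spark.plugins=com.nvidia.spark.SQLPlugin
--conf spark.rapids.memory.pinnedPool.size=2048m
--conf spark.shuffle.manager=com.nvidia.spark.rapids.spark351.RapidsShuffleManager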
