
feat: add tool to check if GPUs are available for training #102

Open
wants to merge 17 commits into base: main

Conversation

@vaibs-d (Contributor) commented Apr 20, 2025

This PR adds tools for the MLE agent to assess whether GPUs are available, either via Ray or locally, for use in model training.

@vaibs-d changed the title from "Add gpu support" to "feat: add tool to check if GPUs are available for training" on Apr 20, 2025
@vaibs-d marked this pull request as ready for review on April 21, 2025 03:35

@cubic-dev-ai (bot) left a comment

mrge found 5 issues across 6 files. View them in mrge.io

@marcellodebernardi (Contributor) left a comment

Nice feature @vaibs-d! I have a couple of questions about how the new tools plug into the agentic workflow; see the inline comments. Otherwise, excited to get this feature merged 🚀

-def get_executor_tool(distributed: bool = False) -> Callable:
-    """Get the appropriate executor tool based on the distributed flag."""
+def get_executor_tool() -> Callable:
+    """Get the executor tool for training code execution."""

We actually don't need this to be wrapped in a closure anymore, since we're not passing any parameters to it.
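
For illustration, the simplification could look roughly like the sketch below. The body and the execute_training_code name are hypothetical (the real implementation isn't visible in this hunk), and the @tool decorator is assumed to be LangChain-style:

from typing import Callable
from langchain_core.tools import tool  # assumption: LangChain-style @tool

# Before (sketch): a factory returning the tool via a closure
def get_executor_tool() -> Callable:
    @tool
    def execute_training_code(code: str) -> str:
        """Execute generated training code and return its output."""
        ...  # hypothetical body
    return execute_training_code

# After (sketch): define the tool at module level, no factory needed
@tool
def execute_training_code(code: str) -> str:
    """Execute generated training code and return its output."""
    ...  # hypothetical body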

@tool
def get_gpu_info() -> dict:
    """
    Get available GPU information for code generation.

I have two questions about this tool:

  1. It doesn't seem to be added to the list of tools for any of the agents, unless I'm missing something?
  2. I'm not sure how the agent is expected to use this tool. Is the idea that the agent would use it to check whether different ML frameworks have access to a GPU?

I think I see a couple of issues here:

  1. The docstring may be a little too terse for an agent to reliably understand what this is for (see the sketch after this list).
  2. This should be added to an agent's tools, and the agent's prompts should probably also be modified to suggest using it.
  3. This will check for GPU availability on the local compute instance; if we're using a Ray cluster, this would not check for the presence of GPUs in the Ray cluster.
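
On points 1 and 3, a rough sketch of what a more explicit, agent-facing docstring could look like is below. The return keys, the use of PyTorch for the local check, and the LangChain-style @tool import are assumptions for illustration, not the PR's actual implementation:

from langchain_core.tools import tool  # assumption: LangChain-style @tool

@tool
def get_gpu_info() -> dict:
    """Report GPUs available on the LOCAL machine only.

    Call this before generating training code that will run on the local
    compute instance, so the code can target CUDA devices or fall back to
    CPU. This does NOT inspect a Ray cluster; use get_ray_info for that.

    Returns a dict with:
      - "cuda_available": bool
      - "device_count": int
      - "devices": list of GPU device names
    """
    import torch  # assumption: PyTorch is the GPU-detection backend

    if not torch.cuda.is_available():
        return {"cuda_available": False, "device_count": 0, "devices": []}
    count = torch.cuda.device_count()
    return {
        "cuda_available": True,
        "device_count": count,
        "devices": [torch.cuda.get_device_name(i) for i in range(count)],
    }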

@tool
def get_ray_info() -> dict:
    """
    Get Ray cluster information including GPU availability.

As with the tool above, I'm not entirely clear on how an agent should use this tool. The docstring probably doesn't give an agent enough context about what this tool is really "for" and how it fits into the broader workflow.

At least with GPT-4o and Claude 3.7, I've seen that tool docstrings need to be very unambiguous for the agent to use the tool reliably and correctly.
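
As a concrete example, the docstring could spell out when the agent should prefer this tool over get_gpu_info and what the returned fields mean. A rough sketch, assuming Ray is installed and a LangChain-style @tool decorator; the key names are placeholders rather than the PR's actual return format:

import ray
from langchain_core.tools import tool  # assumption: LangChain-style @tool

@tool
def get_ray_info() -> dict:
    """Report GPU availability in the CONNECTED Ray cluster.

    Call this (instead of get_gpu_info) when training will run on a Ray
    cluster, so generated code can request the right number of GPUs.

    Returns a dict with:
      - "ray_available": bool
      - "total_gpus": float, GPUs registered across the cluster
      - "available_gpus": float, GPUs currently unclaimed
    """
    if not ray.is_initialized():
        return {"ray_available": False, "total_gpus": 0.0, "available_gpus": 0.0}
    return {
        "ray_available": True,
        "total_gpus": ray.cluster_resources().get("GPU", 0.0),
        "available_gpus": ray.available_resources().get("GPU", 0.0),
    }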
