eval generator #2

Kvadratni · 2025-03-14T20:43:33Z

This pull request adds automatic evaluation generation to the ai_migrate tool: after a successful migration, it creates evaluation test cases from the result.

DOsinga

Nice. I'll let @jamadeo do the LGTM; here are some random comments.

DOsinga · 2025-03-15T19:34:05Z

src/ai_migrate/eval_generator.py

+        info.pr_number = url.split("/pull/")[-1]
+
+    if info.pr_number:
+        if not info.pr_number.isdigit():


DOsinga · 2025-03-15T19:36:14Z

src/ai_migrate/migrate.py

    old_dirs = [
        d for d in examples_dir.iterdir() if d.is_dir() and d.name.endswith(".old")
    ]

    for old_dir in old_dirs:
-        # Construct the corresponding .new directory name
-        base_name = old_dir.name[:-4]  # remove .old
+        base_name = old_dir.name[:-4]


Use .removesuffix(".old") here.
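For reference, a minimal sketch of the difference between slicing and `str.removesuffix` (available from Python 3.9). The helper name `base_name` and the sample directory names are made up for illustration and are not taken from migrate.py:

```python
def base_name(dir_name: str) -> str:
    """Strip a trailing ".old" from a directory name, if present (Python 3.9+)."""
    return dir_name.removesuffix(".old")


print(base_name("billing.old"))  # billing
print(base_name("billing.new"))  # billing.new (left alone, no ".old" suffix)
print("billing.new"[:-4])        # billing (slicing drops any last four characters)
```

In this loop the `endswith(".old")` filter already guarantees the suffix, so both forms give the same result; `removesuffix` mainly states the intent and avoids the magic number.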

DOsinga · 2025-03-15T19:36:50Z

src/ai_migrate/migrate.py

@@ -366,7 +362,22 @@ async def run(
    log_stream,
    local_worktrees,
    llm_fakes,
+    target_dir: str = "",
+    target_basename: str = "",
+    dont_create_evals=False,


might as well type the rest too then

DOsinga · 2025-03-15T19:38:17Z

src/ai_migrate/migrate.py

@@ -378,6 +389,10 @@ async def run(
    start_point = (
        await subprocess_run(["git", "rev-parse", "HEAD"], check=True)
    ).strip()
+
+    # Define target_dir_rel_path
+    target_dir_rel_path = None


Hmm, is this set anywhere to something other than None? Should it be what's in 397?

DOsinga · 2025-03-15T19:42:24Z

src/ai_migrate/migrate.py

+        dont_create_evals,
+        target_dir=target_dir,
+        target_dir_rel_path=target_dir_rel_path,
+        target_basename=target_basename,


These don't seem to be used as far as I can tell

DOsinga · 2025-03-15T19:43:21Z

src/ai_migrate/test_eval_generator.py

@@ -0,0 +1,301 @@
+import tempfile


Can you remove the AI-provided comments here that don't add value?

Forgot to purge the test files.

jamadeo

+1 to Douwe's notes but also there's something we should take a closer look at in pr_utils.py

jamadeo · 2025-03-17T13:58:33Z

src/ai_migrate/pr_utils.py

        if base:
            return f"// Original content of {file_path}\n// This is a placeholder for demonstration purposes"
        else:
            return f"// Modified content of {file_path}\n// This is a placeholder for demonstration purposes"
    except Exception as e:
        print(f"Approach 3 failed: {e}")

-    # If all approaches fail, return a minimal placeholder


Probably should have caught this earlier, since I see it's not part of this diff, but this "Approach 3" doesn't make much sense. I don't think you'd ever want placeholders for content; you'd just want to fail loudly. You'd also never need to put this particular code in a try/except.

yeah i'll drop this
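A rough sketch of what "fail loudly" could look like once the placeholder branch is gone. Everything here is an assumption for illustration: `FileContentError`, `get_file_content`, and the `fetchers` list are hypothetical names, not the actual pr_utils.py API.

```python
class FileContentError(RuntimeError):
    """Raised when no approach could fetch a file's content (hypothetical name)."""


def get_file_content(file_path: str, fetchers) -> str:
    """Try each fetcher in order; raise if none of them succeeds.

    `fetchers` is a hypothetical list of callables that take a path and
    return its content, standing in for the approaches in pr_utils.py.
    """
    errors = []
    for fetch in fetchers:
        try:
            return fetch(file_path)
        except Exception as exc:
            # Record the failure and move on to the next approach.
            errors.append(f"{getattr(fetch, '__name__', repr(fetch))}: {exc}")
    # No placeholder fallback: surface every failure to the caller.
    raise FileContentError(
        f"Could not fetch {file_path}; all approaches failed: " + "; ".join(errors)
    )
```

The caller then decides whether to skip the file or abort the run, and an eval is never generated from made-up content.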

eval generator

6cc50f0

DOsinga reviewed Mar 15, 2025


jamadeo approved these changes Mar 17, 2025
