Changes from 12 commits
20 commits
a063b27
feat: add single-task repeat mode with --repeat and --continue-on-fai…
KLIEBHAN Mar 9, 2026
837bf8f
fix: adjust codex engine for repeat mode compatibility
KLIEBHAN Mar 9, 2026
39a8303
feat: align bash CLI parity with npm CLI
KLIEBHAN Mar 9, 2026
e984745
docs: update README and add example PRD/tasks files
KLIEBHAN Mar 9, 2026
05ade13
refactor: simplify args parsing and deduplicate task error handling
KLIEBHAN Mar 9, 2026
45bff6d
fix: address PR review feedback
KLIEBHAN Mar 9, 2026
a0bf467
fix: add --repeat upper bound and bash CLI parity warning
KLIEBHAN Mar 10, 2026
e6a3e22
fix: enforce --repeat upper bound in bash CLI and add boundary test
KLIEBHAN Mar 10, 2026
a5f2cdd
fix: address PR review feedback
KLIEBHAN Mar 10, 2026
71f5ca2
fix: show skipped count in repeat loop summary
KLIEBHAN Mar 10, 2026
1bc3373
fix: show skipped count in bash repeat loop summary
KLIEBHAN Mar 10, 2026
a20b9c5
fix: add warning test and forward --model to all codex invocations
KLIEBHAN Mar 10, 2026
f387ef0
fix: show skipped count in desktop notification for repeat mode
KLIEBHAN Mar 11, 2026
44f6a83
fix: run dry-run only once regardless of --repeat count
KLIEBHAN Mar 11, 2026
baf3f56
fix: add dry-run guard to bash brownfield task execution
KLIEBHAN Mar 11, 2026
4e05f85
fix: reject scientific notation in --repeat and add dry-run test
KLIEBHAN Mar 13, 2026
f7474df
fix: disable browser automation by default (opt-in via --browser)
KLIEBHAN Mar 15, 2026
76a326d
fix: align bash browser default with TS CLI (remove auto mode)
KLIEBHAN Mar 16, 2026
a77835b
fix: suppress streaming output in repeat mode for cleaner terminal
KLIEBHAN Mar 16, 2026
66ceb9f
fix: log failed progress on engine unavailability and reorder validation
KLIEBHAN Mar 16, 2026
27 changes: 27 additions & 0 deletions PRD.example.md
@@ -0,0 +1,27 @@
# Example Project PRD

A simple example project to demonstrate Ralphy.

## Context

We are building a small CLI application that converts Markdown files to HTML.

## Tasks

- [ ] Project setup: create package.json with TypeScript and Vitest
- [ ] Write a function that converts Markdown headings (#, ##, ###) into HTML tags
- [ ] Write a function that converts **bold** and *italic* text
- [ ] Write a function that converts lists (- item) into HTML lists
- [ ] Create a CLI entry point that reads a file and converts it
- [ ] Write README.md with usage instructions

## Acceptance Criteria

- All tests green
- TypeScript compiles without errors
- CLI can be run with `npx ts-node src/index.ts input.md`

## Notes

- Do not use external Markdown libraries (learning purpose)
- Simple regex-based conversion is sufficient
7 changes: 6 additions & 1 deletion README.md
@@ -28,14 +28,16 @@ cd ralphy && chmod +x ralphy.sh
 ./ralphy.sh --prd PRD.md
 ```
 
-Both versions have identical features. Examples below use `ralphy` (npm) - substitute `./ralphy.sh` if using the bash script.
+Examples below use `ralphy` (npm). Most commands also work with `./ralphy.sh`, but newer npm CLI features may land there first.
 
 ## Two Modes
 
 **Single task** - just tell it what to do:
 ```bash
 ralphy "add dark mode"
 ralphy "fix the auth bug"
+ralphy --repeat 3 "find and fix bugs"
+ralphy --repeat 5 --continue-on-failure "harden edge cases"
 ```
 
 **Task list** - work through a PRD:
@@ -319,6 +321,8 @@ ralphy --parallel --sandbox
 | `--max-retries N` | retries per task (default: 3) |
 | `--retry-delay N` | seconds between retries |
 | `--dry-run` | preview only |
+| `--repeat N` | repeat a single task N times (requires task argument) |
+| `--continue-on-failure` | in repeat mode, continue after non-fatal task failures |
 | `--browser` | enable browser automation |
 | `--no-browser` | disable browser automation |
 | `-v, --verbose` | debug output |
@@ -363,6 +367,7 @@ When an engine exits non-zero, ralphy includes the last lines of CLI output in t
 ## Changelog
 
 ### v4.7.2
+- **Single-task repeat mode**: added `--repeat <n>` with `--continue-on-failure` and fail-fast defaults; fatal errors still abort immediately
 - **Improved auth error detection**: simplified `extractAuthenticationError` function with better edge case handling (e.g., JSON dumps during login)
 - **Added project standards**: `CLAUDE.md`, `.cursorrules`, `CONTRIBUTING.md` for consistent AI-assisted development
 - **Enhanced default prompts**: enforce concise, focused code changes
4 changes: 4 additions & 0 deletions cli/README.md
@@ -24,6 +24,8 @@ ralphy --prd PRD.md
 ```bash
 ralphy "add dark mode"
 ralphy "fix the auth bug"
+ralphy --repeat 3 "find and fix bugs"
+ralphy --repeat 5 --continue-on-failure "harden edge cases"
 ```
 
 **Task list** - work through a PRD:
@@ -307,6 +309,8 @@ ralphy --parallel --sandbox
 | `--max-retries N` | retries per task (default: 3) |
 | `--retry-delay N` | seconds between retries |
 | `--dry-run` | preview only |
+| `--repeat N` | repeat a single task N times (requires task argument) |
+| `--continue-on-failure` | in repeat mode, continue after non-fatal task failures |
 | `--browser` | enable browser automation |
 | `--no-browser` | disable browser automation |
 | `-v, --verbose` | debug output |
94 changes: 94 additions & 0 deletions cli/src/cli/__tests__/args.test.ts
@@ -0,0 +1,94 @@
import { beforeAll, describe, expect, it, mock, spyOn } from "bun:test";

let parseArgs: typeof import("../args.ts").parseArgs;

beforeAll(async () => {
	mock.module("../../version.ts", () => ({
		VERSION: "test",
	}));
	({ parseArgs } = await import("../args.ts"));
});

function parseCliArgs(args: string[]) {
	return parseArgs(["bun", "ralphy", ...args]);
}

describe("parseArgs repeat options", () => {
	it("parses --repeat 5 with task", () => {
		const { options, task } = parseCliArgs(["--repeat", "5", "do something"]);
		expect(task).toBe("do something");
		expect(options.repeatCount).toBe(5);
		expect(options.continueOnFailure).toBe(false);
	});

	it("throws on --repeat 0", () => {
		expect(() => parseCliArgs(["--repeat", "0", "task"])).toThrow(
			"--repeat must be an integer between 1 and 10000",
		);
	});

	it("throws on --repeat -1", () => {
		expect(() => parseCliArgs(["--repeat", "-1", "task"])).toThrow(
			"--repeat must be an integer between 1 and 10000",
		);
	});

	it("throws on --repeat abc", () => {
		expect(() => parseCliArgs(["--repeat", "abc", "task"])).toThrow(
			"--repeat must be an integer between 1 and 10000",
		);
	});

	it("throws on --repeat 1.5", () => {
		expect(() => parseCliArgs(["--repeat", "1.5", "task"])).toThrow(
			"--repeat must be an integer between 1 and 10000",
		);
	});

	it("throws on --repeat 10001", () => {
		expect(() => parseCliArgs(["--repeat", "10001", "task"])).toThrow(
			"--repeat must be an integer between 1 and 10000",
		);
	});

	it("parses --repeat with --continue-on-failure", () => {
		const { options } = parseCliArgs(["--repeat", "3", "--continue-on-failure", "task"]);
		expect(options.repeatCount).toBe(3);
		expect(options.continueOnFailure).toBe(true);
	});

	it("throws when --repeat is used without task", () => {
		expect(() => parseCliArgs(["--repeat", "3"])).toThrow(
			"--repeat and --continue-on-failure require a task argument",
		);
	});

	it("throws when --continue-on-failure is used without task", () => {
		expect(() => parseCliArgs(["--continue-on-failure"])).toThrow(
			"--repeat and --continue-on-failure require a task argument",
		);
	});

	it("warns when --continue-on-failure is used without --repeat but with a task", () => {
		const warnSpy = spyOn(console, "warn");
		const { options } = parseCliArgs(["--continue-on-failure", "do something"]);
		expect(options.continueOnFailure).toBe(true);
		expect(options.repeatCount).toBe(1);
		expect(warnSpy).toHaveBeenCalledWith(
			"Warning: --continue-on-failure has no effect without --repeat",
		);
		warnSpy.mockRestore();
	});

	it("throws when repeat options are combined with task source flags", () => {
		expect(() => parseCliArgs(["--repeat", "3", "--yaml", "tasks.yaml", "task"])).toThrow(
			"--repeat and --continue-on-failure cannot be used with --prd, --yaml, --json, or --github",
		);
	});

	it("defaults to repeatCount 1", () => {
		const { options } = parseCliArgs(["task"]);
		expect(options.repeatCount).toBe(1);
		expect(options.continueOnFailure).toBe(false);
	});
});
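
The "task source flags" test above depends on scanning the raw argument list rather than the parsed options. The sketch below isolates that matching rule so its edge cases are easy to see. The function name `hasExplicitTaskSourceFlag` (as a standalone function) and the constant `TASK_SOURCE_FLAGS` are illustrative assumptions, not the PR's exports; the matching expression itself mirrors the one in `args.ts`.

```typescript
// Hypothetical standalone version of the raw-argv check; names are assumptions.
const TASK_SOURCE_FLAGS = ["--prd", "--yaml", "--json", "--github"] as const;

function hasExplicitTaskSourceFlag(argv: readonly string[]): boolean {
	// Match either the bare flag ("--yaml") or the inline-value form ("--yaml=tasks.yaml").
	return argv.some((arg) =>
		TASK_SOURCE_FLAGS.some((flag) => arg === flag || arg.startsWith(`${flag}=`)),
	);
}

// Bare flag followed by its value as a separate argument.
console.log(hasExplicitTaskSourceFlag(["--repeat", "3", "--yaml", "tasks.yaml", "task"]));
// Inline-value form.
console.log(hasExplicitTaskSourceFlag(["--prd=PRD.md"]));
// A longer flag sharing a prefix is not treated as a task source flag.
console.log(hasExplicitTaskSourceFlag(["--github-label", "bug", "task"]));
```

Because the comparison is exact-or-`=`-prefixed, an argument such as `--github-label` does not trip the check, while both `--yaml tasks.yaml` and `--prd=PRD.md` do.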
40 changes: 38 additions & 2 deletions cli/src/cli/args.ts
@@ -31,6 +31,8 @@ export function createProgram(): Command {
 		.option("--copilot", "Use GitHub Copilot")
 		.option("--gemini", "Use Gemini CLI")
 		.option("--dry-run", "Show what would be done without executing")
+		.option("--repeat <n>", "Repeat single task N times")
+		.option("--continue-on-failure", "Continue repeat loop on task failure")
 		.option("--max-iterations <n>", "Maximum iterations (0 = unlimited)", "0")
 		.option("--max-retries <n>", "Maximum retries per task", "3")
 		.option("--retry-delay <n>", "Delay between retries in seconds", "5")
@@ -62,6 +64,12 @@ export function createProgram(): Command {
 	return program;
 }
 
+function resolveBrowserEnabled(flag: boolean | undefined): "auto" | "true" | "false" {
+	if (flag === true) return "true";
+	if (flag === false) return "false";
+	return "auto";
+}
+
 /**
  * Parse command line arguments into RuntimeOptions
  */
@@ -88,7 +96,33 @@ export function parseArgs(args: string[]): {
 	const opts = program.opts();
 	const [task] = program.args;
 
-	// Determine AI engine (--sonnet implies --claude)
+	// --prd has a commander default, so opts.prd alone cannot detect explicit usage
+	const taskSourceFlags = ["--prd", "--yaml", "--json", "--github"];
+	const hasExplicitTaskSourceFlag = ralphyArgs.some((arg) =>
+		taskSourceFlags.some((flag) => arg === flag || arg.startsWith(`${flag}=`)),
+	);
+
+	const repeatProvided = opts.repeat !== undefined;
+	const repeatCount = repeatProvided ? Number(opts.repeat) : 1;
+	if (repeatProvided && (!Number.isInteger(repeatCount) || repeatCount < 1 || repeatCount > 10_000)) {
+		throw new Error("--repeat must be an integer between 1 and 10000");
+	}
+
+	const continueOnFailure = opts.continueOnFailure || false;
+	if (continueOnFailure && !repeatProvided && task) {
+		console.warn("Warning: --continue-on-failure has no effect without --repeat");
+	}
+	const hasRepeatOptions = repeatProvided || continueOnFailure;
+	if (hasRepeatOptions && !task) {
+		throw new Error("--repeat and --continue-on-failure require a task argument");
+	}
+	if (hasRepeatOptions && hasExplicitTaskSourceFlag) {
+		throw new Error(
+			"--repeat and --continue-on-failure cannot be used with --prd, --yaml, --json, or --github",
+		);
+	}
+
+	// --sonnet implies --claude and takes priority over other engine flags
 	let aiEngine = "claude";
 	if (opts.sonnet) aiEngine = "claude";
 	else if (opts.opencode) aiEngine = "opencode";
@@ -140,6 +174,8 @@ export function parseArgs(args: string[]): {
 		maxIterations: Number.parseInt(opts.maxIterations, 10) || 0,
 		maxRetries: Number.parseInt(opts.maxRetries, 10) || 3,
 		retryDelay: Number.parseInt(opts.retryDelay, 10) || 5,
+		repeatCount,
+		continueOnFailure,
 		verbose: opts.verbose || false,
 		branchPerTask: opts.branchPerTask || false,
 		baseBranch: opts.baseBranch || "",
@@ -154,7 +190,7 @@ export function parseArgs(args: string[]): {
 		githubLabel: opts.githubLabel || "",
 		syncIssue: opts.syncIssue ? Number.parseInt(opts.syncIssue, 10) || undefined : undefined,
 		autoCommit: opts.commit !== false,
-		browserEnabled: opts.browser === true ? "true" : opts.browser === false ? "false" : "auto",
+		browserEnabled: resolveBrowserEnabled(opts.browser),
 		modelOverride,
 		skipMerge: opts.merge === false,
 		useSandbox: opts.sandbox || false,
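
The `--repeat` check in this diff converts the value with `Number()`, which also accepts inputs such as `"1e3"` (1000) and `"0x10"` (16); a later commit in this PR (4e05f85) is titled "reject scientific notation in --repeat". The sketch below shows one way such a stricter check could look. It is not taken from that commit: the helper name `parseRepeatCount` and the constant `MAX_REPEAT` are assumptions for illustration, and only the error message and the 1 to 10000 range come from the code above.

```typescript
// Hypothetical helper (not from the PR): strict parsing of a --repeat value.
const MAX_REPEAT = 10_000;
const REPEAT_ERROR = "--repeat must be an integer between 1 and 10000";

function parseRepeatCount(raw: string | undefined): number {
	// Flag not provided: run the task once.
	if (raw === undefined) return 1;

	// Accept plain decimal digits only. This rejects "1e3", "0x10", "1.5", "-1", and "".
	if (!/^\d+$/.test(raw)) {
		throw new Error(REPEAT_ERROR);
	}

	const count = Number(raw);
	if (count < 1 || count > MAX_REPEAT) {
		throw new Error(REPEAT_ERROR);
	}
	return count;
}

console.log(parseRepeatCount("5"));
console.log(parseRepeatCount(undefined));
```

Checking the string shape before converting keeps the accepted grammar explicit, so the range check only ever sees values that were written as plain integers.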
80 changes: 80 additions & 0 deletions cli/src/cli/commands/single-task-loop.test.ts
@@ -0,0 +1,80 @@
import { describe, expect, it } from "bun:test";
import { DEFAULT_OPTIONS } from "../../config/types.ts";
import { runSingleTaskLoop } from "./single-task-loop.ts";

describe("runSingleTaskLoop", () => {
	it("stops on first non-fatal failure in fail-fast mode", async () => {
		let calls = 0;
		const result = await runSingleTaskLoop(
			"task",
			{
				...DEFAULT_OPTIONS,
				repeatCount: 3,
				continueOnFailure: false,
			},
			{
				runTaskFn: async () => {
					calls++;
					return { success: false, fatal: false, error: "boom" };
				},
				logInfoFn: () => {},
			},
		);

		expect(calls).toBe(1);
		expect(result.completed).toBe(0);
		expect(result.failed).toBe(1);
		expect(result.total).toBe(3);
	});

	it("continues on non-fatal failures when continue-on-failure is enabled", async () => {
		let call = 0;
		const sequence = [
			{ success: false, fatal: false, error: "first" },
			{ success: true, fatal: false },
			{ success: false, fatal: false, error: "last" },
		] as const;

		const result = await runSingleTaskLoop(
			"task",
			{
				...DEFAULT_OPTIONS,
				repeatCount: 3,
				continueOnFailure: true,
			},
			{
				runTaskFn: async () => sequence[call++] ?? sequence[sequence.length - 1],
				logInfoFn: () => {},
			},
		);

		expect(call).toBe(3);
		expect(result.completed).toBe(1);
		expect(result.failed).toBe(2);
		expect(result.total).toBe(3);
	});

	it("always stops on fatal failures", async () => {
		let calls = 0;
		const result = await runSingleTaskLoop(
			"task",
			{
				...DEFAULT_OPTIONS,
				repeatCount: 5,
				continueOnFailure: true,
			},
			{
				runTaskFn: async () => {
					calls++;
					return { success: false, fatal: true, error: "auth failed" };
				},
				logInfoFn: () => {},
			},
		);

		expect(calls).toBe(1);
		expect(result.completed).toBe(0);
		expect(result.failed).toBe(1);
		expect(result.total).toBe(5);
	});
});
57 changes: 57 additions & 0 deletions cli/src/cli/commands/single-task-loop.ts
@@ -0,0 +1,57 @@
import type { RuntimeOptions } from "../../config/types.ts";
import { logInfo } from "../../ui/logger.ts";
import { type TaskRunResult, runTask } from "./task.ts";

type TaskRunner = (task: string, options: RuntimeOptions) => Promise<TaskRunResult>;
type InfoLogger = (message: string) => void;

export interface SingleTaskLoopResult {
	total: number;
	completed: number;
	failed: number;
Comment on lines +8 to +11

**`skipped` missing from `SingleTaskLoopResult` interface**

The `skipped` count is computed inside the function body (line 52: `const skipped = total - completed - failed`) and is also re-derived by callers: `index.ts` repeats `result.total - result.completed - result.failed` twice. Keeping the formula in several independent places is fragile: if the semantics of `skipped` ever change, every site needs updating.

Consider adding `skipped` directly to the interface so callers can use it without re-deriving it:

```suggestion
export interface SingleTaskLoopResult {
	total: number;
	completed: number;
	failed: number;
	skipped: number;
}
```

Then populate it in the return statement (`skipped: total - completed - failed`) and remove the duplicate computation in `index.ts`.

}

/**
 * Run the single-task flow with optional repeat behavior.
 */
export async function runSingleTaskLoop(
	task: string,
	options: RuntimeOptions,
	deps?: {
		runTaskFn?: TaskRunner;
		logInfoFn?: InfoLogger;
	},
): Promise<SingleTaskLoopResult> {
	const runTaskFn = deps?.runTaskFn ?? runTask;
	const logInfoFn = deps?.logInfoFn ?? logInfo;

	const total = options.repeatCount;
	let completed = 0;
	let failed = 0;

	for (let i = 1; i <= total; i++) {
		if (total > 1) {
			logInfoFn(`[${i}/${total}] Executing: ${task}`);
		}

		const result = await runTaskFn(task, options);
		if (result.success) {
			completed++;
			continue;
		}

		failed++;
		if (result.fatal || !options.continueOnFailure) {
			break;
Comment on lines +38 to +52

**Iteration counter combined with the `runTask` banner makes repeat output noisy**

When `total = 1`, no counter is printed (guarded by `total > 1`), which is correct. However, the `logInfoFn` call at line 40 prints `[i/total]` before `runTaskFn` is called. If `runTaskFn` itself also logs a banner (e.g. `"Running task with claude…"` in `task.ts` line 35), the output for `--repeat 3` looks like:

```
[1/3] Executing: find and fix bugs
Running task with claude...       ← emitted inside runTask
...
[2/3] Executing: find and fix bugs
Running task with claude...
```

The inner `logInfo` in `task.ts` fires unconditionally on every iteration. Combined with the iteration counter here, this produces duplicate, noisy output for repeat runs. The "Repetitive per-iteration log noise" concern is still unaddressed in `task.ts`, and the interplay with this counter makes it more visible.

		}
	}

	if (total > 1) {
		const skipped = total - completed - failed;
		const parts = [`${completed} succeeded`, `${failed} failed`];
		if (skipped > 0) parts.push(`${skipped} skipped`);
		logInfoFn(`Done: ${parts.join(", ")} of ${total}`);
	}

	return { total, completed, failed };
}
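
The review comment above suggests returning `skipped` from the loop so that callers do not re-derive it. The following is a minimal, self-contained sketch of that idea, not the code that was eventually merged. The names `runLoopWithSkipped`, `LoopResult`, and `Runner` are hypothetical, and the `TaskRunResult` shape is assumed from the fixtures in the test file (`success`, `fatal`, optional `error`).

```typescript
// Assumed shape, inferred from the test fixtures in this PR.
interface TaskRunResult {
	success: boolean;
	fatal: boolean;
	error?: string;
}

// Hypothetical result type with the reviewer's suggested `skipped` field.
interface LoopResult {
	total: number;
	completed: number;
	failed: number;
	skipped: number;
}

type Runner = (task: string) => Promise<TaskRunResult>;

async function runLoopWithSkipped(
	task: string,
	repeatCount: number,
	continueOnFailure: boolean,
	runTaskFn: Runner,
): Promise<LoopResult> {
	let completed = 0;
	let failed = 0;

	for (let i = 1; i <= repeatCount; i++) {
		const result = await runTaskFn(task);
		if (result.success) {
			completed++;
			continue;
		}
		failed++;
		// Fatal errors always stop the loop; non-fatal ones stop it unless opted out.
		if (result.fatal || !continueOnFailure) break;
	}

	// Derive skipped once, here, so callers never repeat the formula.
	const skipped = repeatCount - completed - failed;
	return { total: repeatCount, completed, failed, skipped };
}

// Fail-fast run: the first failure stops the loop, leaving two iterations skipped.
runLoopWithSkipped("task", 3, false, async () => ({ success: false, fatal: false })).then(
	(result) => console.log(result),
);
```

With the field on the result, a summary line or notification can read `result.skipped` directly, which keeps the definition of "skipped" in one place.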