Skip to content

Add new methods to manage instance server (reinstall, backup create, info, download, delete #437

Open
aliel wants to merge 3 commits intomasterfrom
aliel-add-instance-backup-restore-operations
Open

Add new methods to manage instance server (reinstall, backup create, info, download, delete #437
aliel wants to merge 3 commits intomasterfrom
aliel-add-instance-backup-restore-operations

Conversation

@aliel
Copy link
Member

@aliel aliel commented Feb 21, 2026

@github-actions
Copy link

Failed to retrieve llama text: POST 500: {"error": "500: Unhandled error during initialisation"}

@aliel aliel force-pushed the aliel-add-instance-backup-restore-operations branch from 782e86c to 4bce0b7 Compare February 21, 2026 21:54
@aliel aliel requested a review from odesenfans February 21, 2026 21:54
@codecov
Copy link

codecov bot commented Feb 21, 2026

Codecov Report

❌ Patch coverage is 19.16667% with 194 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.29%. Comparing base (58a1afc) to head (e88dc25).

Files with missing lines Patch % Lines
src/aleph_client/commands/instance/backup.py 15.38% 176 Missing ⚠️
src/aleph_client/commands/instance/__init__.py 43.75% 18 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #437      +/-   ##
==========================================
- Coverage   60.70%   58.29%   -2.42%     
==========================================
  Files          21       22       +1     
  Lines        4057     4287     +230     
  Branches      606      642      +36     
==========================================
+ Hits         2463     2499      +36     
- Misses       1323     1517     +194     
  Partials      271      271              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@aliel aliel force-pushed the aliel-add-instance-backup-restore-operations branch 3 times, most recently from a7c4496 to 2c3f6c7 Compare February 23, 2026 16:56
@aliel aliel force-pushed the aliel-add-instance-backup-restore-operations branch from 2c3f6c7 to e88dc25 Compare February 23, 2026 17:13
Comment on lines +101 to +106
status, result = await manager.create_backup(
vm_id=vm_id,
include_volumes=include_volumes,
skip_fsfreeze=skip_fsfreeze,
run_async=background,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you risk looping forever? Why not call the corresponding GET method here instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, no timeout.

Optional[str],
typer.Option(help="CRN domain on which the VM is running"),
] = None,
chain: Annotated[Optional[Chain], typer.Option(help=help_strings.PAYMENT_CHAIN_USED)] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing metavar=metavar_valid_chains on all appearances of chain in this file.

account: AccountTypes = load_account(private_key, private_key_file, chain=chain)

async with VmClient(account, domain) as manager:
status, result = await manager.delete_backup(vm_id=vm_id, backup_id=backup_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No user confirmation required for this one?

Copy link

@foxpatch-aleph foxpatch-aleph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR adds useful backup/restore/reinstall commands with good UX (progress bars, warnings, confirmation prompts). However, there is one functional bug that makes the --background polling loop create new backups on every poll iteration instead of checking status, and a subtle error-handling bug in reinstall where return 1 silently succeeds from Typer's perspective. There are also no tests for any of the new functionality.

src/aleph_client/commands/instance/backup.py (line 101): Bug: polling loop re-invokes create_backup instead of checking status. Each iteration of the while status == 202 loop calls manager.create_backup(...) again, which will attempt to start a new backup rather than poll the status of the one already running. This should call a status/get endpoint (e.g. manager.get_backup(vm_id=vm_id)) to check whether the in-progress backup has completed.

# Should be something like:
status, result = await manager.get_backup(vm_id=vm_id)

src/aleph_client/commands/instance/__init__.py (line 1170): Bug: return 1 does not set the process exit code in a Typer command. Typer ignores integer return values from command functions; return 1 here exits the function normally with exit code 0. Every other error path in this PR uses raise typer.Exit(1) — use that here for consistency and correctness:

raise typer.Exit(1)

src/aleph_client/commands/instance/backup.py (line 209): Partial output file left on disk after a failed/interrupted download. If the connection drops mid-transfer, output will contain a partial file with no indication to the user. Consider deleting the output file in an error handler, or writing to a temp file and renaming on success:

tmp = output.with_suffix(output.suffix + '.tmp')
try:
    with open(tmp, 'wb') as f:
        ...
    tmp.rename(output)
except:
    tmp.unlink(missing_ok=True)
    raise

src/aleph_client/commands/instance/backup.py (line 293): Missing confirmation prompt before destructive delete operation. All other destructive commands (reinstall, restore) show a warning and ask for confirmation. delete silently deletes the backup — a safety net the user may need — without any prompt. Consider adding a yes_no_input("Delete backup {backup_id}?", default=False) guard.

src/aleph_client/commands/instance/backup.py (line 404): ProgressFile class defined inside a with block inside a function. This is hard to read, impossible to unit-test independently, and triggers linter warnings. Extract it as a module-level private class _ProgressFile.

src/aleph_client/commands/instance/__init__.py (line 1161): Minor inconsistency: reinstall uses Prompt.ask(...) (rich) while all backup.py commands use typer.prompt(...) for the same CRN URL prompt. Pick one; typer.prompt is used everywhere else in the codebase.

src/aleph_client/commands/instance/backup.py (line 1): No tests added. None of the five new commands (create, info, download, delete, restore) or reinstall have test coverage. At minimum, unit tests for the argument-validation logic (mutual-exclusion checks, file-not-found, domain resolution fallback) and mocked happy-path tests for each command would significantly reduce regression risk for these destructive operations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants