Skip to content
This repository has been archived by the owner on Apr 25, 2024. It is now read-only.

BugReport reporter #770

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

BugReport reporter #770

wants to merge 3 commits into from

Conversation

gtrepta
Copy link
Contributor

@gtrepta gtrepta commented Dec 8, 2023

@gtrepta gtrepta requested a review from tothtamas28 December 8, 2023 19:37
@gtrepta gtrepta marked this pull request as ready for review December 8, 2023 19:59
src/pyk/utils.py Outdated Show resolved Hide resolved
@gtrepta gtrepta requested a review from tothtamas28 December 11, 2023 17:59
src/pyk/utils.py Outdated
Comment on lines 491 to 498
def add_file(self, finput: Path, arcname: Path) -> None:
pass

def add_file_contents(self, input: str, arcname: Path) -> None:
pass

def add_command(self, args: Iterable[str]) -> None:
pass
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By organizing the code this way, this will not break:

br = BugReporter(Path('foo'))
br.add_file(Path('bar'), Path('baz'))

yet the bug report entry will not be written.

BugReporter should not be a BugReport, and these methods should be removed from BugReport.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then spurious methods like _check_bug_report can also be removed from BugReporter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should users instantiate a bug report? It looks to me that currently the bug_report_arg parameter from cli is the primary use case, and I've ensured that the reporter coming from there works. The intent was to have BugReporter only be instantiated by a call to BugReport.reporter, and have BugReport hold the base interface, file paths, and synchronization primitive to spawn reporters from. Something like your example that doesn't write to a file can certainly be a problem if it isn't discouraged in some way, like renaming the class to _BugReporter so it doesn't get imported or used as easily.

If we want to allow users to directly create a BugReport or a BugReporter then I'll have to think a bit more about what this should look like.

I made BugReporter an inherited class because BugReport is expected in many places like KRun and Kompile, and wanted to keep those interfaces stable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should users instantiate a bug report?

In the final design, users should be able to instantiate a BugReport (i.e. the BugReporter factory) directly. On the other hand, BugReporter (i.e. the replacement of the old BugReport) should only be instantiated through BugReporter.

(We can also consider a composite design where a BugReporter can instantiate further BugReporter instances, e.g. a BugReporter for a proof process creates a BugReporter for one of the KoreClient instances it manages.)

BugReport is expected in many places like KRun and Kompile, and wanted to keep those interfaces stable.

I see why this is a good approach, but ultimately the goal is to get rid of the old interface, and instead

  • Let the user decide where to write the archive: BugReport.
  • Let the application (e.g. kontrol) decide how to structure the archive: BugReporter.
  • Let the library (e.g. KoreServer) be responsible only for what's written to the archive.

One way to achieve this:

  1. Turn BugReport into an abstract class
  2. Have two implementations: 1) old behavior 2) BugReporter
  3. Gradually replace old behavior by BugReporter
  4. Do the renamings

All this doesn't have to be achieved in a single PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've force pushed this branch with a change that implements 1 and 2. There's no thread safety over the BugReporter yet, if you're happy with deferring that to a later PR then we should be able to merge this.

src/pyk/utils.py Outdated
_command_id: int
_defn_id: int
_file_remap: dict[str, str]
_lock: Lock
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What properties does this lock ensure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the usage pattern expected in my above comment, this Lock should be shared by all BugReporters that were created off of the same BugReport, so that only one of them can write to the file at a time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in a multiprocessing scenario, i.e. when the Lock gets pickled and sent to another Python process? Will it still ensure mutual exclusion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the looks of it, pickling Lock objects isn't supported. Would this be an issue?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we use multiprocessing quite intensively in kontrol.

An alternative would be a file lock (filelock, cross-platform) or OS level locking (os.lockf, fcntl, only on Unix). Also, please check whether tarfile does any form of locking, maybe it already does.

src/pyk/utils.py Outdated
def add_command(self, args: Iterable[str]) -> None:
pass

def reporter(self, prefix: Path) -> BugReporter:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def reporter(self, prefix: Path) -> BugReporter:
def reporter(self, prefix: str) -> BugReporter:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if two threads request a reporter to the same prefix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They will both be writing to the same prefix in the bug report archive. The Lock should ensure that they don't write at the same time, but it is possible that they write to the same path, which in this implementation I believe will work like an append. It should probably be up to the user whether that's an issue or not.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can safely introduce this restriction (i.e. let the BugReporter own its prefix) as the way we use the archive is always adding new things to it.

If the same prefix is requested twice, the call can either fail, or the target directory can be renamed to make it unique:

foo
foo_1
foo_2
...

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming while logging a message about it sounds reasonable.

@gtrepta gtrepta force-pushed the bugreport-reporter branch from 56f932e to 078e994 Compare February 1, 2024 20:34
@@ -524,3 +521,26 @@ def _remap_arg(_a: str) -> str:
shebang = '#!/usr/bin/env bash\nset -euxo pipefail\n'
self.add_file_contents(shebang + ' '.join(remapped_args) + '\n', arcname)
self._command_id += 1


class BugReport(BugReportBase):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider this hierarchy instead:

  • BugReportBase - abstract class without any actual implementation.
  • BugReport - same as before, but now implements BugReportBase.
  • BugReport2 - something that can instantiate a BugReporter for a prefix.
  • BugReporter - a BugReportBase that writes into a folder of an archive.

This is the easy part as so far, nothing breaks. The hard part is eliminating BugReport in favor of the new design, which will be a breaking change in pyk and in downstream projects.

We should only merge the PR with the new classes once we have a plan on how to manage the rest. To get the intuition of what needs to be changed, you can rename BugReport and see where the type checker complains.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BugReport should not be written directly
2 participants