Skip to content

Use temporary file for async byte compile log #194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

pkryger
Copy link

@pkryger pkryger commented Nov 25, 2024

First thank you for a great package! Have been using it for a good few years, both explicitly to write some tools I need for daily tasks as well as indirectly - installed and enabled by helm for years now!.

TL;DR

This PR is meant to avoid log file contention when multiple packages are compiled in rapid succession. It achieves that by using unique log files in each async process with a help of make-temp-file.

Background

I've been using and maintaing a custom Emacs configuration called Exordium. Exordium has a CI set up, and I noticed that from time to time the CI fails when trying to non-interactively call function ask-user-about-lock.

At first, I tried to follow suggestion to redefine the function, to use simple random-back-off-and-continue strategy. Only after a while I understood that the strategy of function redefinition was futile endeavour. Our configuration uses helm, which in turn enables async-bytecomp-package-mode. This causes the lock contention to be triggered in an asynchronous byte compilation process which doesn't have redefined functions!

Should the above be a correct diagnosis, then I concluded that the issue is within async-bytecompile. Simply put: when multiple packages are installed by user in a rapid succession, then async processes that actually do compile these packages in background, may cause data races and lock contention on the async-byte-compile-log-file.

Solution

I decided to cut this PR to address this issue. The approach taken is relatively simple: use a temporary file - as created by make-temp-file - for each async process. This solution changes where the log is created, but also how the async-byte-compile-log-file is interpreted. I tried to make it backward compatible, but please let me know if I missed some scenarios. This solution has a couple of redeeming qualities which is giving more flexibility to user where log is created (while not contaminating user-emacs-directory by default), not relaying on some less fitting features like PID(see below), and being DRY by using macro expansions.

Verification

I temporarily modified Exordium to use the async package straight from GitHub pointing to this PR's branch. Then I run the CI workflow 50 times to verify the issue doesn't re-occur (it usually did after in dozen or so runs).

Other Possibilities

I have also considered a few other possible avenues to explore for this issue. Below please find a summary of what I though and managed to verbalise. I am sure there are many more, some of them probably even better than the proposed solution and ideas below. I am happy to hear if that's the case!

  1. Let each async process create a unique log file, by just by appending its Emacs PID to the log name. However, there's no strong guarantee on PID reusability, for example, are these unique long enough. Even though I don't think it's likely for two async compilation processes to get the same PID in rapid succession. It matters since, the log file is consumed in the parent process, after the async child has finished. Similarly to the implemented solution the file name is returned from the async compilation process and used by the calling process to display all collected compilation errors in call-backs. I even have implemented that, please see this commit.
  2. Use return value from async processes to pass error data. This could eliminate the need for async-byt-compile-log-file altogether, but I am not sure if the mechanism of passing the return value from an async Emacs process is fit for passing compilation log.
  3. Assign unique identifiers in parent process. At first it seems even simpler, as there is no need to pass the log file name back from async processes to parent. It however seems like less scalable, as there's an issue when there are multiple async processes that are parents to the sub processes. For example when someone does something amongst the lines:
(dolist (pkg-batch package-batches)
  (async-start
    (lambda (pkg-batch)
      (package-initialize)
      (async-bytecomp-package-mode)
      (dolist (pkg package-batch)
        (package-install pkg)))))
  1. Set up Exordium's CI to modify the async-byte-compile-log-file value before it may be used, for example as an advice to async--package-compile. This doesn't seem like a viable option since the call-back code that is meant to access the log file uses the global value, which may have been already overwritten by the concurrent compilation. This could be remedied by modifying async-bytecomp such that call-back's capture the value of the async-byte-compile-log-file. In my opinion, such a solution, while seems like a viable one, puts too much burden on the user, to set up everyting correctly.
  2. Set up Exordium's CI to block async--package-compile by setting async-bytecomp-allowed-packages to nil when in Exordium's CI. I don't particularly like this approach as it will be changing how things interacts with each other. After all that's the Exordium's CI that caught this very issue.
  3. Set up Exordium's CI to turn the compilation off by switching async-bytecomp-package-mode off, for example by adding (eval-after-load 'helm-core (async-bytecomp-package-mode -1)). At first glance it seems similar to the above, just different means. But it's also more fragile, should some other package turn the async-bytecomp-package-mode on.
  4. Set up Exordium's CI to modify async-quiet-switch to include something like --eval (defun ask-user-about-lock ()...). I don't think it's the intended usage for the variable though, and it seems more like a workaround at best than a proper solution. It also puts a burden on Exordium, should the default value of async-quiet-switch change in the future and the package implementation would rely on these new defaults.
  5. Extend the async-byte-recompile-directory code to redefine the ask-user-about-lock to use more sophisticated lock contention avoidance strategy. For example something based on the current Exordium attempt. I guess this wouldn't prevent concurrent access to the log file, with either having data corruption when two or more Emacs sub processes write there or lock contention preventing making progress by more than one Emacs sub process.
  6. Extend either async-byte-recompile-directory or async-start with user customizable code that is executed before the main asynchronous code. Exordium's CI could set it, for example to redefine ask-user-about-lock the the strategy. While it is more customiseable than the above, it again, makes using the functionality more complicated to the end user, while the immediate solution seem to suffer from the same data corruption and lock contention issues.

This is to avoid log file contention when multiple packages are compiled in
rapid succession.

Update the default value `async-byte-compile-log-file`. The
`temporary-file-directory` (implicitly) is used as a location where to store
temporary compilation logs. Update to the semantic is done with backward
compatibility in mind, that is to respect the directory part should it be set
by user.

Add `async-bytecomp--comp-buffer-to-file` macro to be used (`macroexpand`ed) in
`async-byte-recompile-directory` and `async-byte-compile-file`. Pass value
returned from expanded macro execution in child processes to
`async-bytecomp--file-to-comp-buffer` call-back.
@thierryvolpiatto
Copy link
Collaborator

thierryvolpiatto commented Nov 25, 2024 via email

@pkryger
Copy link
Author

pkryger commented Nov 25, 2024

Hello, did you try to install your packages with async-package-do-action (see async-package.el), this ensure your packages are installed sequentially (in only one child process) and avoid race conditions.

Thank you very much for a quick response @thierryvolpiatto, and a new ideas to explore.

I am afraid, that Exordium uses use-package to manage packages installation (via, use-package-always-ensure) and we like to keep it that way. It will be cumbersome to have a list of packages in one place and their configuration spread amongst other files. We already had something like that in the past and one of the reasons to switch to use-package was the easiness of managing list of installed packages. Not to mention the list of packages changes depending on ones specific configuration.

I took a look at async-package-do-action, and should I understand the original issue with async-bytecomp-package-mode correctly, it suffers from the same race condition as well. Please note that the async-butecom-package-mode advises package--compile. The latter is called by package-install (via package-download-transaction, package-install-from-archive, and package-unpack) as a part of the loop that installs each package in sequence. When async-bytecomp-package-mode is on, and user requested more than one package to be installed, the original lock contention may still occur. For example, when calling (async-packcage-do-action 'install '(pkg1 pkg2) (make-temp-file "acync-package-do-install.log.")) the following could happen (wide table, please scroll):

|------+-------------------------------------+-------------------------+--------------------------------------+--------------------------------------|
| Time | Parent Emacs                        | Async install pkg1 pkg2 | Async bytecomp pkg1                  | Async bytecomp pkg2                  |
|------+-------------------------------------+-------------------------+--------------------------------------+--------------------------------------|
| t0   | async-package-do-action             |                         |                                      |                                      |
| t1   |                                     | package-install pkg1    |                                      |                                      |
| t2   |                                     | package--compile pkg1   |                                      |                                      |
| t3   |                                     |                         | compilation of pkg1 starts           |                                      |
| t4   |                                     | package-install pkg2    |                                      |                                      |
| t5   |                                     | package--compile pkg2   |                                      |                                      |
| t6   |                                     | installation finishes   |                                      | compilation of pkg2 starts           |
| t7   | (2)read async-byte-compile-log-file |                         | compilation of pkg1 finishes         | compilation of pkg2 finishes         |
| t8   |                                     |                         | (1)write async-byte-compile-log-file | (1)write async-byte-compile-log-file |
|      |                                     |                         |                                      |                                      |

I denoted the issue in question as (1): two async processes are trying to write to the same directory.

However, this chart revealed another issue. Not only my change interferes with async-package-do-action and its usage of async-byte-compile-log-file (and I should have checked it earlier), but also the async bytecomp processes are racing with the asynchronous installation the issue (2). I guess the async install process needs to wait for child compilation processes to finish before it can return.

Please let me know if above concerns have merit, or I missed something. I am happy to work on fixes if these are legitimate concerns.

@pkryger
Copy link
Author

pkryger commented Nov 25, 2024

My bad. The issue (2) will not happen, as the async installation process doesn't enable async-bytecomp-package-mode.

However, I still think the issue similar to the (1) above is a legitimate concern, when async-bytecomp-package-mode is enabled. The difference from the scenario above is that it is the parent Emacs that calls package-install multiple times. Hence, I think this PR still has merit. With the caveat that the async-package-do-action may need to be updated. Alternatively, just return to the original value of async-byte-compile-log-file.

@thierryvolpiatto
Copy link
Collaborator

thierryvolpiatto commented Nov 25, 2024 via email

@pkryger
Copy link
Author

pkryger commented Nov 25, 2024

[...] but here is it a real advantage to run the packages installation in parallel i.e. (dolist (pkg packages) (async-start (lambda () (package-install pkg)))) instead of looping in the child emacs?

I don't think this PR is about packages installation. It's about byte compilation that happens as a part of an installation process. Let me try to rephrase where my original issue came from.

Exordium is an Emacs configuration that installs a few dozens of packages. One of them is helm. The latter unconditionally turn's on async-bytecomp-package-mode. This in turn causes the issue occur when more packages are installed by Exordium (in sequence, from a parent process). Although package installation happens in serially, the byte compilation step is deferred to asynchronous processes which all try to write to async-byte-compile-log-file. What's more the async-byte-compile-log-file can be concurrently read by the parent process, causing even more contention.

Should helm refrain from turning the async-bytecomp-package-mode my original issue would have never appear.

  • Is it worthwhile to run dozen of emacs instances even if in theory it would be faster to parralelize?

Probably not at all. I guess some speed up may happen when each of such processes downloads and byte compiles on its own, then yeah - it may be faster, but with caveats below.

  • What if installing package A and package B which need A as dependency? when installing B it would try to install A while A is currently already installing in another emacs instance.

Believe me - been there, done that, just on a bigger scale (10 000+ [sic!] "packages" with dependencies all over the place). The best thing we came up with was building a dependency tree and processing it level by level.

@thierryvolpiatto
Copy link
Collaborator

thierryvolpiatto commented Nov 25, 2024 via email

This follows change in 8ef970f, ensuring that
`async-package-do-action` follows the same logic as `async-byte-compile-file`
and `async-byte-compile-directory`.
@pkryger
Copy link
Author

pkryger commented Nov 25, 2024

[...] Therefore I have disabled async-bytecomp-package-mode in Helm.

Thank you!

However, I guess it is not enough as you may have same issue with other packages using this mode [...]

I think this is where I was coming from - the async-bytecomp-package-mode could be turned on somewhere without a notice.

I installed a change to neutralize async-bytecomp in async-package, so if you do the same perhaps your issue is fixed and we don't have to make modifications to async-bytecomp. Let me know if that's ok for you.

I think this is a viable solution. Although I'd appreciate a bit of a speed up on Exordium's first start when all packages are downloaded and installed, I can settle for correctness.

Regardless, I added a commit to use similar approach in async-package-do-action, but feel free to close this PR if you believe the async-bytecomp usage should be discouraged. Or should it be perhaps obsoleted/deprecated?

Thanks to raise this issue.

No problem, glad I could help.

@thierryvolpiatto
Copy link
Collaborator

thierryvolpiatto commented Nov 25, 2024 via email

pkryger added a commit to pkryger/exordium that referenced this pull request Nov 26, 2024
This had been prompted by CI failures, but decided to turn the mode early in
the init process.

See jwiegley/emacs-async#194 for more discussion.
pkryger added a commit to pkryger/exordium that referenced this pull request Nov 26, 2024
This had been prompted by CI failures, but decided to turn the mode early in
the init process.

See jwiegley/emacs-async#194 for more discussion.
@pkryger
Copy link
Author

pkryger commented Nov 26, 2024

You are right, nowadays perhaps it is no more needed, thus the helm package manager (M-x helm-packages) is now async by default. Therefore I have disabled async-bytecomp-package-mode in Helm. However, I guess it is not enough as you may have same issue with other packages using this mode (magit was using it, not sure if it still the case), I installed a change to neutralize async-bytecomp in async-package, so if you do the same perhaps your issue is fixed and we don't have to make modifications to async-bytecomp. Let me know if that's ok for you.

I can understand you're reluctant making changes to an older package. Ultimately, it's been in use for a while (years?) and the issue has not been risen so far. I implemented a workaround as I think the asynchronous byte compilation may be beneficial for Exordium. Basically advise the async-byte-compile-file and async-byte-compile-directory as well as async-bytecomp--file-to-comp-buffer with binding async-byte-compile-log-file to a temporary file. However, if the workaround proves to cause more harm than good I am fine with neutralising the async-bytecomp-package-mode as you did in 48dafc8.

@thierryvolpiatto
Copy link
Collaborator

thierryvolpiatto commented Nov 26, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants