You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
perf(cargo-package): match certain path prefix with pathspec
`check_repo_state` checks the entire git repo status.
This is usually fine if you have only a few packages in a workspace.
For huge monorepos, it may hit performance issues.
For example,
on awslabs/aws-sdk-rust@2cbd34d
the workspace has roughly 434 members to publish.
`git ls-files` reported us 204379 files in this Git repository.
That means git may need to check status of all files 434 times.
That would be `204379 * 434 = 88,700,486` checks!
Moreover, the current algorithm is finding the intersection of
`PathSource::list_files` and `git status`.
It is an `O(n^2)` check.
Let's assume files are evenly distributed into each package,
so roughly 470 files per package.
If we're unlucky to have some dirty files, say 100 files.
We will have to do `470 * 100 = 47,000` times of path comparisons.
Even worse, because we `git status` everything in the repo,
we'll have to it for all members,
even when those dirty files are not part of the current package in question.
So it becomes `470 * 100 * 434 = 20,398,000`!
Instead of comparing with the status of the entire repository,
this patch use the magic pathspec[1] to tell git only reports
paths that match a certain path prefix.
This wouldn't help the `O(n^2)` algorithm,
but at least it won't check dirty files outside the current package.
[1]: https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpathspecapathspec
0 commit comments