-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[C++20][Modules] The search complexity for header file info is too expensive with a lot of modules. #140860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@llvm/issue-subscribers-clang-modules Author: Michael Park (mpark)
Currently, for each header file `#include` or `import`, there is a look-up of this header file on each of the loaded PCMs. This results in a search complexity of `O(N*M)` where `N` is the # of includes, and `M` is the # of loaded PCMs. For large numbers, e.g. `N` = 10,000 and `M` = 1,000, this becomes very costly.
This problem was encountered previously inside Test CaseI wrote a bash script to generate the test case. It produces 1,000 empty header units, and 10,000 empty header files. The main file The script for the synthetic repro: #/usr/bin/env bash
CLANG=${CLANG:-clang++}
NUM_MODULES=${NUM_MODULES:-1000}
NUM_HEADERS=${NUM_HEADERS:-10000}
mkdir build && cd build
for i in $(seq 1 $NUM_MODULES); do > m${i}.h; done
seq 1 $NUM_MODULES | xargs -I {} -P $(nproc) bash -c "$CLANG -std=c++20 -fmodule-header m{}.h"
for i in $(seq 1 $NUM_HEADERS); do > h${i}.h; done
> a.cpp
for i in $(seq 1 $NUM_MODULES); do echo "import \"m${i}.h\";" >> a.cpp; done
for i in $(seq 1 $NUM_HEADERS); do echo "#include \"h${i}.h\"" >> a.cpp; done
echo "int main() {}" >> a.cpp
time $CLANG -std=c++20 -Wno-experimental-header-units -E a.cpp -o /dev/null \
$(for i in $(seq 1 $NUM_MODULES); do echo -n "-fmodule-file=m${i}.pcm "; done) > Note: I am aware that Code PathWe more or less start here: // Do a standard file entry lookup.
OptionalFileEntryRef FE = HeaderInfo.LookupFile(...); and proceed through the following:
Here we encounter this: ModuleMap::KnownHeader
HeaderSearch::findModuleForHeader(FileEntryRef File, bool AllowTextual,
bool AllowExcluded) const {
if (ExternalSource) {
// Make sure the external source has handled header info about this file,
// which includes whether the file is part of a module.
(void)getExistingFileInfo(File);
}
return ModMap.findModuleForHeader(File, AllowTextual, AllowExcluded);
} where there is an invocation to We finally arrive at: HeaderFileInfo ASTReader::GetHeaderFileInfo(FileEntryRef FE) {
HeaderFileInfoVisitor Visitor(FE);
ModuleMgr.visit(Visitor);
if (std::optional<HeaderFileInfo> HFI = Visitor.getHeaderFileInfo())
return *HFI;
return HeaderFileInfo();
} where Every PCM file contains a |
BTW, for this,
I remember I tried to do some of such optimizations in the downstream. But I didn't find any significant impacts, so I stopped. It is welcomed to optimize this if any one want to give it a try. |
I'm pretty sure we do still need it. I would expect: //--- A.h
#pragma once
#define FOO
//--- B.h
#include "A.h"
//--- C.cpp
import "B.h";
#include "A.h" // this #include should get skipped. where A.h is textual and B.h is a header unit. The actual issue with #78495 is that Clang does not do visibility tracking of HeaderFileInfo which causes other problems with submodules too. I think the right fix for this part of it is to fix this part of Clang. Named modules should always block HeaderFileInfo propagation because they don't export macros, but header units should always propagate it. Basically they should follow the exact same rules as the other preprocessor state does. Of course this doesn't fix the performance issue. There I have a preference for making chained hash tables work. |
This means using |
Uh oh!
There was an error while loading. Please reload this page.
Currently, for each header file
#include
orimport
, there is a look-up of this header file on each of the loaded PCMs. This results in a search complexity ofO(N*M)
whereN
is the # of includes, andM
is the # of loaded PCMs. For large numbers, e.g.N
= 10,000 andM
= 1,000, this becomes very costly.The synthetic test case I put together takes 14s user time, compared to 0.06s without modules.
In a real world scenario, the difference for us was 18s with modules and 3s without modules.
This problem was encountered previously inside
ASTWriter
(see #87848). It was solved there by only considering the local information rather than making the query out toASTReader
. I'm not sure that such a solution can be used in general. (CC @jansvoboda11)Test Case
I wrote a bash script to generate the test case. It produces 1,000 empty header units, and 10,000 empty header files. The main file
import
s the 1,000 header units, then#include
s 10,000 header files. We can measure the preprocessing time on this file. On my machine, this takes 14s, compared to 0.06s without modules.The script for the synthetic repro:
Code Path
We more or less start here:
// Do a standard file entry lookup. OptionalFileEntryRef FE = HeaderInfo.LookupFile(...);
and proceed through the following:
It->LookupFile(...);
:It
is an iterator over the search directories.HS.findUsableModuleForHeader(...);
suggestModule(...);
HS.findModuleForHeader(...);
Here we encounter this:
where there is an invocation to
(void)getExistingFileInfo(File);
, which callsExternalSource->GetHeaderFileInfo(FE);
.We finally arrive at:
where
ModuleMgr.visit(Visitor);
will visit every currently loaded PCM.Every PCM file contains a
HEADER_SEARCH_TABLE
record, which is an on-disk hash table. For header units, it has at least one entry in that table which is the header file itself. For named modules, we shouldn't need this, but it's still present today (see #78495). This means that every header unit has aHEADER_SEARCH_TABLE
with at least one entry in it.The text was updated successfully, but these errors were encountered: