I'm starting this issue to discuss a few strategies that could be used to speed up build times for interface packages. Instead of spamming issues, I have listed here four different strategies that could help accelerate builds. Most or all of them also apply to the other generators (rosidl_typesupport, rosidl_typesupport_fastrtps, etc).
First, for reference, let's look at the current build process for interfaces (here px4_msgs, .ninja_log viewed with ui.perfetto.dev):
The build process consists of two parts: generating and compiling files. Both can be optimized.
Use of precompiled headers
For some source files, most of the "compilation" time is actually spent parsing the file. This is because the includes expand into very large translation units. All messages share the same set of includes, which are always present and have to be parsed again for each translation unit.
Precompiled headers can significantly reduce the compilation time of these files.
Issues with precompiled headers:
Dealing with indirect includes. For instance, rosidl_typesupport_cpp includes __struct.hpp files (generated by rosidl_generator_cpp) which themselves include standard library headers that expand into thousands of lines of code. Those indirect dependencies would have to be directly included in the list of headers to precompile, or rosidl_generator_cpp would need to provide a header file containing them (that rosidl_typesupport_cpp could add to its list of precompiled headers).
For small projects, the time spent generating the precompiled header might be longer than the time saved by using it. Precompiled headers are processed by a single thread. For some projects, the time spent parsing the headers in parallel for each message might be shorter than the time spent generating the precompiled header.
On my machine, the gain from just using precompiled headers is not that high since most of the time is spent generating files (10% speedup for px4_msgs).
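The trade-off described in the points above can be made concrete with a rough model. The sketch below is only illustrative: the function names and all timings are hypothetical (not measurements), and it assumes that every translation unit spends the same time parsing the shared headers, that the precompiled header is built once on a single thread before compilation starts, and that loading it afterwards is free.

```python
import math


def wall_time_without_pch(n_units, jobs, t_parse, t_rest):
    """Every translation unit re-parses the shared headers."""
    return math.ceil(n_units / jobs) * (t_parse + t_rest)


def wall_time_with_pch(n_units, jobs, t_parse, t_rest, t_pch):
    """The header is precompiled once on a single thread, then reused.

    t_parse is accepted only so both functions take the same package
    description; with a precompiled header it no longer contributes.
    """
    return t_pch + math.ceil(n_units / jobs) * t_rest


# Hypothetical timings in seconds, chosen only to show the break-even effect.
small = dict(n_units=8, jobs=8, t_parse=1.0, t_rest=0.5)
large = dict(n_units=400, jobs=8, t_parse=1.0, t_rest=0.5)

for name, pkg in (("small", small), ("large", large)):
    without = wall_time_without_pch(**pkg)
    with_pch = wall_time_with_pch(**pkg, t_pch=1.5)
    print(f"{name}: {without:.1f}s without PCH, {with_pch:.1f}s with PCH")
```

With these made-up numbers the small package gets slower (the serial precompilation step costs more than the parsing it saves), while the large package gets faster. The model only covers compilation, so it says nothing about the time spent generating files.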
Propagate the update time of the message files
When a single message file is updated, every generator is run again and rewrites all of its output files with new modification times, so Ninja recompiles them even though only a single message changed. Most of the generated files are unchanged and do not need to be recompiled.
This wouldn't affect a clean build but would speed up rebuilds.
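One generic way to avoid touching unchanged outputs is to compare the new content with what is already on disk and skip the write when they match. The snippet below is a minimal sketch of that idea, not the code the rosidl generators actually use; `write_if_changed` is a hypothetical helper name.

```python
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write content to path only if it differs from what is already there.

    Returns True if the file was (re)written and False if it was left
    untouched, in which case its modification time is preserved.
    """
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True
```

Preserving the modification time only helps if the build tool looks at the outputs again after the generator has run. In Ninja this is what the `restat` rule variable is for; whether and how the generated build files set it is something I have not checked.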
Multi-processing
The work done by most generators could be spread across several processes, since most of them generate an independent set of files for each IDL they process. I haven't tested it myself, but @gavanderhoorn has reported a 5x speedup (albeit using a processor with a high thread count), see this discussion.
I expect the overhead of using multiprocessing might not be worth it for smaller packages.
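As a rough illustration of the per-IDL parallelism, here is a minimal sketch using Python's `concurrent.futures`. It is not taken from the prototypes mentioned in this thread: `generate_one` is a hypothetical stand-in for whatever a real generator does for one IDL file, and `generate_all` and its parameters are made up for the example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from pathlib import Path


def generate_one(idl_path: str, output_dir: str) -> str:
    """Hypothetical stand-in for one generator's work on a single IDL file."""
    src = Path(idl_path)
    out = Path(output_dir) / (src.stem + ".generated.txt")
    out.write_text(f"// generated from {src.name}\n", encoding="utf-8")
    return str(out)


def generate_all(idl_paths, output_dir, max_workers=None,
                 executor_cls=ProcessPoolExecutor):
    """Run generate_one for every IDL file in a pool, preserving input order."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(generate_one, idl_paths,
                             [output_dir] * len(idl_paths)))
```

Calling `generate_all(paths, out_dir)` uses worker processes, which is the usual choice when the per-file work is CPU bound. Passing `executor_cls=ThreadPoolExecutor` swaps in threads, which are cheaper to start and may be enough if the work turns out to be mostly IO bound. Either way the pool has a start-up cost, which is the overhead that could outweigh the gain for small packages.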
Fix dependencies of the generators
The generators seem to run in four stages:
rosidl_generator_type_description
rosidl_generator_c[pp]
rosidl_typesupport_*
rosidl_generator_python
As far as I can tell, only rosidl_generator_c needs to wait for the output of rosidl_generator_type_description (to obtain the hashes of the interfaces). The rest only need the IDL files, so they could start immediately. The actual compilation might depend on the previous stages, but the file generation doesn't.
Judging from the trace above, this could speed up compilation by about 3 to 4x (for both large and small packages).
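The effect on the generation step can be sketched by grouping the generators into stages that can run in parallel. The two dependency graphs below are my reading of the description above (the "current" chain is inferred from the observed order, and the "proposed" graph is the hypothesis being discussed), not something extracted from the CMake code; `stages` is a helper written for this example. It needs Python 3.9 or newer for `graphlib`.

```python
from graphlib import TopologicalSorter


def stages(dependencies):
    """Group nodes into stages; every node in a stage can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


# Generation order as observed: a chain of four stages (assumed from the trace).
current = {
    "rosidl_generator_type_description": set(),
    "rosidl_generator_c[pp]": {"rosidl_generator_type_description"},
    "rosidl_typesupport_*": {"rosidl_generator_c[pp]"},
    "rosidl_generator_python": {"rosidl_typesupport_*"},
}

# Proposed: only rosidl_generator_c waits for the type description hashes.
proposed = {
    "rosidl_generator_type_description": set(),
    "rosidl_generator_c": {"rosidl_generator_type_description"},
    "rosidl_generator_cpp": set(),
    "rosidl_typesupport_*": set(),
    "rosidl_generator_python": set(),
}

print(len(stages(current)), "stages now")
print(len(stages(proposed)), "stages with the proposed dependencies")
```

Going from four sequential stages to two does not by itself guarantee a proportional speedup, since that depends on how long each generator takes and on how many cores are available.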
Final word
The first three strategies have the potential to worsen build times for smaller packages. Maybe the speedup for larger packages is worth it, but maybe they should just be optional?
The last method should be an improvement for every package. It's also one that can have the largest impact.
thanks for continuing / starting the discussion here @TonyWelte.
re: parallelising file generation: some very ugly / quick changes I used to prototype can be found here and here (this parallelises generation, not when the generators are run btw)
I did not use any real profiling, just quick-and-dirty print tracing and going for the things that stood out the most.
Whether this approach makes sense for smaller packages would, I believe, depend mostly on whether these parts of the overall msg generation process are CPU or IO bound.