-
Notifications
You must be signed in to change notification settings - Fork 576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scheduler should change pid+tid values to ensure unique in core-sharded-on-disk mixes #7401
Labels
Comments
We also won't support 32-bit builds with record_filter where there are only 32 bits to hold the tid/pid. (Long-term probably 32-bit should use the 64-bit on-disk format as it has timestamp issues as well.) |
derekbruening
added a commit
that referenced
this issue
Mar 27, 2025
In dynamic drmemtrace schedules combining multiple workloads, there can be pid or tid collisions. We solve that by having the scheduler set the workload ordinal in the top 32 bits of both the pid and tid. We ignore tids being 64-bit values on Mac; and we do not support unique values for 32-bit builds when trace_entry_t-based tools such as record_filter are used. Documents this, along with missing docs for cpuid and timestamp modifications. Updates the view tool and schedule stats to print tids in hex, to make it easier to distinguish the workload ordinal. Updates test templates and docs for this change. Adds scheduler unit tests. Here is what this looks like: =========================================================================== $ bin64/drrun -t drmemtrace -tool view -multi_indir ../src/clients/drcachesim/tests/drmemtrace.threadsig.x64.tracedir:../src/clients/drcachesim/tests/drmemtrace.threadsig.x64.tracedir -core_sharded -cores 1 -sched_quantum 10 2>&1 | less Output format: <--record#-> <--instr#->: <---tid---> <record details> ------------------------------------------------------------ 1 0: d51be <marker: version 7> 2 0: d51be <marker: filetype 0xe40> 3 0: d51be <marker: cache line size 64> ... ------------------------------------------------------------ 135 60: 1000d51be <marker: version 7> 136 60: 1000d51be <marker: filetype 0xe40> 137 60: 1000d51be <marker: cache line size 64> =========================================================================== Fixes #7401
This was referenced Mar 27, 2025
derekbruening
added a commit
that referenced
this issue
Mar 28, 2025
In dynamic drmemtrace schedules combining multiple workloads, there can be pid or tid collisions. We solve that by having the scheduler set the workload ordinal in the top 32 bits of both the pid and tid. We ignore tids being 64-bit values on Mac; and we do not support unique values for 32-bit builds when trace_entry_t-based tools such as record_filter are used. Documents this, along with missing docs for cpuid and timestamp modifications. Updates the view tool and schedule stats to print tids in hex, to make it easier to distinguish the workload ordinal. Updates test templates and docs for this change. Adds scheduler unit tests. Here is what this looks like: ``` drmemtrace -tool view -multi_indir ../src/clients/drcachesim/tests/drmemtrace.threadsig.x64.tracedir:../src/clients/drcachesim/tests/drmemtrace.threadsig.x64.tracedir -core_sharded -cores 1 -sched_quantum 10 2>&1 | less Output format: <--record#-> <--instr#->: <---tid---> <record details> ------------------------------------------------------------ 1 0: d51be <marker: version 7> 2 0: d51be <marker: filetype 0xe40> 3 0: d51be <marker: cache line size 64> ... ------------------------------------------------------------ 135 60: 1000d51be <marker: version 7> 136 60: 1000d51be <marker: filetype 0xe40> 137 60: 1000d51be <marker: cache line size 64> ``` Fixes #7401
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
For dynamic scheduling, the scheduler API allows easy identification of different workloads and threads within them, even when the same trace is included multiple times. But when we write out a core-sharded-on-disk trace and then read it in, that API no longer helps as it sees just one input. The pid and tid fields can be used to distinguish the interleaved original workloads and their threads, but those can collide, especially if one workload is included multiple times. We should have the scheduler replace the original pid and tid values to ensure uniqueness.
This would also help with analyzers that are currently using the pid/tid fields: we wouldn't have to migrate them to use the scheduler's input workload identifiers.
The proposal is to stick the workload ordinal in the upper 32 bits of the tid and pid. On Mac a thread is 64-bits already; we probably will just live with potential problems there for now.
The text was updated successfully, but these errors were encountered: