Developer Guide
FindHao edited this page Aug 31, 2025
- `src/cutracer.cu`: NVBit callbacks, kernel filtering, SASS iteration, instrumentation dispatch
- `src/instrument.cu`: NVBit call injection helpers (opcode/reg/mem)
- `src/inject_funcs.cu`: device-side `instrument_*` functions pushing packets
- `src/analysis.cu`: `recv_thread_fun`, analysis dispatch, histogram CSV dump
- `src/env_config.cu`: env parsing, forcing modes, filters, intervals
- `include/`: public headers for types and interfaces
- Define/extend message types if needed, or reuse existing (`opcode_only`, `reg_info`, `mem_access`).
- In `recv_thread_fun`, branch on header type and implement the analysis state machine.
- Emit results using a stable file format and naming (`generate_kernel_log_basename`).
- Add an env switch under `CUTRACER_ANALYSIS` to enable it (consider auto-enabling minimal instrumentation).
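
The receiver-side steps above can be sketched as a small dispatch loop. This is a minimal host-only illustration, not CUTracer's actual code: the packet layouts (`MsgHeader`, `OpcodeOnlyMsg`, `RegInfoMsg`, `MemAccessMsg`), the `AnalysisState` struct, and `process_buffer` are hypothetical names chosen for the example. The real message types are defined under `include/`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical packet layouts for illustration only.
enum class MsgType : uint32_t { OpcodeOnly = 0, RegInfo = 1, MemAccess = 2 };

struct MsgHeader { MsgType type; };

struct OpcodeOnlyMsg {
  MsgHeader header; uint64_t kernel_launch_id; int32_t warp_id; int32_t opcode_id;
};
struct RegInfoMsg {
  MsgHeader header; uint64_t kernel_launch_id; int32_t warp_id; int32_t opcode_id;
  uint32_t reg_vals[4];
};
struct MemAccessMsg {
  MsgHeader header; uint64_t kernel_launch_id; int32_t warp_id; int32_t opcode_id;
  uint64_t addr;
};

// Hypothetical analysis state: an opcode histogram plus simple counters.
struct AnalysisState {
  std::map<int32_t, uint64_t> opcode_counts;
  uint64_t reg_msgs = 0;
  uint64_t mem_msgs = 0;
};

// Copies one message out of the buffer if it fits entirely.
template <typename Msg>
static bool read_msg(const uint8_t* buf, size_t len, size_t off, Msg& out) {
  if (off + sizeof(Msg) > len) return false;
  std::memcpy(&out, buf + off, sizeof(Msg));
  return true;
}

// Walks a received buffer, branching on the header type of each packet.
// Returns the number of bytes consumed; stops at a truncated or unknown packet.
size_t process_buffer(const uint8_t* buf, size_t len, AnalysisState& st) {
  size_t off = 0;
  while (off + sizeof(MsgHeader) <= len) {
    MsgHeader h;
    std::memcpy(&h, buf + off, sizeof(h));
    switch (h.type) {
      case MsgType::OpcodeOnly: {
        OpcodeOnlyMsg m;
        if (!read_msg(buf, len, off, m)) return off;
        ++st.opcode_counts[m.opcode_id];
        off += sizeof(m);
        break;
      }
      case MsgType::RegInfo: {
        RegInfoMsg m;
        if (!read_msg(buf, len, off, m)) return off;
        ++st.reg_msgs;
        off += sizeof(m);
        break;
      }
      case MsgType::MemAccess: {
        MemAccessMsg m;
        if (!read_msg(buf, len, off, m)) return off;
        ++st.mem_msgs;
        off += sizeof(m);
        break;
      }
      default:
        return off;  // unknown type: size is not known, so stop here
    }
  }
  return off;
}
```

A new analysis would typically add its own fields to the state struct and extend the relevant `case` branches, leaving the framing logic untouched.
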
- In `instrument_function_if_needed`, collect needed operands/metadata when iterating SASS.
- Implement an `instrument_*` helper in `src/instrument.cu` using `nvbit_insert_call` and `nvbit_add_call_arg_*`.
- Add the device-side callee in `src/inject_funcs.cu` that constructs and pushes a packet.
- Gate enabling via `CUTRACER_INSTRUMENT` and document interplay with analyses.
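
The gating step and its interplay with analyses can be sketched as follows. This is an illustration under stated assumptions: the comma-separated value format, the accepted token names, the bitmask enum, and the functions `parse_instrument_modes` and `resolve_modes` are all hypothetical and may not match what `src/env_config.cu` actually does.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical instrumentation mode bits.
enum InstrumentMode : uint32_t {
  kNone = 0,
  kOpcodeOnly = 1u << 0,
  kRegInfo = 1u << 1,
  kMemAccess = 1u << 2,
};

// Parses an assumed comma-separated mode list, e.g. "opcode_only,mem_access".
// Unknown tokens are ignored in this sketch.
uint32_t parse_instrument_modes(const std::string& value) {
  uint32_t mask = kNone;
  std::stringstream ss(value);
  std::string tok;
  while (std::getline(ss, tok, ',')) {
    if (tok == "opcode_only") mask |= kOpcodeOnly;
    else if (tok == "reg_info") mask |= kRegInfo;
    else if (tok == "mem_access") mask |= kMemAccess;
  }
  return mask;
}

// An analysis that only needs opcode packets can force the minimal
// instrumentation on, even if the user did not request it explicitly.
uint32_t resolve_modes(uint32_t requested, bool opcode_analysis_enabled) {
  if (opcode_analysis_enabled) requested |= kOpcodeOnly;
  return requested;
}
```

In practice the input string would come from `std::getenv("CUTRACER_INSTRUMENT")`, with a null check before constructing the `std::string`.
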
- Identify `CS2R SR_CLOCKLO` opcode IDs during instrumentation per function.
- Use them at analysis time to toggle collection per warp.
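
One way to picture the per-warp toggle is a small tracker that flips a flag whenever a warp executes a clock-read opcode. This is a sketch under assumptions: `RegionTracker` and its members are hypothetical names, and it assumes that each clock read alternately opens and closes a region.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

// Hypothetical per-warp region tracker driven by clock-read opcode IDs.
struct RegionTracker {
  std::unordered_set<int> clock_opcode_ids;       // collected at instrumentation time
  std::unordered_map<int, bool> collecting;       // warp_id -> currently inside a region
  std::unordered_map<int, uint64_t> counted;      // warp_id -> instructions seen in regions

  // Feeds one observed instruction. Returns true if it was counted.
  bool on_instruction(int warp_id, int opcode_id) {
    if (clock_opcode_ids.count(opcode_id)) {
      // A clock read marks a region boundary: flip the state for this warp only.
      collecting[warp_id] = !collecting[warp_id];
      return false;
    }
    if (!collecting[warp_id]) return false;  // outside any region: ignore
    ++counted[warp_id];
    return true;
  }
};
```

Because the state is keyed by warp, one warp entering a region does not affect what is collected for the others.
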
- During instrumentation, CUTracer records a per-function mapping `opcode_id -> SASS string` and the set of clock and EXIT opcode IDs.
- Analyses can look up the mnemonic via `ctx_state->id_to_sass_map[f][opcode_id]` after retrieving the current `CUfunction` from `kernel_launch_id`.
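
The two-step lookup described above might look like the following. To keep the example free of CUDA headers, `FuncHandle` is a stand-in for `CUfunction`, and `CtxState`, `launch_to_func`, and `lookup_sass` are hypothetical names; only `id_to_sass_map` comes from the text above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

using FuncHandle = const void*;  // stand-in for CUfunction in this sketch

struct CtxState {
  // Hypothetical: which function each kernel launch ran.
  std::unordered_map<uint64_t, FuncHandle> launch_to_func;
  // Per-function mapping opcode_id -> SASS string.
  std::unordered_map<FuncHandle, std::unordered_map<int, std::string>> id_to_sass_map;
};

// Resolves kernel_launch_id -> function -> SASS string.
// Returns nullptr when either step has no entry, instead of inserting one.
const std::string* lookup_sass(const CtxState& ctx, uint64_t kernel_launch_id,
                               int opcode_id) {
  auto f = ctx.launch_to_func.find(kernel_launch_id);
  if (f == ctx.launch_to_func.end()) return nullptr;
  auto per_func = ctx.id_to_sass_map.find(f->second);
  if (per_func == ctx.id_to_sass_map.end()) return nullptr;
  auto sass = per_func->second.find(opcode_id);
  if (sass == per_func->second.end()) return nullptr;
  return &sass->second;
}
```

Using `find` rather than `operator[]` avoids silently creating empty entries when an analysis asks about an opcode ID that was never recorded.
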
- Each kernel launch is assigned a `kernel_launch_id`. When the receiver observes a change, it finalizes and dumps data for the previous kernel.
- Per-kernel iteration indices are tracked to build deterministic filenames.
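
The boundary detection and naming scheme can be sketched as below. This is an assumption-laden illustration: `KernelBoundaryTracker`, `make_basename`, and the `<name>_iter<N>` format are hypothetical and are not the actual behavior of `generate_kernel_log_basename`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical basename format: "<kernel name>_iter<N>".
std::string make_basename(const std::string& kernel_name, uint32_t iter) {
  return kernel_name + "_iter" + std::to_string(iter);
}

struct KernelBoundaryTracker {
  bool has_current = false;
  uint64_t current_id = 0;
  std::string current_name;
  std::map<std::string, uint32_t> next_iter;  // per-kernel iteration index
  std::vector<std::string> dumped;            // basenames of finalized kernels

  // Called for every received packet.
  void on_packet(uint64_t kernel_launch_id, const std::string& kernel_name) {
    // A changed launch ID means the previous kernel is complete.
    if (has_current && kernel_launch_id != current_id) finalize();
    has_current = true;
    current_id = kernel_launch_id;
    current_name = kernel_name;
  }

  // Dumps the current kernel (here: just records its basename).
  void finalize() {
    if (!has_current) return;
    uint32_t iter = next_iter[current_name]++;
    dumped.push_back(make_basename(current_name, iter));
    has_current = false;
  }
};
```

A final `finalize()` call at shutdown is needed so the last kernel is also written out, since no later launch ID arrives to trigger it.
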