Skip to content

Commit e17cae0

Browse files
author
Walter Erquinigo
committed
[trace][intel pt] Fix per-psb packet decoding
The per-PSB packet decoding logic was wrong because it was assuming that pt_insn_get_sync_offset was being udpated after every PSB. Silly me, that is not true. It returns the offset of the PSB packet after invoking pt_insn_sync_forward regardless of how many PSBs are visited later. Instead, I'm now following the approach described in https://github.com/intel/libipt/blob/master/doc/howto_libipt.md#parallel-decode for parallel decoding, which is basically what we need. A nasty error that happened because of this is that when we had two PSBs (A and B), the following was happening 1. PSB A was processed all the way up to the end of the trace, which includes PSB B. 2. PSB B was then processed until the end of the trace. The instructions emitted by step 2. were also emitted as part of step 1. so our trace had duplicated chunks. This problem becomes worse when you many PSBs. As part of making sure this diff is correct, I added some other features that are very useful. - Added a "synchronization point" event to the TraceCursor, so we can inspect when PSBs are emitted. - Removed the single-thread decoder. Now the per-cpu decoder and single-thread decoder use the same code paths. - Use the query decoder to fetch PSBs and timestamps. It turns out that the pt_insn_sync_forward of the instruction decoder can move past several PSBs (this means that we could skip some TSCs). On the other hand, the pt_query_sync_forward method doesn't skip PSBs, so we can get more accurate sync events and timing information. - Turned LibiptDecoder into PSBBlockDecoder, which decodes single PSB blocks. It is the fundamental processing unit for decoding. - Added many comments, asserts and improved error handling for clarity. - Improved DecodeSystemWideTraceForThread so that a TSC is emitted always before a cpu change event. This was a bug that was annoying me before. - SplitTraceInContinuousExecutions and FindLowestTSCInTrace are now using the query decoder, which can identify precisely each PSB along with their TSCs. - Added an "only-events" option to the trace dumper to inspect only events. I did extensive testing and I think we should have an in-house testing CI. The LLVM buildbots are not capable of supporting testing post-mortem traces of hundreds of megabytes. I'll leave that for later, but at least for now the current tests were able to catch most of the issues I encountered when doing this task. A sample output of a program that I was single stepping is the following. You can see that only one PSB is emitted even though stepping happened! ``` thread #1: tid = 3578223 0: (event) trace synchronization point [offset = 0x0xef0] a.out`main + 20 at main.cpp:29:20 1: 0x0000000000402479 leaq -0x1210(%rbp), %rax 2: (event) software disabled tracing 3: 0x0000000000402480 movq %rax, %rdi 4: (event) software disabled tracing 5: (event) software disabled tracing 6: 0x0000000000402483 callq 0x403bd4 ; std::vector<int, std::allocator<int>>::vector at stl_vector.h:391:7 7: (event) software disabled tracing a.out`std::vector<int, std::allocator<int>>::vector() at stl_vector.h:391:7 8: 0x0000000000403bd4 pushq %rbp 9: (event) software disabled tracing 10: 0x0000000000403bd5 movq %rsp, %rbp 11: (event) software disabled tracing ``` This is another trace of a long program with a few PSBs. ``` (lldb) thread trace dump instructions -E -f thread #1: tid = 3603082 0: (event) trace synchronization point [offset = 0x0x80] 47417: (event) software disabled tracing 129231: (event) trace synchronization point [offset = 0x0x800] 146747: (event) software disabled tracing 246076: (event) software disabled tracing 259068: (event) trace synchronization point [offset = 0x0xf78] 259276: (event) software disabled tracing 259278: (event) software disabled tracing no more data ``` Differential Revision: https://reviews.llvm.org/D131630
1 parent 2366c6a commit e17cae0

20 files changed

+751
-649
lines changed

lldb/include/lldb/Target/TraceCursor.h

+9
Original file line numberDiff line numberDiff line change
@@ -271,6 +271,15 @@ class TraceCursor {
271271
/// The approximate wall clock time for the trace item, or \a llvm::None
272272
/// if not available.
273273
virtual llvm::Optional<double> GetWallClockTime() const = 0;
274+
275+
/// Get some metadata associated with a synchronization point event. As
276+
/// different trace technologies might have different values for this,
277+
/// we return a string for flexibility.
278+
///
279+
/// \return
280+
/// A string representing some metadata associated with a
281+
/// \a eTraceEventSyncPoint event. \b None if no metadata is available.
282+
virtual llvm::Optional<std::string> GetSyncPointMetadata() const = 0;
274283
/// \}
275284

276285
protected:

lldb/include/lldb/Target/TraceDumper.h

+3
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,8 @@ struct TraceDumperOptions {
3434
bool show_timestamps = false;
3535
/// Dump the events that happened between instructions.
3636
bool show_events = false;
37+
/// Dump events and none of the instructions.
38+
bool only_events = false;
3739
/// For each instruction, print the instruction kind.
3840
bool show_control_flow_kind = false;
3941
/// Optional custom id to start traversing from.
@@ -63,6 +65,7 @@ class TraceDumper {
6365
lldb::addr_t load_address;
6466
llvm::Optional<double> timestamp;
6567
llvm::Optional<uint64_t> hw_clock;
68+
llvm::Optional<std::string> sync_point_metadata;
6669
llvm::Optional<llvm::StringRef> error;
6770
llvm::Optional<lldb::TraceEvent> event;
6871
llvm::Optional<SymbolInfo> symbol_info;

lldb/include/lldb/lldb-enumerations.h

+6-3
Original file line numberDiff line numberDiff line change
@@ -1161,15 +1161,18 @@ enum SaveCoreStyle {
11611161

11621162
/// Events that might happen during a trace session.
11631163
enum TraceEvent {
1164-
/// Tracing was disabled for some time due to a software trigger
1164+
/// Tracing was disabled for some time due to a software trigger.
11651165
eTraceEventDisabledSW,
1166-
/// Tracing was disable for some time due to a hardware trigger
1166+
/// Tracing was disable for some time due to a hardware trigger.
11671167
eTraceEventDisabledHW,
11681168
/// Event due to CPU change for a thread. This event is also fired when
11691169
/// suddenly it's not possible to identify the cpu of a given thread.
11701170
eTraceEventCPUChanged,
1171-
/// Event due to a CPU HW clock tick
1171+
/// Event due to a CPU HW clock tick.
11721172
eTraceEventHWClockTick,
1173+
/// The underlying tracing technology emitted a synchronization event used by
1174+
/// trace processors.
1175+
eTraceEventSyncPoint,
11731176
};
11741177

11751178
// Enum used to identify which kind of item a \a TraceCursor is pointing at

lldb/source/Commands/CommandObjectThread.cpp

+5
Original file line numberDiff line numberDiff line change
@@ -2178,6 +2178,11 @@ class CommandObjectTraceDumpInstructions : public CommandObjectParsed {
21782178
m_dumper_options.json = true;
21792179
break;
21802180
}
2181+
case 'E': {
2182+
m_dumper_options.only_events = true;
2183+
m_dumper_options.show_events = true;
2184+
break;
2185+
}
21812186
case 'C': {
21822187
m_continue = true;
21832188
break;

lldb/source/Commands/Options.td

+4
Original file line numberDiff line numberDiff line change
@@ -1160,6 +1160,10 @@ let Command = "thread trace dump instructions" in {
11601160
def thread_trace_dump_instructions_show_events : Option<"events", "e">,
11611161
Group<1>,
11621162
Desc<"Dump the events that happened during the execution of the target.">;
1163+
def thread_trace_dump_instruction_only_events : Option<"only-events", "E">,
1164+
Group<1>,
1165+
Desc<"Dump only the events that happened during the execution of the "
1166+
"target. No instrutions are dumped.">;
11631167
def thread_trace_dump_instructions_continue: Option<"continue", "C">,
11641168
Group<1>,
11651169
Desc<"Continue dumping instructions right where the previous invocation of "

lldb/source/Plugins/Trace/intel-pt/DecodedThread.cpp

+13-24
Original file line numberDiff line numberDiff line change
@@ -16,18 +16,6 @@ using namespace lldb_private;
1616
using namespace lldb_private::trace_intel_pt;
1717
using namespace llvm;
1818

19-
bool lldb_private::trace_intel_pt::IsLibiptError(int libipt_status) {
20-
return libipt_status < 0;
21-
}
22-
23-
bool lldb_private::trace_intel_pt::IsEndOfStream(int libipt_status) {
24-
return libipt_status == -pte_eos;
25-
}
26-
27-
bool lldb_private::trace_intel_pt::IsTscUnavailable(int libipt_status) {
28-
return libipt_status == -pte_no_time;
29-
}
30-
3119
char IntelPTError::ID;
3220

3321
IntelPTError::IntelPTError(int libipt_error_code, lldb::addr_t address)
@@ -103,6 +91,11 @@ DecodedThread::GetInstructionLoadAddress(uint64_t item_index) const {
10391
return m_item_data[item_index].load_address;
10492
}
10593

94+
lldb::addr_t
95+
DecodedThread::GetSyncPointOffsetByIndex(uint64_t item_index) const {
96+
return m_psb_offsets.find(item_index)->second;
97+
}
98+
10699
ThreadSP DecodedThread::GetThread() { return m_thread_sp; }
107100

108101
DecodedThread::TraceItemStorage &
@@ -116,9 +109,17 @@ DecodedThread::CreateNewTraceItem(lldb::TraceItemKind kind) {
116109
return m_item_data.back();
117110
}
118111

112+
void DecodedThread::NotifySyncPoint(lldb::addr_t psb_offset) {
113+
m_psb_offsets.try_emplace(GetItemsCount(), psb_offset);
114+
AppendEvent(lldb::eTraceEventSyncPoint);
115+
}
116+
119117
void DecodedThread::NotifyTsc(TSC tsc) {
120118
if (m_last_tsc && (*m_last_tsc)->second.tsc == tsc)
121119
return;
120+
if (m_last_tsc)
121+
assert(tsc >= (*m_last_tsc)->second.tsc &&
122+
"We can't have decreasing times");
122123

123124
m_last_tsc =
124125
m_tscs.emplace(GetItemsCount(), TSCRange{tsc, 0, GetItemsCount()}).first;
@@ -179,9 +180,6 @@ void DecodedThread::AppendInstruction(const pt_insn &insn) {
179180
}
180181

181182
void DecodedThread::AppendError(const IntelPTError &error) {
182-
// End of stream shouldn't be a public error
183-
if (IsEndOfStream(error.GetLibiptErrorCode()))
184-
return;
185183
CreateNewTraceItem(lldb::eTraceItemKindError).error =
186184
ConstString(error.message()).AsCString();
187185
}
@@ -200,15 +198,6 @@ void DecodedThread::LibiptErrorsStats::RecordError(int libipt_error_code) {
200198
total_count++;
201199
}
202200

203-
void DecodedThread::RecordTscError(int libipt_error_code) {
204-
m_tsc_errors_stats.RecordError(libipt_error_code);
205-
}
206-
207-
const DecodedThread::LibiptErrorsStats &
208-
DecodedThread::GetTscErrorsStats() const {
209-
return m_tsc_errors_stats;
210-
}
211-
212201
const DecodedThread::EventsStats &DecodedThread::GetEventsStats() const {
213202
return m_events_stats;
214203
}

lldb/source/Plugins/Trace/intel-pt/DecodedThread.h

+11-28
Original file line numberDiff line numberDiff line change
@@ -20,15 +20,6 @@
2020
namespace lldb_private {
2121
namespace trace_intel_pt {
2222

23-
/// libipt status utils
24-
/// \{
25-
bool IsLibiptError(int libipt_status);
26-
27-
bool IsEndOfStream(int libipt_status);
28-
29-
bool IsTscUnavailable(int libipt_status);
30-
/// \}
31-
3223
/// Class for representing a libipt decoding error.
3324
class IntelPTError : public llvm::ErrorInfo<IntelPTError> {
3425
public:
@@ -173,6 +164,10 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
173164
/// The requested cpu id, or \a LLDB_INVALID_CPU_ID if not available.
174165
lldb::cpu_id_t GetCPUByIndex(uint64_t item_index) const;
175166

167+
/// \return
168+
/// The PSB offset associated with the given item index.
169+
lldb::addr_t GetSyncPointOffsetByIndex(uint64_t item_index) const;
170+
176171
/// Get a maximal range of trace items that include the given \p item_index
177172
/// that have the same TSC value.
178173
///
@@ -199,28 +194,12 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
199194
/// The load address of the instruction at the given index.
200195
lldb::addr_t GetInstructionLoadAddress(uint64_t item_index) const;
201196

202-
/// Return an object with statistics of the TSC decoding errors that happened.
203-
/// A TSC error is not a fatal error and doesn't create gaps in the trace.
204-
/// Instead we only keep track of them as statistics.
205-
///
206-
/// \return
207-
/// An object with the statistics of TSC decoding errors.
208-
const LibiptErrorsStats &GetTscErrorsStats() const;
209-
210197
/// Return an object with statistics of the trace events that happened.
211198
///
212199
/// \return
213200
/// The stats object of all the events.
214201
const EventsStats &GetEventsStats() const;
215202

216-
/// Record an error decoding a TSC timestamp.
217-
///
218-
/// See \a GetTscErrors() for more documentation.
219-
///
220-
/// \param[in] libipt_error_code
221-
/// An error returned by the libipt library.
222-
void RecordTscError(int libipt_error_code);
223-
224203
/// The approximate size in bytes used by this instance,
225204
/// including all the already decoded instructions.
226205
size_t CalculateApproximateMemoryUsage() const;
@@ -235,6 +214,9 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
235214
/// If this a new CPU, an event will be created.
236215
void NotifyCPU(lldb::cpu_id_t cpu_id);
237216

217+
/// Notify this object that a new PSB has been seen.
218+
void NotifySyncPoint(lldb::addr_t psb_offset);
219+
238220
/// Append a decoding error.
239221
void AppendError(const IntelPTError &error);
240222

@@ -294,20 +276,21 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
294276
llvm::Optional<std::map<uint64_t, NanosecondsRange>::iterator>
295277
m_last_nanoseconds = llvm::None;
296278

297-
// The cpu information is stored as a map. It maps `instruction index -> CPU`
279+
// The cpu information is stored as a map. It maps `item index -> CPU`.
298280
// A CPU is associated with the next instructions that follow until the next
299281
// cpu is seen.
300282
std::map<uint64_t, lldb::cpu_id_t> m_cpus;
301283
/// This is the chronologically last CPU ID.
302284
llvm::Optional<uint64_t> m_last_cpu = llvm::None;
303285

286+
// The PSB offsets are stored as a map. It maps `item index -> psb offset`.
287+
llvm::DenseMap<uint64_t, lldb::addr_t> m_psb_offsets;
288+
304289
/// TSC -> nanos conversion utility.
305290
llvm::Optional<LinuxPerfZeroTscConversion> m_tsc_conversion;
306291

307292
/// Statistics of all tracing events.
308293
EventsStats m_events_stats;
309-
/// Statistics of libipt errors when decoding TSCs.
310-
LibiptErrorsStats m_tsc_errors_stats;
311294
/// Total amount of time spent decoding.
312295
std::chrono::milliseconds m_total_decoding_time{0};
313296
};

0 commit comments

Comments
 (0)