RFC-0002: Trace Clock Synchronization for Multi-Trace Merging #3104
LalitMaganti
started this conversation in
Team Design Discussions
📄 RFC Doc: 0002-multitrace-clock-sync.md
Trace Clock Synchronization for Multi-Trace Merging
Authors: @LalitMaganti
Status: Draft
Overview
This document outlines the design and implementation plan for handling clock
synchronization when merging multiple trace files in ZIP/TAR archives. The goal
is to establish a unified global clock domain that allows accurate temporal
correlation of events across different trace sources, building upon the existing
single-trace clock synchronization infrastructure.
Key Terminology
This proposal introduces several key concepts that are used throughout the
document:
Clock Domain Categories
Explicit Clocks: Trace formats that contain definitive clock domain
information within the trace data itself (e.g., Proto traces with
ClockSnapshot packets). These clocks cannot and should not be overridden by
external metadata.
Semi-explicit Clocks: Trace formats that have established clock domain
conventions hardcoded in their parsers, but can be overridden via external
metadata when necessary (e.g., Systrace defaults to BUILTIN_CLOCK_BOOTTIME
but can be configured to use other clocks).
Non-explicit Clocks: Trace formats that contain no clock domain
information and require external metadata specification to be properly
synchronized (e.g., Chrome JSON traces).
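The override rules implied by the three categories can be sketched as a small resolution function. This is a minimal illustration only: ClockCategory and ResolveClock are hypothetical names, not existing Perfetto APIs, and clocks are represented as plain strings for brevity.

```cpp
#include <optional>
#include <string>

// Hypothetical names for illustration; not taken from the Perfetto codebase.
enum class ClockCategory { kExplicit, kSemiExplicit, kNonExplicit };

// Returns the clock a trace should use, or std::nullopt when no clock can be
// determined (in which case timestamped events would have to be dropped).
// `embedded_or_default` is the clock declared in the trace (explicit) or the
// parser's hardcoded convention (semi-explicit).
std::optional<std::string> ResolveClock(
    ClockCategory category,
    const std::optional<std::string>& embedded_or_default,
    const std::optional<std::string>& sidecar_override) {
  switch (category) {
    case ClockCategory::kExplicit:
      // Sidecar metadata is ignored: the trace itself is authoritative.
      return embedded_or_default;
    case ClockCategory::kSemiExplicit:
      // Sidecar metadata wins when present, otherwise use the convention.
      return sidecar_override ? sidecar_override : embedded_or_default;
    case ClockCategory::kNonExplicit:
      // Only sidecar metadata can supply a clock.
      return sidecar_override;
  }
  return std::nullopt;
}
```

With this shape, a Chrome JSON trace without sidecar metadata resolves to no clock at all, which matches the proposal's intent of not silently assuming a clock domain.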
Multi-Trace Concepts
Sidecar Metadata: A JSON file (merged_trace_metadata.json) included alongside
trace files in ZIP/TAR archives that specifies clock domains and
synchronization information for traces that need it.
Primary Trace: The trace file that defines the global clock domain for the
entire multi-trace archive. All other traces are synchronized relative to this
trace's timeline.
Global Clock Domain: The unified time reference established by the primary
trace, into which all other trace timestamps are converted for display on a
common timeline.
Perfetto Architecture Terms
ArchiveEntry: A sorting mechanism used to determine processing order of
files within ZIP/TAR archives, ensuring proto traces are processed first.
ForwardingTraceParser: The trace processor component that handles
individual trace files within multi-trace archives, delegating to
format-specific parsers.
ClockSynchronizer: The existing single-trace clock synchronization
infrastructure that handles multiple clock domains within individual traces
via graph-based pathfinding.
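To make "graph-based pathfinding" concrete, the sketch below models clocks as graph nodes and known offsets as edges, then uses breadth-first search to convert a timestamp between two clocks that are only indirectly related. It is a deliberately simplified assumption: ClockGraph is a hypothetical class, and each edge carries a single constant offset, whereas a real synchronizer works from timestamped snapshots.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified clock graph. Each edge stores a constant offset
// such that: ts_in_dst = ts_in_src + offset_ns.
class ClockGraph {
 public:
  // Registers a bidirectional relationship between two clocks.
  void AddOffset(const std::string& src, const std::string& dst,
                 int64_t offset_ns) {
    edges_[src].push_back({dst, offset_ns});
    edges_[dst].push_back({src, -offset_ns});
  }

  // Breadth-first search from `src` to `dst`, accumulating offsets along the
  // path. Returns std::nullopt if the two clocks are not connected.
  std::optional<int64_t> Convert(const std::string& src,
                                 const std::string& dst, int64_t ts) const {
    std::queue<std::pair<std::string, int64_t>> pending;
    std::set<std::string> visited{src};
    pending.push({src, ts});
    while (!pending.empty()) {
      auto [clock, value] = pending.front();
      pending.pop();
      if (clock == dst) return value;
      auto it = edges_.find(clock);
      if (it == edges_.end()) continue;
      for (const auto& [next, offset_ns] : it->second) {
        if (visited.insert(next).second) {
          pending.push({next, value + offset_ns});
        }
      }
    }
    return std::nullopt;
  }

 private:
  std::map<std::string, std::vector<std::pair<std::string, int64_t>>> edges_;
};
```

A disconnected clock yields std::nullopt rather than a guessed value, which is the property the multi-trace design relies on when a trace's clock cannot be related to the global domain.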
Clock Types and Components
BUILTIN_CLOCK_BOOTTIME: System boot time clock that includes time spent in
suspend/sleep states, preferred for trace synchronization.
BUILTIN_CLOCK_MONOTONIC: Monotonic clock that excludes suspend time but
provides steady progression, commonly used as a fallback.
ClockSnapshot: Proto message containing timestamp readings from multiple
clock domains at the same moment, used to establish relationships between
different clocks.
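As an illustration of how a snapshot relates two clocks, the following sketch converts a timestamp using one set of simultaneous readings. Snapshot and ConvertViaSnapshot are hypothetical names; the real ClockSnapshot is a proto message and is not modelled here.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical simplification: clock name -> reading, all taken at the same
// instant.
using Snapshot = std::map<std::string, int64_t>;

// Converts `ts` from clock `src` to clock `dst` by measuring how far `ts` is
// from the snapshot instant in `src` and applying that delta in `dst`.
// Returns std::nullopt if either clock is missing from the snapshot.
std::optional<int64_t> ConvertViaSnapshot(const Snapshot& snapshot,
                                          const std::string& src,
                                          const std::string& dst,
                                          int64_t ts) {
  auto src_it = snapshot.find(src);
  auto dst_it = snapshot.find(dst);
  if (src_it == snapshot.end() || dst_it == snapshot.end()) {
    return std::nullopt;
  }
  return ts - src_it->second + dst_it->second;
}
```

For example, if a snapshot reads BOOTTIME = 1000 and MONOTONIC = 400, a MONOTONIC timestamp of 450 is 50 units after the snapshot and therefore maps to 1050 in BOOTTIME.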
Error Handling Concepts
Soft Errors: Non-fatal errors that allow trace processing to continue
while dropping problematic data and providing user feedback through
statistics.
Graceful Degradation: The ability to continue processing traces even when
some components fail, preserving as much data as possible while clearly
reporting what was lost.
Hard Errors: Fatal errors that completely stop trace processing, avoided
in this proposal in favor of soft error approaches.
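A minimal sketch of the soft-error approach, assuming a hypothetical Event type and ImportStats counter (neither exists under these names in Perfetto): timestamped events are dropped and counted when no clock is known, while non-temporal data is preserved.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical event: timestamp_ns is std::nullopt for non-temporal data
// such as process names or other metadata.
struct Event {
  std::string name;
  std::optional<int64_t> timestamp_ns;
};

// Hypothetical counter surfaced to the user as a statistic.
struct ImportStats {
  uint64_t events_dropped_no_clock = 0;
};

// Soft-error import: never aborts. Timestamped events are dropped and
// counted when the trace has no known clock; everything else is kept.
std::vector<Event> ImportEvents(const std::vector<Event>& events,
                                bool has_clock, ImportStats& stats) {
  std::vector<Event> kept;
  for (const Event& event : events) {
    if (event.timestamp_ns.has_value() && !has_clock) {
      ++stats.events_dropped_no_clock;
      continue;
    }
    kept.push_back(event);
  }
  return kept;
}
```

The caller always gets a usable result plus a count of what was lost, which is the graceful-degradation behaviour described above.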
Trace Format Terms
Proto Traces: Traces using Perfetto's native protobuf format, which
contain rich metadata including explicit clock synchronization information.
ZIP/TAR Archives: Compressed archive formats that can contain multiple
trace files, requiring special handling to process each contained trace
appropriately.
Tokenizer: The component responsible for parsing raw trace data and
converting it into structured events that the trace processor can analyze.
Current State Analysis
Existing Architecture
Perfetto already has sophisticated clock synchronization for single traces:
via graph-based pathfinding
MultiTraceOpen plugin allows loading multiple trace files
independently
information
Current Processing Pipeline:
Current Problems
global clock domain (typically BUILTIN_CLOCK_BOOTTIME) regardless of their
actual clock source. For example, Chrome JSON traces implicitly assume
BOOTTIME even when they may actually use a different clock domain.
domains appear on the same timeline with wrong relative timing, making
cross-trace analysis unreliable or misleading.
domain used by traces that don't explicitly declare it (Category 2 and 3
traces).
clock domain mismatches or way to correct them.
Enhancement Goals
Primary Objectives
non-explicit clock traces appropriately based on their format characteristics
domains and offsets for traces via JSON metadata
clock metadata while preserving non-temporal data
multi-trace dialog
unchanged. Multi-trace behavior is preserved for explicit and semi-explicit
traces, but non-explicit traces (e.g., JSON) will require metadata or drop
timestamped events.
Success Criteria
Clock Categorization Framework
Based on analysis of trace format parsers, traces fall into three categories:
Category 1: Explicit Clocks (no override allowed)
These formats contain explicit clock domain information that must be respected:
ClockSnapshot packets with explicit clock relationships
(BUILTIN_CLOCK_PERF when available)
information
Category 2: Semi-explicit Clocks (convention with override)
These formats have established clock conventions but allow metadata override:
BUILTIN_CLOCK_BOOTTIME or specified ftrace clock
BUILTIN_CLOCK_MONOTONIC
Category 3: Non-explicit Clocks (requires sidecar metadata)
These formats lack inherent clock information and require external
specification:
Container Types (special handling)
Implementation Plan
Phase 1: ClockTracker Architecture Refactoring
Status: Not Started
Task 1.1: Separate ClockSynchronizer into Pure Algorithm and Coordination Layers
Files:
src/trace_processor/util/clock_synchronizer.h
src/trace_processor/importers/common/clock_tracker.h
src/trace_processor/importers/common/clock_tracker.cc
Priority: High (foundational architectural change)
Dependencies: None
Priority Field Explanation: The Priority field indicates the importance and
urgency of completing each task. High priority tasks are foundational or
critical path items that block subsequent work. Medium priority tasks improve
user experience or enable additional features. This prioritization helps
developers understand which tasks should be completed first when resources are
limited.
Current Architecture Problems:
The current ClockSynchronizer conflates two distinct responsibilities:
algorithms)
evolving)
This makes it difficult to add multi-trace coordination without disrupting the
core algorithms.
New Two-Layer Architecture:
Parser Usage Patterns by Category:
Category 1 (Explicit - Proto, Perf with explicit clocks):
Category 2 (Semi-explicit - Gecko, Systrace):
Category 3 (Non-explicit - JSON):
Implementation Steps:
ClockGraphEngine
tp: refactor ClockSynchronizer into algorithm and coordination layers
Task 1.2: Add Multi-Trace Clock Statistics
Files:
src/trace_processor/storage/stats.h
Priority: High (essential for user feedback)
Dependencies: Task 1.1
New Statistics:
Implementation Steps:
PERFETTO_TP_STATS macro
tp: add statistics for multi-trace clock synchronization
Phase 2: Parser Integration with New ClockTracker
Status: Not Started
Task 2.1: Update All Parsers to Use New ClockTracker Interface
Files:
ForwardingTraceParser
Priority: High (core functionality)
Dependencies: Phase 1 complete
Parser Refactoring by Category:
Category 1 Parsers (Explicit Clocks): Update parsers that currently call
SetTraceTimeClock() directly:
Category 2 Parsers (Semi-explicit Clocks): Update parsers to fetch the
configured clock or use the default:
Category 3 Parsers (Non-explicit Clocks): Update parsers to require
configuration or drop events:
Trace File ID Integration: Update ForwardingTraceParser to pass the trace
file ID to parsers:
Implementation Steps:
tp: update all parsers to use new ClockTracker interface
Task 2.2: Configure Individual Trace Parsers for Clock Handling
Files:
src/trace_processor/forwarding_trace_parser.cc
Priority: High (enables per-trace clock configuration)
Dependencies: Task 2.1
Clock Configuration Strategy:
Per-Tokenizer Modifications:
Implementation Steps:
tp: configure individual trace parsers for multi-trace clock handling
Phase 3: Sidecar Metadata Implementation
Status: Not Started
Task 3.1: Define and Implement Sidecar JSON Metadata Schema
Files:
Dependencies: Phase 2 complete
Metadata Schema Definition:
Key Schema Elements:
Archive Integration:
Implementation Steps:
tp: implement sidecar JSON metadata for multi-trace clock configuration
Task 3.2: Error Reporting and User Guidance
Files:
Priority: Medium (user experience improvement)
Dependencies: Task 3.1
Error Reporting Enhancements:
failures
Error Message Examples:
multi_trace_clock_metadata_missing → "Some traces require clock
configuration. Please specify clock settings for: chrome_trace.json"
multi_trace_events_dropped_no_clock → "Dropped 1,234 events from traces
lacking clock metadata. Timeline may be incomplete."
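As a rough sketch of how the second message could be derived from a statistic value, the helpers below format a dropped-event count with thousands separators. Both function names are hypothetical, and the RFC places this reporting in the UI; C++ is used here only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Inserts "," every three digits from the right, e.g. 1234 -> "1,234".
std::string WithThousandsSeparators(uint64_t n) {
  std::string digits = std::to_string(n);
  for (int pos = static_cast<int>(digits.size()) - 3; pos > 0; pos -= 3) {
    digits.insert(static_cast<std::size_t>(pos), ",");
  }
  return digits;
}

// Hypothetical helper: builds the user-facing message for the
// multi_trace_events_dropped_no_clock statistic. Returns an empty string
// when nothing was dropped, so callers can skip showing a warning.
std::string DroppedEventsMessage(uint64_t dropped) {
  if (dropped == 0) return "";
  return "Dropped " + WithThousandsSeparators(dropped) +
         " events from traces lacking clock metadata. "
         "Timeline may be incomplete.";
}
```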
Implementation Steps:
ui: improve error reporting for multi-trace clock issues
Phase 4: UI Integration and User Experience
Status: Not Started
Task 4.1: Extend Multi-Trace Dialog for Clock Configuration
Files:
ui/src/core_plugins/dev.perfetto.MultiTraceOpen/multi_trace_modal.ts
ui/src/core_plugins/dev.perfetto.MultiTraceOpen/multi_trace_controller.ts
Priority: Medium (user-facing functionality)
Dependencies: Phase 3 complete (needs JSON schema)
UI Workflow Enhancement:
needing configuration
metadata
merged_trace_metadata.json from UI settings
Clock Configuration UI Elements:
assumptions
global clock domain
defaults
required clocks
traces
TraceAnalyzer Integration:
Implementation Steps:
ui: add clock configuration to multi-trace dialog
Phase 5: Testing and Validation
Status: Not Started
Task 5.1: Comprehensive Multi-Trace Testing
Files:
Priority: High (ensure correctness)
Dependencies: All previous phases
Testing Scenarios:
traces
formats
Task 4.2: Documentation and Examples
Files:
Priority: Medium (user enablement)
Dependencies: Task 4.1
Documentation Deliverables:
multi-trace setup
supported format
scenarios
Progress Tracking
Phase 1 Progress: 0/2 tasks complete
Layers
Phase 2 Progress: 0/2 tasks complete
Phase 3 Progress: 0/2 tasks complete
Phase 4 Progress: 0/2 tasks complete
Phase 5 Progress: 0/1 tasks complete
Key Implementation Notes
Clock Selection Priority
Global Clock Domain Selection:
is_primary: true from metadata
primary specified
information
during processing
Error Handling Philosophy
Soft Error Approach:
reports
issues
Performance Considerations
Metadata Processing:
checks
processor context
Backward Compatibility
Behavior Changes and Compatibility:
etc.)
but can be overridden with metadata
others will drop timestamped events without metadata (previously worked with
assumed clock domain)
Future Enhancements (Out of Scope)
Potential Follow-up Work
relationships
sync
relationships
synchronization
architectures
Meeting Notes Integration
This proposal directly addresses requirements identified in the September 24,
2025 meeting:
Critical Requirements Addressed
established before processing
processed first
display with clear errors
specification
clock traces appropriately
clock metadata
Non-Interactive Processing Maintained
Git Workflow and CL Stacking
Branch Strategy
Use git new-branch --parent <parent-branch> dev/lalitm/<branch-name> to create
a proper CL stack:
Commit Points
Each task should result in a separate commit/CL:
tp: add sidecar JSON metadata schema for multi-trace clock sync
tp: add statistics for multi-trace clock synchronization
tp: integrate clock metadata parsing into archive processing
tp: configure individual trace parsers for multi-trace clock handling
ui: add clock configuration to multi-trace dialog
ui: improve error reporting for multi-trace clock issues
Testing and Validation
Before each commit:
Usage Instructions for Incremental Implementation
git new-branch --parent for each task
testable state
comprehensive testing
implementation reveals new requirements
This proposal provides a roadmap for implementing multi-trace clock
synchronization while maintaining backward compatibility and providing a
consistent user experience through both programmatic and UI interfaces.