Figure out format for sampling profiler data #121

Closed
emilsoman opened this issue May 31, 2015 · 6 comments

@emilsoman
Contributor

Here are my thoughts:

  1. Use a dictionary for storing strings in the sampling data (like filenames) and send that as the first payload. This will reduce the size of the payloads sent afterwards. See #120 (Message aggregator should prepare a string dictionary and send that as a message over to the client).
  2. Each sample will be of this format:
{
  timestamp: <timestamp in ms>,
  event_type: 'cpu_sample',
  correlation_id: <id indicating event this message is part of>,
  complete_message_count: <total number of messages this event is split into>,
  payload: [
    {
      timestamp: <timestamp when sample was collected>,
      frames: [
        {
          method_name: <name of method >,
          file: <filename>,
          line: <line no>,
          classpath: <Class path of method>,
          singleton_method: <true/false>
        }, #Frame 1
        ...
      ]
    }, #Sample 1
    ...
  ] # As many samples as we can fit in this payload
}
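
The dictionary idea in point 1 could look roughly like the sketch below. This is only an illustration: `StringDictionary`, `intern`, and `to_payload` are hypothetical names, and the file paths are made-up sample data, not anything taken from rbkit.

```ruby
# Hypothetical helper: maps each distinct string to a small integer ID so
# that repeated strings (e.g. filenames) are sent only once.
class StringDictionary
  def initialize
    @ids = {}
  end

  # Returns the ID for +string+, assigning the next free ID on first use.
  def intern(string)
    @ids[string] ||= @ids.size
  end

  # The table that would be sent as the first payload: ID => string.
  def to_payload
    @ids.invert
  end
end

dict = StringDictionary.new
frames = [
  { file: '/app/models/user.rb', line: 10 },
  { file: '/app/models/user.rb', line: 42 },
  { file: '/app/lib/util.rb', line: 7 }
]

# Replace each filename with its dictionary ID before sending the frames.
compact = frames.map { |frame| frame.merge(file: dict.intern(frame[:file])) }
```

The receiver would keep the table from `to_payload` and resolve the IDs back to strings when it reads later payloads.
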
@emilsoman
Contributor Author

We'll not implement any string dictionary in the first cut.

@emilsoman
Contributor Author

I made a few changes after working on this feature a bit:

{
  event_type: 'cpu_samples',
  timestamp: <timestamp in milliseconds>,
  correlation_id: <ID_INDICATING_EVENT_THIS_MESSAGE_IS_PART_OF>,
  sample_count: <total number of samples collected>,
  payload: [
    {
      timestamp: <timestamp when sample was collected>,
      frames: [
        {
          method_name: <name of method >,
          label: <method label (See below for details)>,
          file: <filename>,
          line: <line no>,
          singleton_method: <1/0>,
          thread_id: <thread id>
        }, #Frame 1
        ...
      ]
    }, #Sample 1
    ...
  ] # As many samples as we can fit in this payload
}

The Ruby VM gives us a "full label" for each frame in the call stack, which tells us
whether the frame is inside a block, and in which method and class. Some examples:

"block (2 levels) in SampleClass#foobar" # foobar is an instance method of SampleClass. The frame is 2 blocks deep inside foobar.
"#{obj.inspect}.zab" # Frame is in zab, a singleton method defined on an object obj
"SampleClass#baz" # Frame is in baz, an instance method defined in SampleClass
"SampleClass.bar" # Frame is in bar, a class method defined in SampleClass
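
A client could split such a label into its parts with plain string handling. The sketch below is an assumption about how that might be done, based only on the example labels in this thread; `parse_full_label` and `BLOCK_PREFIX` are hypothetical names and are not part of rbkit or the Ruby VM.

```ruby
# Matches "block in " and "block (N levels) in " at the start of a label.
BLOCK_PREFIX = /\Ablock(?: \((\d+) levels\))? in /

# Splits a full label into block depth, owner, method name and kind.
def parse_full_label(label)
  depth = 0
  rest = label
  if (m = BLOCK_PREFIX.match(label))
    depth = m[1] ? m[1].to_i : 1
    rest = m.post_match
  end

  # Labels for singleton methods on plain objects start with the object's
  # inspect output ("#<...>"), so there the last "." is the separator.
  separator, kind =
    if rest.start_with?('#<')
      ['.', :singleton_method]
    elsif rest.include?('#')
      ['#', :instance_method]
    elsif rest.include?('.')
      ['.', :singleton_method]
    end

  result = { block_depth: depth, owner: nil, method: nil, kind: :other }
  idx = separator && rest.rindex(separator)
  return result unless idx

  result.merge(owner: rest[0...idx], method: rest[(idx + 1)..-1], kind: kind)
end

parsed = parse_full_label('block (2 levels) in SampleClass#foobar')
```

Labels without a recognizable owner, such as "block (4 levels) in <top (required)>", fall through with kind `:other` and only the block depth filled in.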

@emilsoman
Contributor Author

@iffyuva @ishankhare07 Since we want to send samples as we collect them instead of waiting for sampling to stop, we won't have a correlation_id or sample_count. We'll use the following format and send only one sample in each event (which will still be aggregated in an event_collection message):

{
  event_type: 'cpu_sample',
  timestamp: <timestamp in milliseconds>,
  payload: [
    {
      method_name: <name of method >,
      label: <method label (see the earlier comment for details)>,
      file: <filename>,
      line: <line no>,
      singleton_method: <1/0>,
      thread_id: <thread id>
    }, #Frame 1
    ...
  ] # Array of frames in the sample
}
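
To make the shape concrete, here is a minimal Ruby sketch that builds one such event from a list of frames. The `Frame` struct and `cpu_sample_event` are hypothetical names used for illustration; the real events are produced inside rbkit, and the wrapping event_collection message is not shown.

```ruby
# Hypothetical in-memory representation of one stack frame.
Frame = Struct.new(:method_name, :label, :file, :line, :singleton_method, :thread_id)

# Builds a single cpu_sample event: one sample, payload is its array of frames.
def cpu_sample_event(frames, timestamp_ms)
  {
    event_type: 'cpu_sample',
    timestamp: timestamp_ms,
    # singleton_method is sent as 1/0 rather than true/false.
    payload: frames.map { |f| f.to_h.merge(singleton_method: f.singleton_method ? 1 : 0) }
  }
end

frames = [
  Frame.new('bar', 'SampleClass.bar', 'sample.rb', 27, true, 1),
  Frame.new('foo', 'SampleClass#foo', 'sample.rb', 31, false, 1)
]
event = cpu_sample_event(frames, (Time.now.to_f * 1000).round(3))
```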

@ishankhare07

@emilsoman You mean something like this?
This would be one message that we'll receive from the server, right?

{
  'event_type': 'cpu_samples',
  'payload': [
    {
      'file': 'dnbsfwyvcs',
      'label': 'SampleClass.bar',
      'line': 107,
      'method_name': 'ekxxjcmykd',
      'singleton_method': 1,
      'thread_id': 'dnbsfwyvcs'},
    {
      'file': 'nlnlsnbaui',
      'label': 'SampleClass.bar',
      'line': 535,
      'method_name': 'gjjxpikksc',
      'singleton_method': 0,
      'thread_id': 'dnbsfwyvcs'},
    {
      'file': 'wxvulkduny',
      'label': 'SampleClass#baz',
      'line': 871,
      'method_name': 'dnbsfwyvcs',
      'singleton_method': 0,
      'thread_id': 'fehgefhvzr'},
    {
      'file': 'wxvulkduny',
      'label': '#{obj.inspect}.zab',
      'line': 185,
      'method_name': 'nlnlsnbaui',
      'singleton_method': 0,
      'thread_id': 'nlnlsnbaui'}
  ],
  'timestamp': 1435676442933.332
}

@emilsoman
Contributor Author

@ishankhare07 You get something that looks like this:

{
  0=>9,
  1=>1436249360439.0,
  2=>
  [
    {12=>"find_many_square_roots", 13=>"block in RSpec::ExampleGroups::CPUSampling#find_many_square_roots", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>16, 14=>0, 15=>70227174228480},
    {12=>"find_many_square_roots", 13=>"RSpec::ExampleGroups::CPUSampling#find_many_square_roots", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>15, 14=>0, 15=>70227174228480},
    {12=>nil, 13=>"block (4 levels) in <top (required)>", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>58, 14=>0, 15=>70227174228480},
    {12=>"zab", 13=>"#<SampleClassForTest::Sample2:0x007fbe13829140>.zab", 6=>"(eval)", 7=>1, 14=>1, 15=>70227174228480},
    {12=>"baz", 13=>"SampleClassForTest::Sample2#baz", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>21, 14=>0, 15=>70227174228480},
    {12=>"bar", 13=>"SampleClassForTest.bar", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>27, 14=>1, 15=>70227174228480},
    {12=>"foo", 13=>"SampleClassForTest#foo", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>31, 14=>0, 15=>70227174228480},
    {12=>nil, 13=>"block (3 levels) in <top (required)>", 6=>"/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb", 7=>40, 14=>0, 15=>70227174228480}
  ]
}
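
On the receiving side, the integer keys can be mapped back to field names. The sketch below assumes a key mapping inferred by lining up this dump with the named format earlier in the thread (for example, 12 as method_name and 13 as label); it has not been checked against the rbkit source, and `decode_cpu_sample` and `rename_keys` are hypothetical helper names.

```ruby
# Assumed mappings, inferred from the dump above; not taken from rbkit itself.
EVENT_KEYS = { 0 => :event_type, 1 => :timestamp, 2 => :payload }.freeze
FRAME_KEYS = {
  6 => :file, 7 => :line, 12 => :method_name, 13 => :label,
  14 => :singleton_method, 15 => :thread_id
}.freeze

# Returns a copy of +hash+ with known integer keys replaced by names.
# Unknown keys are kept as they are.
def rename_keys(hash, mapping)
  hash.each_with_object({}) { |(key, value), out| out[mapping.fetch(key, key)] = value }
end

# Decodes one integer-keyed cpu_sample event into named fields.
def decode_cpu_sample(raw)
  event = rename_keys(raw, EVENT_KEYS)
  event[:payload] = event[:payload].map { |frame| rename_keys(frame, FRAME_KEYS) }
  event
end

raw = {
  0 => 9,
  1 => 1436249360439.0,
  2 => [
    { 12 => 'bar', 13 => 'SampleClassForTest.bar',
      6 => '/Users/emil/OpenSource/rbkit/spec/cpu_sampling_spec.rb',
      7 => 27, 14 => 1, 15 => 70227174228480 }
  ]
}
decoded = decode_cpu_sample(raw)
```

The event type stays numeric here (9 in the dump), since the thread does not list the full table of event type codes.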

@emilsoman
Contributor Author

We've finalized the format as shown above. Closing.
