Add Java support #94

Closed
wusphinx opened this issue Feb 23, 2021 · 9 comments

@wusphinx
Contributor

Found this open source project that may help

@chadbrewbaker

Just use the eBPF mode and bcc tools for java scripts. https://github.com/iovisor/bcc/tree/master/tools

@schrepfler

Also this https://github.com/jvm-profiling-tools/async-profiler (also has converters, JFR to Flame Graph, JFR to FlameScope, collapsed stacks to Flame Graph)
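For anyone unfamiliar with the "collapsed stacks" format mentioned above: each line is a semicolon-separated call stack followed by a space and a sample count. Here is a minimal sketch of aggregating such lines. The function name `parse_collapsed` is made up for illustration and is not part of async-profiler or Pyroscope.

```python
from collections import defaultdict


def parse_collapsed(text):
    """Aggregate collapsed-stack lines of the form "frame1;frame2;frame3 count".

    Returns a dict mapping a tuple of frames (root first) to the total count.
    This is an illustrative helper, not an async-profiler API.
    """
    totals = defaultdict(int)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The count is everything after the last space; frames may contain spaces.
        stack, _, count = line.rpartition(" ")
        totals[tuple(stack.split(";"))] += int(count)
    return dict(totals)
```

Identical stacks are summed, which is the aggregation a flame graph renderer needs before it can draw frame widths.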

@ivanyu

ivanyu commented May 16, 2021

Hi,

I've been experimenting with async-profiler, and it preliminarily seems possible to use a slightly modified version of its Java agent library to pass stack traces, in some form, back to the Pyroscope agent.

I'm planning to continue these experiments and would be happy to contribute to Pyroscope.

@petethepig
Member

@ivanyu Hi Ivan, this sounds very promising. I would love to talk in more detail about the integration. I reached out over email. Looking forward to talking to you.

@petethepig
Member

Talked to Ivan yesterday. We decided to go with async-profiler.

We're going to distribute the Java side of the integration as a jar file that people can add to their applications as a Java agent. Here's an example from a blog post Ivan wrote:

java -javaagent:some/path/agent-1.0.0-SNAPSHOT.jar -jar the_application.jar

There are a couple of options for integrating with this on the Pyroscope side:

  • Option A: Pyroscope deals with chunking and HTTP queue handling, and communicates with Java over a UNIX socket or something like that. We could still use pyroscope exec.

  • Option B: Java agent deals with everything.

Option A pros / cons

  • + less code duplication
  • + unified way of integrating with client applications via pyroscope exec
  • - we'd have to implement some communication protocol, which would take time and add complexity
  • - performance-wise it's gonna be worse due to communication overhead
  • - some users find pyroscope exec confusing as it's not something people are particularly familiar with, so we might not support this long term and go some route where people add our integrations via pip / bundler / whatever else is more common for each particular ecosystem
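To give a feel for the "communication protocol" cost in Option A, here is a minimal sketch of what such a protocol might look like: length-prefixed frames carrying a JSON map of collapsed stacks to counts. Everything here (the framing, the JSON payload, the function names) is an assumption for illustration, not something Pyroscope or async-profiler defines. It uses a local socket pair as a stand-in for the UNIX socket.

```python
import json
import socket
import struct


def send_frame(sock, payload: bytes) -> None:
    """Send one frame: a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        buf.extend(chunk)
    return bytes(buf)


def recv_frame(sock) -> bytes:
    """Receive one length-prefixed frame and return its payload."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def roundtrip(samples: dict) -> dict:
    """Send samples from a hypothetical profiler side to a hypothetical agent side."""
    profiler_side, agent_side = socket.socketpair()
    try:
        send_frame(profiler_side, json.dumps(samples).encode("utf-8"))
        return json.loads(recv_frame(agent_side).decode("utf-8"))
    finally:
        profiler_side.close()
        agent_side.close()
```

Even this toy version needs framing, partial-read handling, and an agreed payload encoding on both sides, which is the extra complexity listed above.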

Option B pros / cons

pros / cons are pretty much the opposite of pros and cons of option A


I also made a little diagram illustrating the difference:
[diagram image not included]


Which way should we go?

Based on this list of pros and cons I lean towards option B. @ivanyu @kolesnikovae Would love to hear what you think about these options and this plan in general.

@ivanyu

ivanyu commented May 30, 2021

I'm inclined towards Option B mainly for:

  • its simplicity (fewer moving parts);
  • being well-understood in the Java performance and tracing world (e.g. Datadog, Dynatrace, Elastic APM).

@chadbrewbaker

chadbrewbaker commented May 31, 2021

Agree that virtual heap annotation with standalone executables is best. On Postgres for instance you may want to profile by SQL statement execution.

You want both the virtual and eBPF traces. On one box you could be running Ruby, Go, and two JVMs all at once. Pyroscope should be able to detect whether a PID is running an application with a known virtual annotator. The detector needs to be customizable: binary path, crunching the command line arguments, analysis of the ELF/Mach-O header ... all might be used to fingerprint.
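A rough sketch of what such a customizable detector could look like, using only the binary path and command line arguments. The rule table, the labels, and the function names are all hypothetical; the only external fact relied on is that Linux exposes a process's arguments as NUL-separated bytes in /proc/PID/cmdline.

```python
import posixpath

# Each rule is (label, predicate). Users could append their own rules.
# These particular rules are illustrative guesses, not Pyroscope behavior.
DEFAULT_RULES = [
    ("jvm", lambda exe, argv: posixpath.basename(exe) == "java"
        or any(a == "-jar" or a.startswith("-javaagent:") for a in argv)),
    ("ruby", lambda exe, argv: posixpath.basename(exe).startswith("ruby")),
    ("python", lambda exe, argv: posixpath.basename(exe).startswith("python")),
]


def fingerprint(exe_path, argv, rules=DEFAULT_RULES):
    """Return the label of the first matching rule, or "unknown"."""
    for label, matches in rules:
        if matches(exe_path, argv):
            return label
    return "unknown"


def read_cmdline(pid):
    """Read a process's arguments from /proc (Linux only)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return [part.decode("utf-8", "replace")
                for part in f.read().split(b"\0") if part]
```

A process that comes back as "unknown" would fall through to plain eBPF profiling, while a recognized runtime could get its own annotator in addition.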

The logging format needs to be standardized into a sqlite file format with binary blobs where appropriate.
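As a sketch of the idea, here is one possible SQLite layout with the stack stored as a compressed binary blob. The table name, columns, and the choice of zlib are assumptions made up for this example; nothing here is an existing Pyroscope format.

```python
import sqlite3
import zlib


def open_store(path=":memory:"):
    """Open (or create) a sample store. The schema is purely illustrative."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " pid INTEGER NOT NULL,"
        " ts_ns INTEGER NOT NULL,"
        " runtime TEXT NOT NULL,"
        " stack BLOB NOT NULL)"
    )
    return conn


def add_sample(conn, pid, ts_ns, runtime, frames):
    """Store one stack as a zlib-compressed, semicolon-joined blob."""
    blob = zlib.compress(";".join(frames).encode("utf-8"))
    conn.execute("INSERT INTO samples VALUES (?, ?, ?, ?)",
                 (pid, ts_ns, runtime, blob))


def stacks_for_pid(conn, pid):
    """Return all stacks recorded for a PID, oldest first."""
    rows = conn.execute(
        "SELECT stack FROM samples WHERE pid = ? ORDER BY ts_ns", (pid,))
    return [zlib.decompress(row[0]).decode("utf-8").split(";") for row in rows]
```

The appeal of a single-file format like this is that samples from every runtime on the box end up queryable with plain SQL.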

Memory tracing is the big thing that few are doing. Dumping 16GB RAM for analysis isn't feasible. It is feasible to dump /usr/bin/time -v counters, /proc/PID/memory_maps, and have the ability to run OS pages through the zram compressor (usually zstd) and know how compressible they are. When you have extremely compressible pages you know memory isn't being used efficiently. I'll have to dig into the Linux and XNU kernels to see how to get OS page staleness.
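To illustrate the compressibility idea, here is a small sketch that scores pages by compression ratio. It uses zlib from the standard library as a stand-in for zstd, assumes 4 KiB pages, and the 0.1 threshold is an arbitrary number picked for the example. How the pages are obtained from the OS is out of scope here.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size; real systems may differ


def compression_ratio(page: bytes) -> float:
    """Compressed size divided by original size (lower = more compressible)."""
    if not page:
        raise ValueError("page must not be empty")
    return len(zlib.compress(page)) / len(page)


def highly_compressible(pages, threshold=0.1):
    """Return indexes of pages whose ratio is at or below the threshold."""
    return [i for i, page in enumerate(pages)
            if compression_ratio(page) <= threshold]
```

A page of zeros compresses to a tiny fraction of its size, while random or already-compressed data barely shrinks at all, so a large share of flagged pages hints at wasted memory.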

Generic executables profiled with eBPF need the debug symbols processed. Prodfiler is doing this cloud side by uploading the entire binary - this is a PITA for code that isn't a binary from a major Linux distro. There is innovation here for keeping analysis on localhost for "cargo test" instead of uploading the binary every local build.

OpenTracing is becoming a standard. There should be some way of (PID, timestamp) logging on an incoming trace so you can pull all stacks corresponding to a particular incoming API call.
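A minimal sketch of that lookup: samples are indexed by PID and kept sorted by timestamp, so the stacks for one request can be pulled by the time window of its trace span. The class and method names are hypothetical, and how the span's PID and start/end times reach the profiler is left open.

```python
import bisect
from collections import defaultdict


class SampleIndex:
    """Per-PID stacks kept sorted by timestamp (illustrative, in-memory only)."""

    def __init__(self):
        self._times = defaultdict(list)
        self._stacks = defaultdict(list)

    def add(self, pid, ts, stack):
        """Insert a sample, keeping both per-PID lists ordered by timestamp."""
        i = bisect.bisect_right(self._times[pid], ts)
        self._times[pid].insert(i, ts)
        self._stacks[pid].insert(i, stack)

    def between(self, pid, start, end):
        """Return stacks for pid with start <= timestamp <= end."""
        times = self._times.get(pid, [])
        lo = bisect.bisect_left(times, start)
        hi = bisect.bisect_right(times, end)
        return self._stacks.get(pid, [])[lo:hi]
```

Given a span that ran in PID 42 from t=100 to t=250, `index.between(42, 100, 250)` would return only the stacks sampled while that call was in flight.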

@wenhuwang

Hello, I found in the documentation that the eBPF agent supports Java, and that it requires perf-map-agent. Is there any more detailed documentation?

@korniltsev
Collaborator

korniltsev commented Nov 28, 2023

The docs explaining how to enable perf map are here:
https://github.com/jvm-profiling-tools/perf-map-agent

Although the original Pyroscope supported perf map symbols, the grafana-agent pyroscope.ebpf component does not support them just yet.
I've created an issue to track this: #2766
It should be quick and easy to implement.

8 participants