Feat/poc continuous profiling #4556

Draft
wants to merge 21 commits into
base: main

Conversation

lbloder
Collaborator

@lbloder lbloder commented Jul 15, 2025

📜 Description

💡 Motivation and Context

Initial implementation of #2635

💚 How did you test it?

📝 Checklist

  • I added tests to verify the changes.
  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
  • I updated the docs if needed.
  • I updated the wizard if needed.
  • Review from the native team if needed.
  • No breaking change or entry added to the changelog.
  • No breaking change for hybrid SDKs or communicated to hybrid SDKs.

🔮 Next steps

lbloder added 21 commits April 25, 2025 12:14
…sed on jfr converter bundled with asyncprofiler
… use existing SentryStackFrame instead of JfrFrame,
…t in SentrySpan to work around scientific notation of double, use wall clock profiling
# Conflicts:
#	sentry/build.gradle.kts
#	sentry/src/test/java/io/sentry/ExternalOptionsTest.kt
#	sentry/src/test/java/io/sentry/JsonSerializerTest.kt
#	sentry/src/test/java/io/sentry/SentryClientTest.kt
#	sentry/src/test/java/io/sentry/SentryOptionsTest.kt
Contributor

Fails
🚫 Please consider adding a changelog entry for the next release.
Messages
📖 Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Instructions and example for changelog

Please add an entry to CHANGELOG.md to the "Unreleased" section. Make sure the entry includes this PR's number.

Example:

## Unreleased

- Feat/poc continuous profiling ([#4556](https://github.com/getsentry/sentry-java/pull/4556))

If none of the above apply, you can opt out of this check by adding #skip-changelog to the PR description.

Generated by 🚫 dangerJS against 03a20dd

Contributor

Performance metrics 🚀

               Plain       With Sentry   Diff
Startup time   447.81 ms   457.62 ms     9.80 ms
Size           1.58 MiB    2.09 MiB      520.14 KiB

Comment on lines +124 to +133
long divisor = jfr.ticksPerSec / 1000_000_000L;
long myTimeStamp =
    jfr.chunkStartNanos + ((event.time - jfr.chunkStartTicks) / divisor);

JfrSample sample = new JfrSample();
Instant instant = Instant.ofEpochSecond(0, myTimeStamp);
double timestampDouble =
    instant.getEpochSecond() + instant.getNano() / 1_000_000_000.0;

sample.timestamp = timestampDouble;
Collaborator Author

I'll revisit this in a follow-up PR; I think the timestamp calculation is incorrect.

Collaborator Author

Reworked in follow-up (#4576)
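
For reference, a minimal sketch of a tick-to-timestamp conversion that avoids the integer-division pitfall in the snippet under review: `jfr.ticksPerSec / 1000_000_000L` truncates to 0 whenever the tick frequency is below 1 GHz, and the next division then throws `ArithmeticException`. This is an illustration only and not necessarily what #4576 does; `JfrTimestamps`, `toEpochNanos`, and `toEpochSeconds` are hypothetical names.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, not part of the PR.
final class JfrTimestamps {
  private static final long NANOS_PER_SECOND = 1_000_000_000L;

  private JfrTimestamps() {}

  /** Converts an event tick count to nanoseconds since the epoch. */
  static long toEpochNanos(
      long eventTicks, long chunkStartTicks, long chunkStartNanos, long ticksPerSec) {
    if (ticksPerSec <= 0) {
      throw new IllegalArgumentException("ticksPerSec must be positive, got " + ticksPerSec);
    }
    long deltaTicks = eventTicks - chunkStartTicks;
    // Multiply before dividing so tick rates below 1 GHz keep their precision;
    // BigInteger guards the intermediate product against long overflow.
    long deltaNanos =
        BigInteger.valueOf(deltaTicks)
            .multiply(BigInteger.valueOf(NANOS_PER_SECOND))
            .divide(BigInteger.valueOf(ticksPerSec))
            .longValueExact();
    return Math.addExact(chunkStartNanos, deltaNanos);
  }

  /** Epoch seconds with nanosecond precision, as an exact decimal rather than a double. */
  static BigDecimal toEpochSeconds(long epochNanos) {
    return BigDecimal.valueOf(epochNanos, 9);
  }
}
```

Calling `toPlainString()` on the `BigDecimal` yields a plain decimal such as `1.500000000`, which may be one way to sidestep the scientific notation of `double` mentioned in the commit list.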

@adinauer
Member

@sentry review

Comment on lines +297 to +310

if (profileChunk.getPlatform().equals("java")) {
  final IProfileConverter profileConverter =
      ProfilingServiceLoader.loadProfileConverter();
  if (profileConverter != null) {
    try {
      final SentryProfile profile =
          profileConverter.convertFromFile(traceFile.toPath());
      profileChunk.setSentryProfile(profile);
    } catch (IOException e) {
      throw new SentryEnvelopeException("Profile conversion failed");
    }
  }
} else {

The profile conversion logic should have proper error handling and fallback. If conversion fails, the current implementation throws an exception, but it might be better to log the error and continue with a degraded experience.
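
A minimal sketch of the log-and-continue approach suggested here, assuming it is acceptable to send a chunk without a converted profile (which this thread does not establish). `ProfileConversionFallback`, its nested `Converter` interface, and `tryConvert` are hypothetical stand-ins, and `java.util.logging` is used only to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Optional;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of the PR.
final class ProfileConversionFallback {
  /** Hypothetical stand-in for the converter interface used in the PR. */
  interface Converter<T> {
    T convertFromFile(Path jfrFile) throws IOException;
  }

  private static final Logger LOGGER =
      Logger.getLogger(ProfileConversionFallback.class.getName());

  private ProfileConversionFallback() {}

  /** Returns the converted profile, or empty if no converter exists or conversion fails. */
  static <T> Optional<T> tryConvert(Converter<T> converter, Path jfrFile) {
    if (converter == null) {
      return Optional.empty();
    }
    try {
      return Optional.ofNullable(converter.convertFromFile(jfrFile));
    } catch (IOException e) {
      // Keep the cause in the log instead of discarding it, then degrade gracefully.
      LOGGER.log(Level.WARNING, "Profile conversion failed for " + jfrFile, e);
      return Optional.empty();
    }
  }
}
```

If throwing remains the preferred behavior, passing the caught `IOException` along as the cause would at least preserve the stack trace that the reviewed snippet drops.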


Comment on lines +36 to +43
  return NoOpContinuousProfiler.getInstance();
} catch (Throwable t) {
  logger.log(
      SentryLevel.ERROR,
      "Failed to load continuous profiler provider, using NoOpContinuousProfiler",
      t);
  return NoOpContinuousProfiler.getInstance();
}

The service loader methods silently catch all Throwable and either return null or a no-op implementation. This could hide important configuration or setup errors. Consider logging the errors at appropriate levels and being more specific about which exceptions to catch.
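
One way to narrow the catch, sketched with the standard `java.util.ServiceLoader`: a missing provider is treated as the normal "not installed" case, while `ServiceConfigurationError` (a broken provider registration) is logged as an error. `ProviderLoading` and `loadFirstOrFallback` are hypothetical names; the PR's actual loader may need to catch more, for example linkage errors from native libraries.

```java
import java.util.Iterator;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of the PR.
final class ProviderLoading {
  private static final Logger LOGGER = Logger.getLogger(ProviderLoading.class.getName());

  private ProviderLoading() {}

  /** Returns the first registered provider of the service, or the fallback if none loads. */
  static <T> T loadFirstOrFallback(Class<T> service, Supplier<T> fallback) {
    try {
      Iterator<T> providers = ServiceLoader.load(service).iterator();
      if (providers.hasNext()) {
        return providers.next();
      }
      // No provider on the classpath: expected when the optional module is absent.
      LOGGER.log(Level.FINE, "No provider registered for {0}, using fallback", service.getName());
    } catch (ServiceConfigurationError e) {
      // A provider is registered but cannot be instantiated: a real setup problem.
      LOGGER.log(
          Level.SEVERE, "Provider for " + service.getName() + " is misconfigured, using fallback", e);
    }
    return fallback.get();
  }
}
```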


Comment on lines +190 to +210
  startProfileChunkTimestamp = new SentryNanotimeDate();
}
filename = profilingTracesDirPath + File.separator + SentryUUID.generateSentryId() + ".jfr";
String startData = null;
try {
  final String profilingIntervalMicros =
      String.format("%dus", (int) SECONDS.toMicros(1) / profilingTracesHz);
  final String command =
      String.format(
          "start,jfr,event=wall,interval=%s,file=%s", profilingIntervalMicros, filename);
  System.out.println(command);
  startData = profiler.execute(command);
} catch (Exception e) {
  logger.log(SentryLevel.ERROR, "Failed to start profiling: ", e);
}
// check if profiling started
if (startData == null) {
  return;
}

isRunning = true;

The profiler initialization and AsyncProfiler command construction should have more robust error handling. The current implementation may not handle all edge cases properly.
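
Two concrete edge cases visible in the snippet are a sampling rate of 0 (integer division by zero) and the leftover `System.out.println`. A minimal sketch of command construction with up-front validation follows. The command string is copied from the snippet; `ProfilerCommands` and `startCommand` are hypothetical names, and rejecting commas in the path rests on the assumption that async-profiler splits its argument string on commas.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the PR.
final class ProfilerCommands {
  private ProfilerCommands() {}

  /** Builds the wall-clock JFR start command for the given sampling rate and output file. */
  static String startCommand(int samplingHz, String jfrFile) {
    if (samplingHz <= 0) {
      throw new IllegalArgumentException("samplingHz must be positive, got " + samplingHz);
    }
    if (jfrFile == null || jfrFile.isEmpty()) {
      throw new IllegalArgumentException("jfrFile must not be empty");
    }
    if (jfrFile.indexOf(',') >= 0) {
      // Assumption: a comma would be read as an option separator by the profiler.
      throw new IllegalArgumentException("jfrFile must not contain ',': " + jfrFile);
    }
    // Clamp to at least 1 microsecond so very high rates do not produce interval=0us.
    long intervalMicros = Math.max(1L, TimeUnit.SECONDS.toMicros(1) / samplingHz);
    // Locale.ROOT keeps the digits ASCII regardless of the JVM's default locale.
    return String.format(
        Locale.ROOT, "start,jfr,event=wall,interval=%dus,file=%s", intervalMicros, jfrFile);
  }
}
```

For example, `ProfilerCommands.startCommand(101, "/tmp/a.jfr")` returns `start,jfr,event=wall,interval=9900us,file=/tmp/a.jfr`.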


@@ -578,6 +578,8 @@ public class SentryOptions {

private @NotNull ISocketTagger socketTagger = NoOpSocketTagger.getInstance();

private @Nullable String profilingTracesDirPath;
Member

m Can we simply default this to use File.createTempFile or similar so all that's needed is the sample rate for profiling to work?

Collaborator Author

Added in follow-up (#4576)
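
A minimal sketch of such a default, using a temporary directory rather than `File.createTempFile` since the option names a directory. This is an illustration and not necessarily how #4576 implements it; `ProfilingDirs`, `resolveTracesDir`, and the `sentry-profiling-traces` prefix are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the PR.
final class ProfilingDirs {
  private ProfilingDirs() {}

  /** Uses the configured directory if set, otherwise creates a fresh temporary one. */
  static Path resolveTracesDir(String configuredPath) {
    try {
      if (configuredPath != null && !configuredPath.trim().isEmpty()) {
        // Create the configured directory (and parents) if it does not exist yet.
        return Files.createDirectories(Paths.get(configuredPath));
      }
      // Fall back to a new directory under java.io.tmpdir.
      return Files.createTempDirectory("sentry-profiling-traces");
    } catch (IOException e) {
      throw new UncheckedIOException("Could not prepare profiling traces directory", e);
    }
  }
}
```

With a fallback like this, setting only the sample rate would be enough for profiling to find a writable location.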

sentry.enablePrettySerializationOutput=false
in-app-includes="io.sentry.samples"
sentry.logs.enabled=true
sentry.profile-session-sample-rate=1.0
Member

Doesn't this also need the path set?

Member

The dependency on async-profiler also seems to be missing from the Gradle build file.

Collaborator Author

Fixed in follow-up (#4576)

Member

@lcian lcian left a comment

Could you also document the commit SHA of async-profiler we're vendoring?
Perhaps as a README.md in sentry-async-profiler.
Might come in handy for troubleshooting/upgrades in the future and could be hard to find the exact commit later.

We could also add the vendored directory path to the ignored ones in our codecov.yml so that it won't impact coverage stats.

if (base64Trace.isEmpty()) {
throw new SentryEnvelopeException("Profiling trace file is empty");

if (profileChunk.getPlatform().equals("java")) {
Member

Or we can make it into an enum instead of string
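
A minimal sketch of what an enum could look like. Only the `"java"` value appears in the diff; the `ANDROID` constant is an assumption about the other branch, and `ProfilePlatform`, `wireName`, and `fromWireName` are hypothetical names.

```java
import java.util.Optional;

// Hypothetical enum, not part of the PR.
enum ProfilePlatform {
  ANDROID("android"), // assumption: the value handled by the else branch
  JAVA("java");

  private final String wireName;

  ProfilePlatform(String wireName) {
    this.wireName = wireName;
  }

  /** The string written to the serialized profile chunk. */
  String wireName() {
    return wireName;
  }

  /** Looks up a platform by its serialized name; empty for unknown or null input. */
  static Optional<ProfilePlatform> fromWireName(String name) {
    for (ProfilePlatform platform : values()) {
      if (platform.wireName.equals(name)) {
        return Optional.of(platform);
      }
    }
    return Optional.empty();
  }
}
```

The check in the diff would then read `platform == ProfilePlatform.JAVA`, which avoids typos in string literals and a `NullPointerException` if `getPlatform()` ever returns null.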


public int stackId;

public @Nullable String threadId;
Member

@lcian lcian Jul 25, 2025

This is documented as required in the dev docs. Perhaps we could provide a default value if it's missing, though I'm not sure that makes much sense.

@lbloder lbloder mentioned this pull request Jul 29, 2025
10 tasks