Skip to content

OutOfMemory/Slow performance upon service creation/deletion #1096

@martin-d-aleksandrov

Description

@martin-d-aleksandrov

Hi Friends,

we recently noticed a strange behaviour when creating/deleting service instances using cf-java-client that leads to OutOfMemoryException and really slow performance. The issue occurs upon unsuccessful service creation/deletion, when it takes about 20 seconds to build the underlying reactor stacktrace. We confirmed that the stacktrace building is the time consuming part of the execution via heapdumps, threadumps and profiling of the app. The heapdump shows the impressive 150MB char array containing the error message and stacktrace. The truncated stacktrace has the following structure:

reactor.core.publisher.Sinks$EmissionException: Spec. Rule 1.3 - onSubscribe, onNext, onError and onComplete signaled to a Subscriber MUST be signaled serially.
	at reactor.core.publisher.InternalManySink.emitNext(InternalManySink.java:56)
	Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException: 
Assembly trace from producer [reactor.core.publisher.MonoCacheTime] :
	reactor.core.publisher.Mono.checkpoint(Mono.java:1902)
	org.cloudfoundry.reactor.tokenprovider.AbstractUaaTokenProvider.token(AbstractUaaTokenProvider.java:261)
Error has been observed at the following site(s):
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.tokenprovider.AbstractUaaTokenProvider.token(AbstractUaaTokenProvider.java:261)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.client.v2.spaces.ReactorSpaces.listServices(ReactorSpaces.java:232)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.operations.services.DefaultServices.createInstance(DefaultServices.java:176)
	|_____ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.tokenprovider.AbstractUaaTokenProvider.token(AbstractUaaTokenProvider.java:261)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.client.v2.spaces.ReactorSpaces.listServices(ReactorSpaces.java:232)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.operations.services.DefaultServices.createInstance(DefaultServices.java:176)
	|_________ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.tokenprovider.AbstractUaaTokenProvider.token(AbstractUaaTokenProvider.java:261)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.client.v2.spaces.ReactorSpaces.listServices(ReactorSpaces.java:232)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.operations.services.DefaultServices.createInstance(DefaultServices.java:176)
	|_____________ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.tokenprovider.AbstractUaaTokenProvider.token(AbstractUaaTokenProvider.java:261)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.client.v2.spaces.ReactorSpaces.listServiceInstances(ReactorSpaces.java:226)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.operations.services.DefaultServices.deleteInstance(DefaultServices.java:215)
	|_________________ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.tokenprovider.AbstractUaaTokenProvider.token(AbstractUaaTokenProvider.java:261)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.client.v2.spaces.ReactorSpaces.listServiceInstances(ReactorSpaces.java:226)
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.operations.services.DefaultServices.deleteInstance(DefaultServices.java:215)

         ... <much more data that look identically (a couple of tenths of thousands of lines>

Stack trace:
		at reactor.core.publisher.InternalManySink.emitNext(InternalManySink.java:56)
		at org.cloudfoundry.reactor.tokenprovider.AbstractUaaTokenProvider.lambda$null$3(AbstractUaaTokenProvider.java:209)
		at java.util.Optional.ifPresent(Optional.java:159)
		at org.cloudfoundry.reactor.tokenprovider.AbstractUaaTokenProvider.lambda$extractRefreshToken$4(AbstractUaaTokenProvider.java:198)
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onNext(FluxPeek.java:184)
		at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:120)
		at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:120)
		at reactor.core.publisher.FluxHandle$HandleSubscriber.onNext(FluxHandle.java:118)
		at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onNext(FluxMap.java:220)
		at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onNext(FluxDoFinally.java:130)
		at reactor.core.publisher.FluxHandleFuseable$HandleFuseableSubscriber.onNext(FluxHandleFuseable.java:184)
		at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onNext(FluxContextWrite.java:107)
		at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1789)
		at reactor.core.publisher.MonoCollectList$MonoCollectListSubscriber.onComplete(MonoCollectList.java:128)
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onComplete(FluxPeek.java:259)
		at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:142)
		at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onComplete(FluxDoFinally.java:145)
		at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:142)
		at reactor.netty.channel.FluxReceive.onInboundComplete(FluxReceive.java:389)
		at reactor.netty.channel.ChannelOperations.onInboundComplete(ChannelOperations.java:396)
		at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:452)
		at reactor.netty.http.client.HttpClientOperations.onInboundNext(HttpClientOperations.java:664)
		at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:94)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
		at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
		at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
		at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
		at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
		at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
		at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1533)
		at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1282)
		at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1329)
		at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508)
		at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447)
		at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
		at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
		at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
		at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)
		at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
		at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
		at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
		at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
		at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
		at java.lang.Thread.run(Thread.java:836)

Suppressed: java.lang.Exception: #block terminated with an error
		at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
		at reactor.core.publisher.Mono.block(Mono.java:1703)
                 ... <more suppressed exceptions (about 4000) that are exactly the same>

Digging into the cf-java-client/reactor code, we noticed that the reactor provides a functionality for creating a light checkpoint that has no the overhead of stacktrace building upon error, but the wrapping cf-java-client does not expose that. Do you plan to expose such functionality? Or have you observed similar behaviour, any idea how we could overcome this?

We use version 5.1.0 of cf-java-client.

Regards,
Martin

PS. Having a quick overview of the reported issues, I noticed this one - #1090 - which might be related to this problem, as well.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementImprovement or new functionalityreactortriagedInitial triage of issue has been performed

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions