[CI] AzureStorageCleanupThirdPartyTests testCreateSnapshot failing #73559

Closed
hendrikmuhs opened this issue May 31, 2021 · 6 comments
Labels:
:Distributed Coordination/Snapshot/Restore (Anything directly related to the `_snapshot/*` APIs)
Team:Distributed (Obsolete) (Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.)
>test-failure (Triaged test failures from CI)

Comments

@hendrikmuhs

Possibly related to #73493?

Build scan:
https://gradle-enterprise.elastic.co/s/6nbvltqouby7w/tests/:plugins:repository-azure:azureThirdPartyTest/org.elasticsearch.repositories.azure.AzureStorageCleanupThirdPartyTests/testCreateSnapshot

Reproduction line:
./gradlew ':plugins:repository-azure:azureThirdPartyTest' --tests "org.elasticsearch.repositories.azure.AzureStorageCleanupThirdPartyTests.testCreateSnapshot" -Dtests.seed=5626FE423DAAFE85 -Dtests.locale=ar-SD -Dtests.timezone=America/La_Paz -Druntime.java=8

Applicable branches:
7.x

Reproduces locally?:
No

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.repositories.azure.AzureStorageCleanupThirdPartyTests&tests.test=testCreateSnapshot

Failure excerpt:

org.elasticsearch.repositories.RepositoryVerificationException: [test-repo] path [7.x_third_party_tests_5626FE423DAAFE85] is not accessible on master node


  Caused by: com.azure.storage.blob.models.BlobStorageException: If you are using a StorageSharedKeyCredential, and the server returned an error message that says 'Signature did not match', you can compare the string to sign with the one generated by the SDK. To log the string to sign, pass in the context key value pair 'Azure-Storage-Log-String-To-Sign': true to the appropriate method call.
  If you are using a SAS token, and the server returned an error message that says 'Signature did not match', you can compare the string to sign with the one generated by the SDK. To log the string to sign, pass in the context key value pair 'Azure-Storage-Log-String-To-Sign': true to the appropriate generateSas method call.
  Please remember to disable 'Azure-Storage-Log-String-To-Sign' before going to production as this string can potentially contain PII.
  Status code 403, "<?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
  RequestId:3e791877-001e-0129-5d13-568a16000000
  Time:2021-05-31T11:52:25.3848101Z</Message><AuthenticationErrorDetail>Signature did not match. String to sign used was rwdl
  
  2021-07-20T13:21:00Z
  /blob/elasticsearchcithird/elasticsearch-ci-thirdparty-sas
  
  
  
  2018-11-09
  c
  
  
  
  
  
  </AuthenticationErrorDetail></Error>"

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(NativeConstructorAccessorImpl.java:-2)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.azure.core.http.rest.RestProxy.instantiateUnexpectedException(RestProxy.java:343)
    at com.azure.core.http.rest.RestProxy.lambda$ensureExpectedStatus$5(RestProxy.java:382)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:125)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1815)
    at reactor.core.publisher.MonoCacheTime$CoordinatorSubscriber.signalCached(MonoCacheTime.java:337)
    at reactor.core.publisher.MonoCacheTime$CoordinatorSubscriber.onNext(MonoCacheTime.java:354)
    at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2397)
    at reactor.core.publisher.MonoCacheTime$CoordinatorSubscriber.onSubscribe(MonoCacheTime.java:293)
    at reactor.core.publisher.FluxFlatMap.trySubscribeScalarMap(FluxFlatMap.java:192)
    at reactor.core.publisher.MonoFlatMap.subscribeOrReturn(MonoFlatMap.java:53)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:57)
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
    at reactor.core.publisher.MonoCacheTime.subscribeOrReturn(MonoCacheTime.java:143)
    at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:57)
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:157)
    at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onNext(FluxDoFinally.java:130)
    at reactor.core.publisher.FluxHandle$HandleSubscriber.onNext(FluxHandle.java:118)
    at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onNext(FluxMap.java:220)
    at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onNext(FluxDoFinally.java:130)
    at reactor.core.publisher.FluxHandleFuseable$HandleFuseableSubscriber.onNext(FluxHandleFuseable.java:184)
    at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onNext(FluxContextWrite.java:107)
    at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1815)
    at reactor.core.publisher.MonoCollectList$MonoCollectListSubscriber.onComplete(MonoCollectList.java:128)
    at reactor.core.publisher.FluxPeek$PeekSubscriber.onComplete(FluxPeek.java:259)
    at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:142)
    at reactor.netty.channel.FluxReceive.onInboundComplete(FluxReceive.java:401)
    at reactor.netty.channel.ChannelOperations.onInboundComplete(ChannelOperations.java:416)
    at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:470)
    at reactor.netty.http.client.HttpClientOperations.onInboundNext(HttpClientOperations.java:685)
    at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:94)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1504)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1253)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1300)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:620)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:583)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at org.elasticsearch.repositories.azure.SocketAccess.lambda$doPrivilegedVoidException$0(SocketAccess.java:46)
    at java.security.AccessController.doPrivileged(AccessController.java:-2)
    at org.elasticsearch.repositories.azure.SocketAccess.doPrivilegedVoidException(SocketAccess.java:45)
    at org.elasticsearch.repositories.azure.executors.PrivilegedExecutor.lambda$execute$0(PrivilegedExecutor.java:27)
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

@hendrikmuhs hendrikmuhs added :Core/Infra/Core Core issues without another label >test-failure Triaged test failures from CI labels May 31, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label May 31, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@fcofdez fcofdez self-assigned this May 31, 2021
@gwbrown gwbrown added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs and removed :Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team labels Jun 2, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jun 2, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@fcofdez
Contributor

fcofdez commented Jun 2, 2021

This seems to be a credentials problem: I'm able to reproduce the issue locally using an invalid SAS token signature, and with a valid SAS token the test passes. I'm wondering whether the latest SDK version we're using has trouble with SAS tokens that use the 2018-11-09 API version. I'll dig into that.
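
When checking which API version and expiry a SAS token carries, it can help to split the token into its query parameters and compare them with the string to sign in the failure excerpt (`rwdl`, `2021-07-20T13:21:00Z`, `2018-11-09`, `c`). The following is a minimal sketch using only the JDK; the class `SasTokenInspector` and its methods are hypothetical helpers for illustration, not part of Elasticsearch or the Azure SDK. It assumes the token uses the usual SAS query parameter names (`sv`, `sp`, `se`, `sr`, `sig`).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of Elasticsearch or the Azure SDK.
final class SasTokenInspector {

    /** Splits a SAS token (with or without a leading '?') into decoded key/value pairs. */
    static Map<String, String> parse(String sasToken) {
        String query = sasToken.startsWith("?") ? sasToken.substring(1) : sasToken;
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(key), decode(value));
        }
        return params;
    }

    /** Returns only the non-secret fields, so the result can be logged without the signature. */
    static String describe(String sasToken) {
        Map<String, String> p = parse(sasToken);
        return "sv=" + p.get("sv") + " sp=" + p.get("sp")
            + " se=" + p.get("se") + " sr=" + p.get("sr");
    }

    private static String decode(String s) {
        try {
            // Two-argument overload keeps this compatible with Java 8.
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

For example, `SasTokenInspector.describe("?sv=2018-11-09&sp=rwdl&se=2021-07-20T13%3A21%3A00Z&sr=c&sig=PLACEHOLDER")` yields the version, permissions, expiry, and resource type while leaving the signature out of the output. The `sig` value here is a placeholder, not a real credential.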

@fcofdez
Contributor

fcofdez commented Jun 3, 2021

It seems like another bug in the Azure SDK's SAS token handling; I've opened Azure/azure-sdk-for-java#22042. I'm not sure whether we should roll back the SDK upgrade, as this bug prevents using the Azure repository in certain scenarios. cc @rjernst @henningandersen

@benwtrent
Member

Another bunch of auth failures due to the SAS token:

https://gradle-enterprise.elastic.co/s/rbyoqfowkfmkg

All in 7.x

@fcofdez
Contributor

fcofdez commented Jun 8, 2021

I'm closing this issue, as it should be solved by #73837.

@fcofdez fcofdez closed this as completed Jun 8, 2021