Open
Description
Hello, I've observed a sigsegv under the described conditions: an error path which does not clear the current job/tlv.
I added some instrumentation to my application with the intel qat engine:
// rsa decrypt error
h2o[151965]: DEBUG [bssl_qat_async_start_job:479] saving job ctx:0x6030002489b8 job:0x6060001e3208 waitctx:0x6060001e3268
h2o[151965]: DEBUG [bssl_offload_create_request:1362] nb job:0x6030002489b8
h2o[151965]: [/build/deps/neverbleed/neverbleed.c] bssl_offload_decrypt:RSA decrypt failure:-:error:04000070:RSA routines:OPENSSL_internal:DATA_LEN_NOT_EQUAL_TO_MOD_LEN ossl err qat_hw_rsa.c 1664
h2o[151965]: [openssl-privsep] RSA decrypt failure
h2o[151965]: DEBUG [bssl_offload_free_request:698] nb finish job:0x6030002489b8
h2o[151965]: DEBUG [bssl_qat_async_finish_job:500] clearing ctx:0x6030002489b8 job:0x6060001e3208 waitctx:0x6060001e3268
// rsa sign
h2o[151965]: DEBUG [bssl_qat_async_start_job:479] saving job ctx:0x6030002489e8 job:0x6060001e32c8 waitctx:0x6060001e3328
h2o[151965]: DEBUG [bssl_offload_create_request:1362] nb job:0x6030002489e8
h2o[151965]: DEBUG [bssl_offload_digestsign:1418] rsa:0x611000059c88
h2o[151965]: DEBUG [qat_rsa_priv_enc:1021] here
h2o[151965]: DEBUG [qat_rsa_priv_enc:1106] here error
h2o[151965]: DEBUG [qat_rsa_priv_sign:1635] len:0, ASYNC_get_current_job(): 0x6060001e3208, (nil)
h2o[151965]: received SIGSEGV: si_code=1 (SEGV_MAPERR: Address not mapped to object) si_addr=(nil)
h2o[151965]: received SIGSEGV (cont): rip=(nil) rsp=0x7fe1bab6b688 rbp=(nil)
We can see that ASYNC_get_current_job(): 0x6060001e3208
which is the previous job, the failed rsa decryption.
Here's how the API is being used at a high level which causes the issue
// rsa decrypt
async_ctx = bssl_qat_async_start_job()
bssl_qat_async_save_current_job // --> set's the current job
meth = bssl_engine_get_rsa_method();
meth->decrypt(...)
// in_len != rsa_size --> OPENSSL_PUT_ERROR, return error
bssl_qat_async_finish_job(async_ctx)
// note: the current job is still set here...
// rsa sign
async_ctx = bssl_qat_async_start_job()
meth = bssl_engine_get_rsa_method();
meth->sign_raw(...)
qat_rsa_priv_sign
qat_rsa_priv_enc
qat_rsa_decrypt
qat_init_op_done
job = ASYNC_get_current_job() // this is the previous job, which has been freed/zero'd
qat_setup_async_event_notification(job)
ASYNC_get_wait_ctx(job) is NULL --> WARN("Could not obtain wait context for job\n");
ASYNC_current_job_last_check_and_get // this is the previous job, which has been freed/zero'd
job->tlv_destructor // --> SIGSEGV
Engine Logs:
[DEBUG][4211829.817191] PID [304707] Thread [7ff7a2c5e700][qat_hw_rsa.c:198:qat_rsa_decrypt()] - Started
[WARN][4211829.817195] PID [304707] Thread [7ff7a2c5e700][qat_events.c:129:qat_setup_async_event_notification()] Could not obtain wait context for job
[WARN][4211829.817200] PID [304707] Thread [7ff7a2c5e700][qat_hw_rsa.c:210:qat_rsa_decrypt()] Failed to setup async event notifications
[WARN][4211829.817206] PID [304707] Thread [7ff7a2c5e700][qat_hw_rsa.c:1017:qat_rsa_priv_enc()] Failure in qat_rsa_decrypt fallback = 0
I can see two potential solutions:
- (diff) one would be to call
_ret = ASYNC_current_job_last_check_and_get();
before those earlier returns - call tlv_destructor (if set) in
bssl_qat_async_finish_job
Any thoughts?
Metadata
Metadata
Assignees
Labels
No labels