Skip to content

Panics through FFI in error callbacks #228

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
elichai opened this issue Aug 19, 2020 · 9 comments · Fixed by #290
Closed

Panics through FFI in error callbacks #228

elichai opened this issue Aug 19, 2020 · 9 comments · Fixed by #290

Comments

@elichai
Copy link
Member

elichai commented Aug 19, 2020

Right now we override the error callbacks at compile time with:

pub unsafe extern "C" fn rustsecp256k1_v0_1_1_default_illegal_callback_fn(message: *const c_char, _data: *mut c_void) {
use core::str;
let msg_slice = slice::from_raw_parts(message as *const u8, strlen(message));
let msg = str::from_utf8_unchecked(msg_slice);
panic!("[libsecp256k1] illegal argument. {}", msg);
}

pub unsafe extern "C" fn rustsecp256k1_v0_1_1_default_error_callback_fn(message: *const c_char, _data: *mut c_void) {
use core::str;
let msg_slice = slice::from_raw_parts(message as *const u8, strlen(message));
let msg = str::from_utf8_unchecked(msg_slice);
panic!("[libsecp256k1] internal consistency check failed {}", msg);
}

But this is sadly UB if the panic strategy is unwind(the default):

Calling a function with the wrong call ABI or unwinding from a function with the wrong unwind ABI.
https://doc.rust-lang.org/reference/behavior-considered-undefined.html

You must absolutely catch any panics at the FFI boundary
https://doc.rust-lang.org/nomicon/unwinding.html

rust-lang/rust#52652

Now the easiest solution is to replace the panic with abort, but then we have a few problems:

  1. We lose the printing, the user just gets a Aborted (core dumped) or Illegal instruction (core dumped) which makes it very hard to pinpoint (We could conditionally print if std is on, but that sucks).
  2. Which abort do we call?
    2.a. The libc one?(adds the assumption of libc which might break for wasm).
    2.b. The intrinsic one? (requires nightly).
    2.c. std::process::abort? requires libstd.
    2.d. Stabilize intrinsics::abort in core rust-lang/rfcs#2512

Another potential solutions:

  1. Advise users to use panic=abort (imo this isn't a good solution).
  2. Conditionally use std::panic::catch_unwind which will catch the panic if std is on but will still print the output and/or stacktrace.
  3. Wait for "C unwind" which allows unwinding the stack through C. (https://github.com/rust-lang/project-ffi-unwind/blob/master/rfcs/0000-c-unwind-abi.md)
  4. Wait for Add #[cfg(panic = '...')] rust-lang/rust#74754 which allows to condition on if panic=abort or not.
  5. Dive deep into the machinery of panic and see if we can still trigger printing but abort instead, probably won't be possible in stable.
  6. Propose new language features to fix this(i.e. abort!(...)), but even if they'll get accepted we won't use that release for a long time.

(Thanks to @joshtriplett for some of the suggestions)

@elichai elichai changed the title panic through FFI in error callbacks Panics through FFI in error callbacks Aug 19, 2020
@apoelstra
Copy link
Member

Lol wow, I dunno how we didn't see this. I think we may have done this in an attempt to get away from libc abort for WASM reasons.

As for printing, we can just use eprintln!. But it seems like there are no good ways to abort the process.

Another alternative is to not abort the process, just return zero from the callback. This will result in failure being returned from the underlying C function, and the user will either see a Rust error or a debug_assert! from a supposedly-infallible function returning an error. I think this may be the best solution, especially since these error callbacks should never ever trigger in practice.

@elichai
Copy link
Member Author

elichai commented Aug 28, 2020

Lol wow, I dunno how we didn't see this. I think we may have done this in an attempt to get away from libc abort for WASM reasons.

I honestly have no idea .

As for printing, we can just use eprintln!.

We can't do that without std (we could conditionally print if std is available, but if a user of no-std does not have a panic_handler with printing he'll now lose that)

Another alternative is to not abort the process, just return zero from the cal
lback. This will result in failure being returned from the underlying C function, and the user will either see a Rust error or a debug_assert! from a supposedly-infallible function returning an error. I think this may be the best solution, especially since these error callbacks should never ever trigger in practice.

Hmm I did not think of that, are we sure this is also correct usage for the error callback?

After this callback returns, anything may happen, including crashing

https://github.com/bitcoin-core/secp256k1/blob/master/include/secp256k1.h#L288

@real-or-random
Copy link
Collaborator

This will result in failure being returned from the underlying C function

I don't think so, upstream assumes the callbacks will never return. I think @elichai is right. The point of the callbacks is that they never return because this would mean continuing parts of the program.

@apoelstra
Copy link
Member

Every single ARG_CHECK upstream has a test checking that it results in the function returning 0. I'm surprised this isn't documented or supported given how much effort we've put into testing it.

@dgpv
Copy link

dgpv commented Aug 28, 2020

I had incorrect understanding that callback functions can return. It makes sense that error callback should not return, because everything can happen, but it seems to me that returning from invalid callback should be OK, because it is about incorrect parameters, and supplying incorrect parameters is much more probable event than some fatal problems where error callback is called, and incorrect parameters is not a situation where just duying is what programmer would expect. Or is the abort on illegal parameters has the role of an insurance against people not checking results returned from the functions ?

@real-or-random
Copy link
Collaborator

real-or-random commented Aug 30, 2020

Every single ARG_CHECK upstream has a test checking that it results in the function returning 0. I'm surprised this isn't documented or supported given how much effort we've put into testing it.

Ok you had ARG_CHECK (illegal argument callback) in mind and I had CHECK (error callback) in mind.

I think the error callback must crash, what cleaner thing can you do if you reached a assumed-to-be-impossible state?

And upstream says for the illegal argument callback:

The philosophy is that these shouldn't be dealt with through a specific return value, as calling code should not have branches to deal with the case that this code itself is broken.

https://github.com/bitcoin-core/secp256k1/blob/master/include/secp256k1.h#L242

If we pass the user a Rust errors instead, we violate this philosophy. edit: Okay, yeah, if we guarantee that we never hit a ARG_CHECK, that may be fine but it does not really help because we need to deal with the error callbacks anyway.

Note, there's even ARG_CHECK_NO_RETURN (https://github.com/bitcoin-core/secp256k1/blob/670cdd3f8be25f81472b2d16dcd228b0d24a5c45/src/secp256k1.c#L35) but we can't hit this here.

@apoelstra
Copy link
Member

cc #288.

@real-or-random I don't understand your objection. We have unit tests for every single instance of ARG_CHECK in the upstream codebase with non-crashing callbacks, and these unit tests even check that the result of hitting the ARG_CHECKs is a detectable bad return value.

I think we should just do nothing in the callbacks, and then trust that an assertion failure will fire later on. You cannot hit these assertions without unsafe code anyway.

The current situation, which is UB, is not tenable.

@real-or-random
Copy link
Collaborator

We have unit tests for every single instance of ARG_CHECK in the upstream codebase with non-crashing callbacks, and these unit tests even check that the result of hitting the ARG_CHECKs is a detectable bad return value.

Indeed, ARG_CHECK is probably fine. But what about calls to error_callback? (See git grep secp256k1_callback_call). In scratch_impl, we have functions that don't even have a return value. (Currently these are not used but they hopefully will be in the future).

I don't claim I have a solution, it's just fucked up.

I think we should just do nothing in the callbacks, and then trust that an assertion failure will fire later on. You cannot hit these assertio ns without unsafe code anyway.

The current situation, which is UB, is not tenable.

I agree. Even though a segfault is a nice way of aborting, too. :P But we shouldn't rely on this...

@apoelstra
Copy link
Member

As Elichai has noted elsewhere, I am mistaken and this is not actually UB. In recent version of rustc it will reliably trigger an abort and (empirically) in older rustcs it seems to "just work".

real-or-random added a commit to real-or-random/rust-secp256k1 that referenced this issue Mar 19, 2021
Panicking from C is not UB in newer rust versions and will reliably
trigger an abort (without unwinding). In older rust versions, it is
technically UB but empirically it seems to "just work" (and what should
it realistically do except crashing, which is what we intent).

Since there's potentially no unwinding, we can't test this behavior
using [should_panic]. This PR will check the exit code instead in our
CI tests.

Fixes rust-bitcoin#228.
real-or-random added a commit to real-or-random/rust-secp256k1 that referenced this issue Mar 24, 2021
Panicking from C is not UB in newer rust versions and will reliably
trigger an abort (without unwinding). In older rust versions, it is
technically UB but empirically it seems to "just work" (and what should
it realistically do except crashing, which is what we intent).

Since there's potentially no unwinding, we can't test this behavior
using [should_panic]. This PR will instead check the libtest output
explicitly in our CI tests.

Fixes rust-bitcoin#228.
real-or-random added a commit to real-or-random/rust-secp256k1 that referenced this issue Mar 24, 2021
Panicking from C is not UB in newer rust versions and will reliably
trigger an abort (without unwinding). In older rust versions, it is
technically UB but empirically it seems to "just work" (and what should
it realistically do except crashing, which is what we intent).

Since there's potentially no unwinding, we can't test this behavior
using [should_panic]. This PR will instead check the libtest output
explicitly in our CI tests.

Fixes rust-bitcoin#228.
real-or-random added a commit to real-or-random/rust-secp256k1 that referenced this issue Mar 24, 2021
Panicking from C is not UB in newer rust versions and will reliably
trigger an abort (without unwinding). In older rust versions, it is
technically UB but empirically it seems to "just work" (and what should
it realistically do except crashing, which is what we intent).

Since there's potentially no unwinding, we can't test this behavior
using [should_panic]. This PR will instead check the libtest output
explicitly in our CI tests.

Fixes rust-bitcoin#228.
real-or-random added a commit to real-or-random/rust-secp256k1 that referenced this issue Mar 24, 2021
Panicking from C is not UB in newer rust versions and will reliably
trigger an abort (without unwinding). In older rust versions, it is
technically UB but empirically it seems to "just work" (and what should
it realistically do except crashing, which is what we intent).

Since there's potentially no unwinding, we can't test this behavior
using [should_panic]. This PR will instead check the libtest output
explicitly in our CI tests.

Fixes rust-bitcoin#228.
real-or-random added a commit to real-or-random/rust-secp256k1 that referenced this issue Mar 24, 2021
Panicking from C is not UB in newer rust versions and will reliably
trigger an abort (without unwinding). In older rust versions, it is
technically UB but empirically it seems to "just work" (and what should
it realistically do except crashing, which is what we intent).

Since there's potentially no unwinding, we can't test this behavior
using [should_panic]. This PR will instead check the libtest output
explicitly in our CI tests.

Fixes rust-bitcoin#228.
real-or-random added a commit to real-or-random/rust-secp256k1 that referenced this issue Mar 24, 2021
Panicking from C is not UB in newer rust versions and will reliably
trigger an abort (without unwinding). In older rust versions, it is
technically UB but empirically it seems to "just work" (and what should
it realistically do except crashing, which is what we intent).

Since there's potentially no unwinding, we can't test this behavior
using [should_panic]. This PR will instead check the libtest output
explicitly in our CI tests.

Fixes rust-bitcoin#228.
real-or-random added a commit to real-or-random/rust-secp256k1 that referenced this issue Mar 24, 2021
Panicking from C is not UB in newer rust versions and will reliably
trigger an abort (without unwinding). In older rust versions, it is
technically UB but empirically it seems to "just work" (and what should
it realistically do except crashing, which is what we intent).

Since there's potentially no unwinding, we can't test this behavior
using [should_panic]. This PR will instead check the libtest output
explicitly in our CI tests.

Fixes rust-bitcoin#228.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants